mlsmithjr / transcoder

Python wrapper for ffmpeg for batch and/or concurrent transcoding
GNU General Public License v3.0

more graceful non-UTF8 ffmpeg output handling #17

Closed: slippycheeze closed this 3 years ago

slippycheeze commented 3 years ago

If the output of ffmpeg -- which operates on bytes -- contains invalid UTF-8 sequences, pytranscoder currently fails with a UnicodeDecodeError.

This change sets errors='replace' for all ffmpeg subprocesses. Each invalid byte is then replaced with the Unicode "replacement character" (U+FFFD), the designated placeholder for this sort of thing.

That'll theoretically "corrupt" the output, but in practice it won't break anything in a way that matters, and it is much simpler (and smarter) than trying to work with the data as raw bytes or to decode it with an 8-bit encoding.
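
For illustration, here is a minimal sketch of the difference, not the actual pytranscoder code. It assumes the subprocess output is decoded via the `encoding`/`errors` arguments of Python's `subprocess` module; the child process is a stand-in for ffmpeg that prints one invalid UTF-8 byte, and the names `CHILD`, `run_strict` and `run_lenient` are made up for this example.

```python
import subprocess
import sys

# Hypothetical stand-in for ffmpeg: a child process that writes a line
# containing the byte 0xFF, which is never valid in UTF-8.
CHILD = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(b'title: caf\\xff\\n')",
]


def run_strict() -> str:
    """Decode the child's output strictly; invalid bytes raise UnicodeDecodeError."""
    result = subprocess.run(
        CHILD,
        stdout=subprocess.PIPE,
        encoding="utf-8",
        errors="strict",
        check=True,
    )
    return result.stdout


def run_lenient() -> str:
    """Decode with errors='replace'; each invalid byte becomes U+FFFD."""
    result = subprocess.run(
        CHILD,
        stdout=subprocess.PIPE,
        encoding="utf-8",
        errors="replace",
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    try:
        run_strict()
    except UnicodeDecodeError as exc:
        print("strict decoding failed:", exc)
    # repr() makes the replacement character visible as '\ufffd'.
    print("lenient decoding gave:", repr(run_lenient()))
```

The same `errors='replace'` argument is accepted by `subprocess.Popen`, so the idea carries over to code that reads the output line by line while the process is still running.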

slippycheeze commented 3 years ago

This needs more updates for the changes in the code; I'll resubmit shortly.