If the output of ffmpeg, which operates on bytes, contains invalid UTF-8 sequences, pytranscoder currently fails with a UnicodeDecodeError.
This change sets errors='replace' for all ffmpeg subprocesses. Each invalid byte is then replaced with the Unicode replacement character (U+FFFD), the designated placeholder for undecodable input.
That'll theoretically "corrupt" the output, but in practice it won't break anything in a way that matters, and it is much simpler (and smarter) than trying to work with the data as raw bytes or decoding it with an 8-bit encoding.
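
To illustrate the behavior, here is a minimal sketch rather than the actual pytranscoder code. It assumes the subprocess is started through `subprocess.run` with a UTF-8 text pipe. The `run_text` helper and the `child` command are hypothetical; `child` is a small Python process standing in for ffmpeg so the example runs without ffmpeg installed.

```python
import subprocess
import sys

# Hypothetical stand-in for ffmpeg: a child process that writes a byte
# sequence that is not valid UTF-8 (0xff never appears in UTF-8).
child = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(b'title: caf\\xff\\n')",
]


def run_text(cmd, **kwargs):
    """Run cmd and return its stdout decoded as UTF-8 text.

    Hypothetical helper; extra keyword arguments (such as errors=)
    are passed straight through to subprocess.run.
    """
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        encoding="utf-8",
        **kwargs,
    )
    return proc.stdout


# Default (strict) decoding: the invalid byte raises UnicodeDecodeError.
try:
    run_text(child)
except UnicodeDecodeError as exc:
    print("strict decoding failed:", exc.reason)

# With errors='replace', the invalid byte becomes U+FFFD and the rest
# of the output is preserved.
print(repr(run_text(child, errors="replace")))
```

The `errors` parameter is part of the standard `subprocess` API (Python 3.6+), so the fix amounts to one extra keyword argument wherever the ffmpeg process is created.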