marl / pysox

Python wrapper around sox.
BSD 3-Clause "New" or "Revised" License

utf8 decoding failed on file_info.comments #128

Open nefastosaturo opened 3 years ago

nefastosaturo commented 3 years ago

Hello there,

I have a FLAC file that displays the following information when I run soxi -a audiofile.flac:

$ soxi -a 1989_7240_000022.flac 
artist=Giacomo Leopardi
DESCRIPTION=https://archive.org/details/
genre=Speech
title=13 - La sera del d� di festa
album=Canti
TRACKNUMBER=13
encoder=Lavf57.83.100

When I call sox.file_info.comments(), I get a UnicodeDecodeError at

shell_output = shell_output.decode("utf-8")

in core.py, line 155

Right now I just catch that error, call soxi as a subprocess, and decode its output with

.decode("utf-8", "ignore")

but the "replace" error handler could also be useful, and either one could be passed in the shell_output.decode line

https://github.com/rabitt/pysox/blob/7e0891a40ad4e29e2a67e6abf826cd048377231c/sox/core.py#L162
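
For reference, here is a minimal sketch of how the three error handlers behave on output like the one above. The byte string is a stand-in I made up, assuming the unreadable character in the title is a Latin-1 "ì" (byte 0xEC); the helper name decode_output is hypothetical and is not part of pysox.

```python
# Hypothetical stand-in for the bytes soxi prints; 0xEC is assumed to be a
# Latin-1 "ì", which is not a valid UTF-8 sequence on its own.
raw = b"title=13 - La sera del d\xec di festa\n"


def decode_output(data: bytes, errors: str = "strict") -> str:
    """Decode bytes as UTF-8 with the given error handler."""
    return data.decode("utf-8", errors)


# "strict" is the default handler and raises on the invalid byte.
try:
    decode_output(raw)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "ignore" silently drops the invalid byte.
ignored = decode_output(raw, "ignore")

# "replace" substitutes U+FFFD, so the damage stays visible to the caller.
replaced = decode_output(raw, "replace")

print(strict_failed)
print(repr(ignored))
print(repr(replaced))
```

With "ignore" the character just disappears from the title, while "replace" keeps a marker where it was, which is why I think "replace" may be the safer default.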

rabitt commented 3 years ago

@nefastosaturo thanks for catching this! If you'd like, feel free to open a PR with the fix you propose.