MTG / acousticbrainz-client

A client to upload data to an acousticbrainz server
GNU General Public License v3.0

Fix Python 3 encoding for error messages and `VERBOSE` output #28

Open JonnyJD opened 9 years ago

JonnyJD commented 9 years ago

The extractor always returns byte strings, which is fine for Python 2. On Python 3, print() expects unicode strings rather than byte strings. So if str == bytes (Python 2) we do nothing. Otherwise we decode the msg if it isn't already a unicode string.
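A minimal sketch of that check, for illustration only. The helper name `decode`, the default encoding argument, and the `errors="replace"` policy are assumptions and are not taken from the actual patch:

```python
def decode(msg, encoding="utf-8"):
    """Return msg in a form that print() displays correctly.

    Hypothetical helper; the real compat.py may look different.
    """
    if str is bytes:
        # Python 2: str and bytes are the same type, nothing to do.
        return msg
    if isinstance(msg, bytes):
        # Python 3: turn the extractor's byte string into text.
        # errors="replace" is an assumption so that bad bytes never raise.
        return msg.decode(encoding, errors="replace")
    # Already a unicode string.
    return msg
```

With something like this, `print(decode(output))` shows real line breaks on Python 3 instead of a `b'...'` repr with literal `\n` sequences.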

There was a different problem when VERBOSE was activated. The byte string coming from the extractor was encoded again (which is fine on Python 2 as long as it is ascii), but as above, using print() with a byte string on Python 3 is wrong.

Without this fix, the output for a failing file on Python 3 looks like this:

[:( unk 5  ] /var/data/music/mp3-db/In Strict Confidence/2002: Mistrust the Angels/09 - In Strict Confidence - It Seems Lost....mp3
b'Process step: Read metadata\nProcess step: Compute md5 audio hash and codec\n\x1b[0;33m[ WARNING  ] \x1b[0mAudioLoader: invalid frame, skipping it: Invalid data found when processing input\nProcess step: Replay gain\n\x1b[0;33m[ WARNING  ] \x1b[0mAudioLoader: invalid frame, skipping it: Invalid data found when processing input\n\x1b[0;33m[ WARNING  ] \x1b[0mAudioLoader: invalid frame, skipping it: Invalid data found when processing input\nERROR: File looks like a completely silent file... Aborting...\n'

I also added a compat.encode(), which was used until I found out even the previous encode() was misplaced (see above). I kept it in the code unused. It might help out later on.

Note that I have "utf-8" and "ascii" as encodings in compat.py since that was in the code I was replacing. This should probably be replaced with sys.stdout/stdin.encoding. I left in comments about that.
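One way the hard-coded encodings could be replaced is sketched below. The function names `output_encoding` and `decode_for`, the "utf-8" fallback, and the `errors="replace"` policy are hypothetical choices, not part of the patch:

```python
import sys


def output_encoding(stream=None, fallback="utf-8"):
    """Return the encoding of a stream, or a fallback if it has none.

    A stream's encoding can be missing or None, for example when the
    stream has been replaced by an object without that attribute.
    """
    if stream is None:
        stream = sys.stdout
    return getattr(stream, "encoding", None) or fallback


def decode_for(msg, stream=None):
    """Decode a byte string using the stream's encoding (Python 3 only)."""
    if str is not bytes and isinstance(msg, bytes):
        return msg.decode(output_encoding(stream), errors="replace")
    return msg
```

The fallback matters because the stream's encoding is not guaranteed to be set, so relying on it alone could raise a TypeError when decoding.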

Getting Unicode output to work on Windows is a different beast. For the problem and the solution have a look at JonnyJD/isrcsubmit#40 and/or 8f931940ec9b45e2ee72db0bc19e97a9f25bf6f2. In short: you have to make a wrapper that changes the codepage to cp65001 and tell Python that this is "~ utf-8".

alastair commented 9 years ago

Can you look at adding the stdout/stdin encoding settings? As it is, I'm unlikely to merge this until I have enough time to sit down and work on it myself.

mineo commented 9 years ago

Any progress on this? I'm now running abzsubmit on some files again and reading multiline error messages in one line is no fun :(

alastair commented 9 years ago

Sorry, no movement on this. We have stopped effort for now on the submission tools until we get further on the data/backend, and probably won't address them again until we have another extractor ready (no timeframe for this). As I said in my previous comment, if the TODOs in this patch were addressed, I'd merge it.