Open oevesque opened 1 month ago
When the input isn't UTF-8, our only option is to guess the encoding by trying a whole series of them and going with whichever one succeeds first (which may or may not be the intended one). Based on your sample above, I suppose I could look into how Python's chardet.detect works and see if we can mimic that behaviour.
Related to #105, ideally we should allow the user to change the encoding, but then we'd need to either 1) save it, which presumably wouldn't work in your case anyway if the files are read-only (although I'd venture that many things are likely to not work very well if the required files are read-only) or 2) keep an internal database of "what is the encoding for this lyric file?", which is more work but probably also "more correct".
I've not yet tried your particular file, but my guess is that it just happens to be valid in codepage 936/949/950 (which are Chinese and Korean, and are tried before 1252 because codepages get tried in roughly sequential order). We do already try the "system default codepage", so if your machine is set to French then I would expect it to try that first, but I also don't have any idea how it decides what the "system default codepage" is, so maybe not.
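To illustrate the "try encodings in order, keep the first that succeeds" approach described above, here is a minimal Python sketch. The function name `decode_first_match` and the candidate list are hypothetical and chosen only for illustration; this is not OpenLyrics' actual code or its actual ordering.

```python
# Hypothetical candidate order, loosely following the "roughly sequential
# codepage order" described above. Not the real list used by the plugin.
CANDIDATES = ["utf-8", "cp936", "cp949", "cp950", "cp1252"]


def decode_first_match(data, candidates=CANDIDATES):
    """Return (encoding_name, text) for the first candidate that decodes
    `data` without error, or None if every candidate fails."""
    for name in candidates:
        try:
            return name, data.decode(name)
        except UnicodeDecodeError:
            # Not valid in this encoding, move on to the next candidate.
            continue
    return None


if __name__ == "__main__":
    sample = "café noir".encode("cp1252")
    print(decode_first_match(sample, ["utf-8", "cp1252"]))
```

The weakness is visible in the structure itself: a successful decode only proves that the bytes are valid in that encoding, not that it is the encoding the author used, so an earlier candidate can "win" by accident.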
Honestly the actual re-encoding itself isn't particularly complicated, the main reason I've not done it yet is that it's a bit of a pain to do the UI properly.
I should add, though, that we do output debug logs about the decoding, so if you enable debug logs in preferences you should see some info about which encodings were tried and which one was ultimately used.
Thanks for your reply. Output log with debug activated:
INFO-OpenLyrics: Lookup local-file file://D:\Music\Mp3 Francais\Georges Brassens - Anthologie\13 - La Non-Demande En Mariage.txt for lyrics...
INFO-OpenLyrics: Successfully retrieved lyrics from file://D:\Music\Mp3 Francais\Georges Brassens - Anthologie\13 - La Non-Demande En Mariage.txt
INFO-OpenLyrics: Successfully looked-up lyrics from source: Local files
INFO-OpenLyrics: Parsing lyrics text...
INFO-OpenLyrics: Successfully converted 1713 bytes of UTF-16 into UTF-8
INFO-OpenLyrics: Parsing LRC lyric text...
INFO-OpenLyrics: Lyric loading complete
INFO-OpenLyrics: Skipping lyric save. Type: 1, Local: yes, Timestamped: no, Autosave: 1
If the .txt file is encoded as 'windows-1252' or 'iso-8859-1', the lyrics are displayed incorrectly, as Chinese characters on a single line.
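The debug log above reports a UTF-16 conversion. As an assumption rather than a confirmed diagnosis, the sketch below shows how single-byte Western text that gets read as UTF-16 (little-endian, as is usual on Windows) would produce this kind of symptom: each pair of Latin bytes becomes one unrelated, often CJK, character, and the CR/LF pair is swallowed into a single code point so the line breaks vanish. The helper name `misread_as_utf16` is hypothetical.

```python
def misread_as_utf16(text, actual_encoding="cp1252"):
    """Encode `text` in a single-byte encoding, then decode the same bytes
    as UTF-16 little-endian, simulating a wrong encoding guess."""
    raw = text.encode(actual_encoding)
    if len(raw) % 2:
        # UTF-16 needs an even number of bytes; drop a trailing odd byte.
        raw = raw[:-1]
    return raw.decode("utf-16-le", errors="replace")


garbled = misread_as_utf16("La non-demande\r\nen mariage")
print(garbled)          # unrelated characters, many of them CJK
print("\n" in garbled)  # the original line break is no longer present
```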
Steps to reproduce
Expected behavior
The lyrics are shown on multiple lines, in French.
Versions
Debug logs
No errors in the debug logs.
Additional information
I use a small Python script to force my lyrics to UTF-8, but some files are read-only and I can't convert them.
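For readers in the same situation, a conversion script of this kind might look roughly like the sketch below. It is not the script mentioned above: the function name `convert_to_utf8` and the default source encoding are assumptions, and clearing the read-only flag only works when the user owns the file or otherwise has permission to change its attributes.

```python
import os
import stat
from pathlib import Path


def convert_to_utf8(path, source_encoding="cp1252"):
    """Rewrite `path` as UTF-8, assuming it is currently in `source_encoding`.

    Returns False if the file is already valid UTF-8 and was left untouched,
    True if it was rewritten.
    """
    path = Path(path)
    raw = path.read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already UTF-8 (or plain ASCII), nothing to do
    except UnicodeDecodeError:
        pass

    # Raises UnicodeDecodeError if the bytes are not valid in source_encoding.
    text = raw.decode(source_encoding)

    if not os.access(path, os.W_OK):
        # Try to clear the read-only flag; this raises PermissionError
        # if we are not allowed to change the file's mode.
        os.chmod(path, path.stat().st_mode | stat.S_IWRITE)

    path.write_bytes(text.encode("utf-8"))
    return True
```

Keeping a backup copy before running anything like this on a music library is advisable, since a wrong `source_encoding` silently produces wrong text.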