dipzza / ultrastar-song2txt

Tools that automate parts of making a song in the ultrastar txt format
GNU Affero General Public License v3.0

txt_parser: ñ characters become ń #33

Closed dipzza closed 2 years ago

dipzza commented 2 years ago

This concerns the txt_parser module, which is used for #7 and could be used for #8.

Right now charset_normalizer is used to detect a valid encoding for reading txt files.

However, in Spanish txt files the detected encoding sometimes turns 'ñ' into 'ń'. This may be because the detector accepts any encoding in which every character of the file is readable: the byte 0xF1 is 'ñ' in windows-1252 but 'ń' in windows-1250, so both decode without error.
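
A minimal reproduction of the symptom, using only the standard library codecs and not the txt_parser code itself. The sample word is made up for illustration:

```python
# The same bytes decode cleanly under both code pages, but to different letters.
raw = "Mañana".encode("cp1252")   # b"Ma\xf1ana"

as_cp1252 = raw.decode("cp1252")  # windows-1252: 0xF1 -> 'ñ'
as_cp1250 = raw.decode("cp1250")  # windows-1250: 0xF1 -> 'ń'

print(as_cp1252)  # Mañana
print(as_cp1250)  # Mańana
```

Since neither decode raises an error, a detector that only checks readability has no hard reason to prefer one over the other.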

dipzza commented 2 years ago

To decide which encodings should be supported, here is what other UltraStar implementations handle:

Vocaluxe supports windows-1250, windows-1252 and utf-8: https://github.com/lukeIam/Vocaluxe/tree/travis.

The original USDX supports windows-1252, utf-8 and seems to have supported windows-1250 in the past: https://github.com/UltraStar-Deluxe/USDX/issues/283

USDX asks users to convert their files to UTF-8: https://github.com/UltraStar-Deluxe/USDX/issues/347
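
Given that list, one possible approach is to skip open-ended detection and try only those three encodings in a fixed order. This is a rough sketch, not what txt_parser currently does. The function name `decode_ultrastar_txt` and the preference order (UTF-8, then windows-1252, then windows-1250) are assumptions:

```python
from typing import Tuple

# Candidate encodings in order of preference (assumed order, see note below).
# "utf-8-sig" is UTF-8 that also strips a leading byte order mark if present.
CANDIDATES = ("utf-8-sig", "cp1252", "cp1250")


def decode_ultrastar_txt(data: bytes) -> Tuple[str, str]:
    """Return (text, encoding) using the first candidate that decodes strictly."""
    for encoding in CANDIDATES:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Last resort: keep going with replacement characters instead of failing.
    return data.decode("cp1252", errors="replace"), "cp1252"


text, used = decode_ultrastar_txt("Mañana".encode("cp1252"))
print(text, used)  # Mañana cp1252
```

Trying UTF-8 first is safe because text in the legacy code pages that contains accented letters is rarely valid UTF-8. The trade-off is that windows-1250 files (for example Polish songs) will mostly be read as windows-1252, since nearly every byte is valid in both, so telling those two apart would still need a language hint or a manual override.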