libretro / RetroArch

Cross-platform, sophisticated frontend for the libretro API. Licensed GPLv3.
http://www.libretro.com
GNU General Public License v3.0

Thumbnail updater freeze with accented characters in playlist database #9593

Open ttmcmurry opened 5 years ago

ttmcmurry commented 5 years ago

[Description of the bug]

When updating thumbnails for games identified in the playlist, the updater stops processing (but never finishes) when it encounters a playlist entry with an accented character.

[What you expected to happen]

For the updater to process every item in the playlist regardless of accented characters. The end user has little control over these, since the name is taken as stored in the PlayStation database.

[What is actually happening]

Once the filename is encountered, thumbnail update scanning does not continue, but does not end, either. I left it running for 2 hours and no progress was made. I've re-run the process in a new Retroarch instance, and it gets stuck on the same title.

[Screenshot: retroarch - 2019-09-08 - updater problem]

[Steps to reproduce the bug]

1) Run the playlist scanner on a folder - in my case a PlayStation folder containing Einhander, GameID SCUS94243. In the source folder, there are no accented characters in the filename: Einhander (USA) (SCUS94243).chd

2) Run the Online Updater -> Playlist Thumbnails Updater

3) When the updater gets to Einhander, the playlist has the entry as "Einhänder (USA)" and the updater stops working

4) Remove the entry from the playlist and the updater runs to the end as expected. Note that there are no other PSX images with accented characters in the playlist.
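To find entries like the one in step 3 before running the updater, the labels in a playlist can be checked for non-ASCII characters. This is only a rough sketch: it assumes the .lpl file is JSON with a top-level "items" list, and the helper name non_ascii_labels is made up for illustration. It is not part of RetroArch.

```python
import json


def non_ascii_labels(playlist_text):
    """Return the labels that contain at least one non-ASCII character."""
    playlist = json.loads(playlist_text)
    # Assumption: playlist entries live in a top-level "items" list.
    return [
        item["label"]
        for item in playlist.get("items", [])
        if not item.get("label", "").isascii()
    ]


# Hypothetical two-entry playlist built from the titles mentioned in this issue.
sample = json.dumps(
    {
        "items": [
            {"path": "Einhander (USA) (SCUS94243).chd",
             "label": "Einhänder (USA)"},
            {"path": "ECW Hardcore Revolution (USA) (SLUS01045).chd",
             "label": "ECW - Hardcore Revolution (USA)"},
        ]
    },
    ensure_ascii=False,
)

print(non_ascii_labels(sample))
```

For a real playlist, read the file with open(path, encoding="utf-8") and pass its contents to the function.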

Version/Commit RetroArch: Sep 8 2019 - 2179c16f60

Environment information: OS Win10 1903 18362.418 / MinGW 7.3.0 64-bit

Debug Uploaded to Server #37961

RobLoach commented 5 years ago

Does Windows not support these characters in their file system? What's wrong with Windows?

i30817 commented 5 years ago

They probably support them just fine; it's just that Windows uses UTF-16.

https://stackoverflow.com/questions/2050973/what-encoding-are-filenames-in-ntfs-stored-as

But if that were the only problem, it shouldn't be freezing anyway; it would just display those funny numeric box characters. So there is something else in how your system uses the name: maybe a loop that chokes on the multibyte character, or (more likely) code waiting on a notification from an API that is handed a UTF-16 filename without using the UTF-16 variant of that API, and so never finds the file.
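To make the multibyte point concrete, here is a small sketch of how the title from this issue looks under different encodings. It only illustrates the byte layouts; it does not claim to show what RetroArch does internally.

```python
title = "Einhänder"

utf8 = title.encode("utf-8")        # "ä" becomes two bytes: C3 A4
utf16 = title.encode("utf-16-le")   # every character here takes two bytes

# Character count versus byte counts in each encoding.
print(len(title), len(utf8), len(utf16))

# Reading the UTF-8 bytes with a legacy Windows code page gives mojibake
# instead of "ä", which is the kind of mismatch discussed above.
mojibake = utf8.decode("cp1252")
print(mojibake)
```

Code that assumes one byte per character, or that mixes the two encodings, sees a different string than the one stored in the database.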

ttmcmurry commented 5 years ago

While I appreciate the posts about UTF-16 characters in filesystem names, this is not a filesystem issue. The issue stems from inside Retroarch. In Retroarch's game database, if you examine the matching hash entry for Einhander, the DB has an ä character in the game's title. That's the source of the data and the source of the problem. The symptom is the scanner stops once it encounters the game's name data in the database.

This could very well be an encoding issue, but it isn't stemming from the file system. It is stemming from Retroarch's game database and how it is being parsed/queried, then saved to the playlist.

For example, after the scan freezes on Einhander, which Retroarch matches in the DB as Einhänder, the resulting entry is not written to "Sony - PlayStation.lpl". I've included a sample entry from that playlist. Hypothetically, if this worked as designed, the path value would be the non-accented ASCII path & filename. The label value, if it were written as specified in Retroarch's Game DB, would contain the accented character.

The "Sony - PlayStation.lpl" file, as created by Retroarch on a Windows system, is encoded in UTF-8 (so says Notepad++). Again, if it's an encoding issue, such as UTF-16 data being written into a UTF-8 file, then I can understand the nature of the issue. I don't know how the source data is encoded/parsed/handled within Retroarch's "Scan File/Folder" function, so I cannot speculate on the extent of the problem, or whether encoding is the issue at all.

Sample playlist entry:

"path": "X:\Library - Playstation\ECW Hardcore Revolution (USA) (SLUS01045).chd",
"label": "ECW - Hardcore Revolution (USA)",
"core_path": "DETECT",
"core_name": "DETECT",
"crc32": "00000000|crc",
"db_name": "Sony - PlayStation.lpl"
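As a sanity check on the encoding question, the sketch below writes the hypothetical Einhänder entry (ASCII path, accented label) to a UTF-8 JSON file and reads it back. The "items" wrapper and the entry itself are assumptions made for illustration, not data taken from a real playlist. It only shows that a UTF-8 file can hold such a label, not where the scanner actually fails.

```python
import json
import os
import tempfile

# Hypothetical entry: ASCII path on disk, accented label from the database.
entry = {
    "path": "X:\\Library - Playstation\\Einhander (USA) (SCUS94243).chd",
    "label": "Einhänder (USA)",
    "db_name": "Sony - PlayStation.lpl",
}

with tempfile.TemporaryDirectory() as tmp:
    playlist_path = os.path.join(tmp, "Sony - PlayStation.lpl")

    # ensure_ascii=False stores "ä" as raw UTF-8 rather than a \u00e4 escape.
    with open(playlist_path, "w", encoding="utf-8") as f:
        json.dump({"items": [entry]}, f, ensure_ascii=False, indent=2)

    with open(playlist_path, "rb") as f:
        raw = f.read()

    with open(playlist_path, encoding="utf-8") as f:
        restored = json.load(f)["items"][0]

print(restored == entry)      # the entry survives the round trip unchanged
print(b"\xc3\xa4" in raw)     # "ä" is stored as the two UTF-8 bytes C3 A4
```

If the round trip works like this, the accented label by itself is storable, which points toward how the name is handled during scanning or thumbnail lookup rather than the playlist file encoding.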