fraganator / archive-cache-manager

A LaunchBox plugin which extracts and caches large ROM archives, letting you play games faster.
GNU Lesser General Public License v2.1

Path too long #4

Closed fraganator closed 1 year ago

fraganator commented 3 years ago

In some cases the cache path is too long. For example, No-Intro names QuackShot as QuackShot Starring Donald Duck ~ QuackShot - I Love Donald Duck - Guruzia Ou no Hihou (World) (v1.1).zip. When extracted to the cache, the resulting path can exceed 260 characters, and RetroArch is unable to play it: C:\LaunchBox\ArchiveCache\QuackShot Starring Donald Duck ~ QuackShot - I Love Donald Duck - Guruzia Ou no Hihou (World) (v1.1).zip - 227F83C91E9941A6C9EC4A456B49B005\QuackShot Starring Donald Duck ~ QuackShot - I Love Donald Duck - Guruzia Ou no Hihou (World) (v1.1).md
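As a quick check of the numbers, the sketch below rebuilds the example path and measures it. It assumes the cache folder is named `<archive name> - <MD5>`, as in the path quoted above; the variable names are illustrative only.

```python
from pathlib import PureWindowsPath

MAX_PATH = 260  # classic Windows limit; it includes the terminating NUL

archive = ("QuackShot Starring Donald Duck ~ QuackShot - I Love Donald Duck"
           " - Guruzia Ou no Hihou (World) (v1.1).zip")
md5 = "227F83C91E9941A6C9EC4A456B49B005"
rom = archive[:-len(".zip")] + ".md"  # extracted file keeps the archive's base name

cache_root = PureWindowsPath(r"C:\LaunchBox\ArchiveCache")
extracted = cache_root / f"{archive} - {md5}" / rom

path_len = len(str(extracted))
too_long = path_len >= MAX_PATH  # only 259 characters are usable
print(path_len, too_long)
```

The long game name is counted twice, once in the folder name and once in the file name, which is why the path grows so quickly.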

Possible solutions:

nixxou commented 2 years ago

I haven't dug deeply enough into your code, so this is just a personal opinion:

I would simply use the MD5. Instead of C:\LaunchBox\ArchiveCache\QuackShot Starring Donald Duck ~ QuackShot - I Love Donald Duck - Guruzia Ou no Hihou (World) (v1.1).zip - 227F83C91E9941A6C9EC4A456B49B005\QuackShot Starring Donald Duck ~ QuackShot - I Love Donald Duck - Guruzia Ou no Hihou (World) (v1.1).md just use this: C:\LaunchBox\ArchiveCache\227F83C91E9941A6C9EC4A456B49B005\QuackShot Starring Donald Duck ~ QuackShot - I Love Donald Duck - Guruzia Ou no Hihou (World) (v1.1).md

I would then scan the whole archive and if, even using only the MD5 as the directory, one of the extracted paths goes over 260 chars, I would rename the file to its CRC32 value (which you can get directly from the 7z l listing, no need to compute it). Why only when over 260 chars? Because you often launch a game using priority match, and in that case it's useful to have the original ROM name show up when the RetroArch emulator launches. So I would keep it as an exception, not a general rule (renaming everything could also mess with games that have file dependencies).

So let's say that "C:\LaunchBox\ArchiveCache\227F83C91E9941A6C9EC4A456B49B005\QuackShot Starring Donald Duck ~ QuackShot - I Love Donald Duck - Guruzia Ou no Hihou (World) (v1.1).md" goes over 260 chars (it's not the case here, but just for the sake of the example) and the file's CRC32 is 1801098B. I would store it as: "C:\LaunchBox\ArchiveCache\227F83C91E9941A6C9EC4A456B49B005\1801098B.md"
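The fallback described above could be sketched roughly as follows. This is only an illustration in Python, not the plugin's code: `cached_file_path` is a hypothetical helper, and the CRC32 is assumed to have been read from the archive listing already and passed in as a hex string.

```python
from pathlib import PureWindowsPath

MAX_PATH = 260  # Windows limit, including the terminating NUL

def cached_file_path(cache_root, md5, entry_name, crc32_hex):
    """Return the cache path for one archive entry.

    The original file name is kept when the full path fits under MAX_PATH;
    otherwise the file is stored as <CRC32><extension>.
    """
    folder = PureWindowsPath(cache_root) / md5
    candidate = folder / entry_name
    if len(str(candidate)) < MAX_PATH:
        return candidate
    ext = PureWindowsPath(entry_name).suffix  # e.g. ".md"
    return folder / f"{crc32_hex}{ext}"
```

With the `C:\LaunchBox\ArchiveCache` root the QuackShot path is well under the limit and the name is kept; only a much deeper cache root would trigger the `1801098B.md` form.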

I would also store a 227F83C91E9941A6C9EC4A456B49B005.acm-listing.json that contains the full list of files/paths and their renamed counterparts. By the way, it would be a performance improvement over running 7z l each time, and it could be useful for some other features. I don't know if it's technically relevant, but you can use the library shipped with LaunchBox to work with JSON (Newtonsoft.Json 12.0.3).
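A minimal sketch of such a listing file, using Python's standard `json` module as a stand-in for Newtonsoft.Json. The layout (`archive_md5`, `files`, `original`, `stored`) is an assumption made for illustration; the comment above does not specify a format.

```python
import json
from pathlib import Path

def write_listing(cache_dir, md5, renames):
    """Write <md5>.acm-listing.json mapping original entry names to stored names."""
    listing_path = Path(cache_dir) / f"{md5}.acm-listing.json"
    entries = [{"original": orig, "stored": stored} for orig, stored in renames.items()]
    payload = {"archive_md5": md5, "files": entries}  # hypothetical layout
    listing_path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return listing_path

def read_listing(listing_path):
    """Load the listing back into an {original: stored} dictionary."""
    data = json.loads(Path(listing_path).read_text(encoding="utf-8"))
    return {entry["original"]: entry["stored"] for entry in data["files"]}
```

Reading this file on later launches would avoid listing the archive again, which is the performance gain mentioned above.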

As a side note, I wonder if there is an issue with the MD5 of the path, like "Doom (USA).zip" that gives something like 7309402b2dbee883f0f83e3e962dff24. If you have a Sonic.7z in both Mega Drive and Master System, will it have the same MD5? I don't know if I'm right, I haven't studied the code enough, but if that's the case, maybe it would be better to include other elements in the MD5 calculation (like md5(name+filesize)).

fraganator commented 2 years ago

Thanks for the comments, I've added a few responses below. There's a lot of dev work currently going on in the multi-disc branch of the repo, and as part of that some planning for shortening the path.

I would simply use the MD5. Instead of C:\LaunchBox\ArchiveCache\QuackShot Starring Donald Duck ~ QuackShot - I Love Donald Duck - Guruzia Ou no Hihou (World) (v1.1).zip - 227F83C91E9941A6C9EC4A456B49B005\QuackShot Starring Donald Duck ~ QuackShot - I Love Donald Duck - Guruzia Ou no Hihou (World) (v1.1).md just use this: C:\LaunchBox\ArchiveCache\227F83C91E9941A6C9EC4A456B49B005\QuackShot Starring Donald Duck ~ QuackShot - I Love Donald Duck - Guruzia Ou no Hihou (World) (v1.1).md

This is my thinking at the moment. Either that, or using the Game ID to tie it back to LaunchBox. The Game ID is now included in the game.ini file on the multi-disc branch, so it can be used as the cache folder name.

There's also provision in the current code for using the first n chars of the MD5 (currently it uses all 32), so the path could be shortened a little (say using only 6 chars - QuackShot Starring Donald Duck ~ QuackShot - I Love Donald Duck - Guruzia Ou no Hihou (World) (v1.1).zip - 227F83). But I think MD5 / Game ID is better overall.

Earlier versions of the plugin didn't have a config window which listed the cached items, so it was useful to include the game name in the cache path to see what was cached when looking in File Explorer. With the cache listing in the config window, the cache path name is less important.

I would then scan the whole archive and if, even using only the MD5 as the directory, one of the extracted paths goes over 260 chars, I would rename the file to its CRC32 value (which you can get directly from the 7z l listing, no need to compute it). Why only when over 260 chars? Because you often launch a game using priority match, and in that case it's useful to have the original ROM name show up when the RetroArch emulator launches. So I would keep it as an exception, not a general rule (renaming everything could also mess with games that have file dependencies).

So let's say that "C:\LaunchBox\ArchiveCache\227F83C91E9941A6C9EC4A456B49B005\QuackShot Starring Donald Duck ~ QuackShot - I Love Donald Duck - Guruzia Ou no Hihou (World) (v1.1).md" goes over 260 chars (it's not the case here, but just for the sake of the example) and the file's CRC32 is 1801098B. I would store it as: "C:\LaunchBox\ArchiveCache\227F83C91E9941A6C9EC4A456B49B005\1801098B.md"

Rather than renaming files, another option might be to create symlinks which are shortened versions of the original filename (so the symlink path is <260 chars), and have the symlink point to the original file. The symlink would then be passed back to LaunchBox and on to the emulator. Hopefully that wouldn't mess with cue sheets.
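The symlink idea might look something like the sketch below. `make_short_link` is a hypothetical helper, not part of the plugin. Note that on Windows, creating symlinks normally requires Developer Mode or elevated privileges, and whether the link helps when the target path itself exceeds the limit would need testing.

```python
import os
from pathlib import Path

def make_short_link(target, short_name):
    """Create a symlink with a short name next to the long-named target file.

    The returned link path is what would be handed back to LaunchBox
    (and on to the emulator) instead of the original long path.
    """
    target = Path(target)
    link = target.with_name(short_name)  # same folder, shorter file name
    os.symlink(target, link)             # link -> original extracted file
    return link
```

The extracted file keeps its original name on disk, so anything that refers to it by name inside the same folder is left untouched.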

As a side note, I wonder if there is an issue with the MD5 of the path, like "Doom (USA).zip" that gives something like 7309402b2dbee883f0f83e3e962dff24. If you have a Sonic.7z in both Mega Drive and Master System, will it have the same MD5? I don't know if I'm right, I haven't studied the code enough, but if that's the case, maybe it would be better to include other elements in the MD5 calculation (like md5(name+filesize)).

This particular use case was the original reason the MD5 hash was included in the cache path, so games with the same filename from different systems didn't overwrite one another. The MD5 is based on the absolute path, so should be unique.
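To illustrate why hashing the absolute path keeps same-named archives apart, here is a small sketch. The UTF-8 encoding, the uppercase hex output, and the two example paths are assumptions; the plugin's actual hashing code may normalize the path differently.

```python
import hashlib

def path_md5(absolute_path):
    """MD5 of the archive's absolute path, as a 32-character uppercase hex string."""
    return hashlib.md5(absolute_path.encode("utf-8")).hexdigest().upper()

# Hypothetical library layout: same file name, different platform folders.
mega_drive = path_md5(r"D:\ROMs\Sega Mega Drive\Sonic.7z")
master_system = path_md5(r"D:\ROMs\Sega Master System\Sonic.7z")
print(mega_drive != master_system)
```

Because the platform folder is part of the hashed string, the two Sonic.7z archives get different cache folders even though their file names match.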

fraganator commented 2 years ago

Been thinking about this a little more. If the MD5 is shortened it increases the chance of a collision, but when combined with the filename it shouldn't be a concern. If the path is still too long, look at inserting an ellipsis in the middle of the path and reducing the length to fit. The shortened name would still retain at least the first 8 chars + extension + shortened MD5.

This retains readability while browsing the cache folder (when manually selecting a cached game or swapping discs within an emulator).

The table below lays out the options, with the original 295 char path of C:\Games\Emulation\Frontends\LaunchBox\ArchiveCache\QuackShot Starring Donald Duck ~ QuackShot - I Love Donald Duck - Guruzia Ou no Hihou (World) (v1.1).zip - 227F83C91E9941A6C9EC4A456B49B005\QuackShot Starring Donald Duck ~ QuackShot - I Love Donald Duck - Guruzia Ou no Hihou (World) (v1.1).md

| Shorten method | Example path length | Example path snippet |
| --- | --- | --- |
| None | 295 | QuackShot Starring Donald Duck ~ QuackShot - I Love Donald Duck - Guruzia Ou no Hihou (World) (v1.1).zip - 227F83C91E9941A6C9EC4A456B49B005 |
| Reduce md5 to 6 chars | 269 | QuackShot Starring Donald Duck ~ QuackShot - I Love Donald Duck - Guruzia Ou no Hihou (World) (v1.1).zip - 227F83 |
| Insert ellipsis in middle of folder name if total path > 260 chars | 259 | QuackShot Starring Donald Duck ~ QuackShot - I Love...- Guruzia Ou no Hihou (World) (v1.1).zip - 227F83 |
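The ellipsis approach could be sketched as below. This is an illustration only: `shorten_folder`, `min_head`, and `min_tail` are hypothetical names, the cut is taken as evenly as possible around the middle, and the exact split point may differ from the snippet shown in the table.

```python
ELLIPSIS = "..."
MAX_LEN = 259  # longest path that still fits under the 260-character limit

def shorten_folder(folder_name, fixed_len, min_head=8, min_tail=13):
    """Replace the middle of folder_name with an ellipsis so the full path fits.

    fixed_len is the length of everything else in the path (cache root,
    separators, extracted file name). min_tail=13 covers ".zip - 227F83".
    """
    budget = MAX_LEN - fixed_len
    if len(folder_name) <= budget:
        return folder_name
    keep = budget - len(ELLIPSIS)
    tail = max(min_tail, keep // 2)
    head = keep - tail
    if head < min_head:
        raise ValueError("path cannot be shortened enough")
    return folder_name[:head] + ELLIPSIS + folder_name[len(folder_name) - tail:]

cache_root = "C:\\Games\\Emulation\\Frontends\\LaunchBox\\ArchiveCache\\"
stem = ("QuackShot Starring Donald Duck ~ QuackShot - I Love Donald Duck"
        " - Guruzia Ou no Hihou (World) (v1.1)")
folder = f"{stem}.zip - 227F83"  # folder name with the 6-char MD5
file_name = f"{stem}.md"

fixed_len = len(cache_root) + 1 + len(file_name)  # +1 for the folder/file separator
short = shorten_folder(folder, fixed_len)
total = fixed_len + len(short)
print(short, total)
```

The start of the game name, the archive extension, and the shortened MD5 all survive, so the folder stays recognizable when browsing the cache.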