Unpackerr / unpackerr

Extracts downloads for Radarr, Sonarr, Lidarr, Readarr, and/or a Watch folder - Deletes extracted files after import
https://unpackerr.zip
MIT License
1.03k stars 36 forks

Polish support for ISO9660 file format #264

Open blackwind opened 1 year ago

blackwind commented 1 year ago
$ ls -l extracted-by-unpackerr/
total 4432364
drwxr-xr-x 2 docker everyone       4096 2023-02-01 21:43 ARTBOOK
drwxr-xr-x 2 docker everyone       4096 2023-02-01 21:43 GUIDE
drwxr-xr-x 2 docker everyone       4096 2023-02-01 21:43 MANUAL
drwxr-xr-x 4 docker everyone       4096 2023-02-01 21:43 OST
drwxr-xr-x 2 docker everyone       4096 2023-02-01 21:44 POSTER
-rw-r--r-- 1 docker everyone  243729046 2023-02-01 21:44 SETUP~01.BIN
-rw-r--r-- 1 docker everyone 4294081022 2023-02-01 21:44 SETUP_X-.BIN
-rw-r--r-- 1 docker everyone     896112 2023-02-01 21:44 SETUP_X-.EXE
drwxr-xr-x 2 docker everyone       4096 2023-02-01 21:44 WALLPAPE

$ ls -l extracted-by-winrar/
total 4432364
drwxr-xr-x 2 docker everyone       4096 2022-12-16 08:20 artbook
drwxr-xr-x 2 docker everyone       4096 2022-12-16 08:20 guide
drwxr-xr-x 2 docker everyone       4096 2022-12-16 08:20 manual
drwxr-xr-x 4 docker everyone       4096 2022-12-16 08:20 ost
drwxr-xr-x 2 docker everyone       4096 2022-12-16 08:20 poster
-rw-r--r-- 1 docker everyone 4294081022 2022-12-16 08:20 setup_x-blades_hd_1.0_(60327)-1.bin
-rw-r--r-- 1 docker everyone  243729046 2022-12-16 08:20 setup_x-blades_hd_1.0_(60327)-2.bin
-rw-r--r-- 1 docker everyone     896112 2022-12-16 08:20 setup_x-blades_hd_1.0_(60327).exe
drwxr-xr-x 2 docker everyone       4096 2022-12-16 08:20 wallpapers

davidnewhall commented 1 year ago

Thank you! I have a few points to make, but I'm super busy and will catch up on this soon!

davidnewhall commented 1 year ago

Found this in someone else's log.

unpackerr-2023-02-06T18-44-33.236.log:2023/02/06 03:27:53 Extraction Error: Pokemon.Origins.S01.2013.ANiME.DUAL.COMPLETE.BLURAY-iFPD: failed to open iso image: /downloads/tv-sonarr/Pokemon.Origins.S01.2013.ANiME.DUAL.COMPLETE.BLURAY-iFPD/ifpd-pokemonoriginss01-bluray.iso: volume descriptor "BEA01" != "CD001"

davidnewhall commented 1 year ago

Man, this is a rough one. I thought, back when you opened this issue, I found another ISO library for Go (seo: golang). Today, I'm only finding 3 libraries, and I seem to be using the 'best' one. None of them support Joliet file extensions, which means file names over 32 characters are out. The bugs you've run into seem to be directly in this library. I don't believe I can fix them myself. I'm also afraid the 4 GB file limitation is built into the library, but I think it may be inadvertently used for extractions when it should be used for compression. Not entirely sure yet.

Question for ya @blackwind .. if I give you a spot to upload, can you send me an ISO file or two that didn't work? I'll try to engage with @kdomanski once I have a reproducible example to share with him.

This is the library I'm using now:

These are the other two I found:

EDIT: Found more that are 2+ years old:

If anyone finds a good ISO9660 library for Go.. lemme know.

blackwind commented 1 year ago

Proper support for these files will be a huge time-saver for me, so absolutely, I'm happy to help in any way I can. The one I used in my log (X-Blades_HD-DINOByTES) is a good example of all mentioned issues and is available in the obvious places, but I'll do the legwork if you need me to for whatever reason.

davidnewhall commented 1 year ago

I'll try that file with a few of these libraries. Will see if anything can extract it.

kdomanski commented 1 year ago

> Found this in someone else's log.
>
> unpackerr-2023-02-06T18-44-33.236.log:2023/02/06 03:27:53 Extraction Error: Pokemon.Origins.S01.2013.ANiME.DUAL.COMPLETE.BLURAY-iFPD: failed to open iso image: /downloads/tv-sonarr/Pokemon.Origins.S01.2013.ANiME.DUAL.COMPLETE.BLURAY-iFPD/ifpd-pokemonoriginss01-bluray.iso: volume descriptor "BEA01" != "CD001"

That's a UDF descriptor.

davidnewhall commented 1 year ago

It's funny, because the ISO9660 library I currently use just released a new version. Literally the only significant change they made was to add an error that says "UDF volumes are not supported." rip. I'm messing with this a little bit today, but I'm not very optimistic. :(

davidnewhall commented 1 year ago

I'm stumped at this point. The only actively maintained libraries I can find do not support 2 or more of:

I did find 1 old library that has Rock Ridge support, and 1 that supports UDF, but I don't think I found any that support Joliet.

Ideally, the Rock Ridge support can be ported into https://github.com/kdomanski/iso9660 or https://github.com/diskfs/go-diskfs or both.

kdomanski commented 1 year ago

Funny indeed. Maybe the author got a notification when you mentioned him, and saw your post.

The 1 library with Rock Ridge support that you linked, it only gets the full filename from the RR data, but not timestamps. Looks like the master branch of the library you use already has RR test fixture added, so full RR support might drop any time. Supporting Joliet might then be redundant for your usecase, we'll see.

As for files larger than 4GB, this requires support for multi-extent descriptors. It's not hard to implement, but it requires a bit of free time.

davidnewhall commented 1 year ago

> Funny indeed. Maybe the author got a notification when you mentioned him, and saw your post.

Love it. Thanks for stopping by!

> Looks like the master branch of the library you use already has RR test fixture added,

haha, don't be so modest. I see your recent commits (now), and am very pleased!

> Supporting Joliet might then be redundant for your usecase

I hope so. Seems like rock ridge will give us what we're missing.

> As for files larger than 4GB, this requires support for multi-extent descriptors.

Give me a pointer or two? I'm willing to try if you think it might be a worthwhile use of my time. This is probably the last "hurdle."

EDIT: derp moment. Just realized who I replied to earlier. haha

EDIT2: and now realizing the new release you made was probably because of this issue, and that error message you quoted. Thank you :)

kdomanski commented 1 year ago

> Give me a pointer or two? I'm willing to try if you think it might be a worthwhile use of my time. This is probably the last "hurdle."

ECMA-119 9.1.6: multi-extent flag. ECMA-119 6.5.1 "Each file shall consist of one or more File Sections."

It's not very explicit, but I infer that maybe it means a multi-extent file has several consecutive Directory Records and the flag turned on.

The Linux Kernel's code for this appears to interpret this flag as an indication of the given DE not being the last one for the file.
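A rough Go sketch of that interpretation, assuming consecutive directory records belong to the same file as long as bit 7 (0x80) of the file flags is set. Each record's data length is a 32-bit field, which is why a single extent cannot describe 4 GiB or more. The `dirRecord`, `extent`, `file`, and `mergeExtents` names are hypothetical and are not taken from any of the libraries discussed here.

```go
package main

import "fmt"

const (
	sectorSize      = 2048
	flagMultiExtent = 0x80 // ECMA-119 9.1.6, bit 7: not the final record for this file
)

// dirRecord holds only the directory record fields this sketch needs.
type dirRecord struct {
	Name      string
	ExtentLBA uint32 // first logical block of the extent
	DataLen   uint32 // 32-bit length, so one extent stays below 4 GiB
	Flags     byte
}

// extent is one contiguous byte range inside the image.
type extent struct {
	Offset int64
	Length int64
}

// file is a logical file assembled from one or more extents.
type file struct {
	Name    string
	Size    int64
	Extents []extent
}

// mergeExtents folds runs of consecutive records into logical files. A record
// with the multi-extent flag set means more records follow for the same file.
func mergeExtents(records []dirRecord) []file {
	var files []file
	var cur *file
	for _, r := range records {
		if cur == nil {
			cur = &file{Name: r.Name}
		}
		cur.Extents = append(cur.Extents, extent{
			Offset: int64(r.ExtentLBA) * sectorSize,
			Length: int64(r.DataLen),
		})
		cur.Size += int64(r.DataLen)
		if r.Flags&flagMultiExtent == 0 {
			files = append(files, *cur)
			cur = nil
		}
	}
	// A trailing record with the flag still set is malformed; keep what was gathered.
	if cur != nil {
		files = append(files, *cur)
	}
	return files
}

func main() {
	// Made-up records: one file split across two extents, then a small file.
	records := []dirRecord{
		{Name: "BIG.BIN", ExtentLBA: 100, DataLen: 0xFFFFF800, Flags: flagMultiExtent},
		{Name: "BIG.BIN", ExtentLBA: 2097251, DataLen: 1000000},
		{Name: "SMALL.TXT", ExtentLBA: 2097740, DataLen: 42},
	}
	for _, f := range mergeExtents(records) {
		fmt.Printf("%s: %d bytes in %d extent(s)\n", f.Name, f.Size, len(f.Extents))
	}
}
```

An extractor built this way would copy each extent in order into the same output file instead of treating every record as a separate file.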

> > Supporting Joliet might then be redundant for your usecase
>
> I hope so. Seems like rock ridge will give us what we're missing.

Looks like (outside of some edge cases) Linux will use RR and ignore Joliet if both are present.

davidnewhall commented 1 year ago

@kdomanski You're right, overall this doesn't look too hard. It's going to take me a bit to come up to speed on this, but I've got a couple hours into it now and may be able to get there. Here's where I'm at...

None of the files have dirFlagMultiExtent set in their FileFlags. This image doesn't actually seem to have any files larger than 4 GB, so I will keep looking for one.

de.SystemUse is also empty, so I don't seem to get Rock Ridge file names. It could be that this image has two volumes and doesn't do rock ridge. Have you figured out how to access that second volume yet?
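For reference, Rock Ridge keeps long names in "NM" entries inside the System Use area of each directory record, so an empty area means there is nothing to read. Each entry starts with a 2-byte signature, a length byte, and a version byte; an NM entry follows that with a flags byte and the name bytes, and bit 0 of the flags says the name continues in the next NM entry. A minimal sketch of walking those entries is below. `rockRidgeName` and `nmEntry` are hypothetical names, and the sketch assumes no SP skip bytes and no CE continuation areas, both of which a real parser would have to handle.

```go
package main

import "fmt"

// rockRidgeName walks the entries in a System Use area and joins the name
// parts stored in NM entries. It returns "" when no NM entry is present.
func rockRidgeName(systemUse []byte) string {
	var name []byte
	for len(systemUse) >= 4 {
		sig := string(systemUse[:2])
		length := int(systemUse[2]) // entry length, header included
		if length < 4 || length > len(systemUse) {
			break // padding or a malformed entry
		}
		if sig == "NM" && length >= 5 {
			flags := systemUse[4]
			name = append(name, systemUse[5:length]...)
			if flags&0x01 == 0 { // CONTINUE bit clear: the name is complete
				break
			}
		}
		systemUse = systemUse[length:]
	}
	return string(name)
}

// nmEntry builds one NM entry for the demo (hypothetical helper).
func nmEntry(flags byte, part string) []byte {
	e := []byte{'N', 'M', byte(5 + len(part)), 1, flags}
	return append(e, part...)
}

func main() {
	// A made-up name split across two NM entries.
	su := append(nmEntry(0x01, "a_long_file_"), nmEntry(0, "name.bin")...)
	fmt.Println(rockRidgeName(su))
	// An empty System Use area yields an empty name.
	fmt.Printf("%q\n", rockRidgeName(nil))
}
```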

Here's my "update" to do some debugging: https://github.com/kdomanski/iso9660/commit/45c0c7da7dddbfbecfe7910a071b01d36e0e7080

I ran this new code against the ISO file mentioned earlier in the thread. Here's the whole output: https://gist.github.com/davidnewhall/b67c6fdf1c942fb8d8026ba1a42fad25

This is what it looks like mounted on my Mac:

[Screenshot of the mounted image]

...which makes me want to ask: Is the volume name exposed by this library yet? (the name in the title)

kdomanski commented 1 year ago

Hmm, this might be a Joliet-only image. I'll look into the dump you provided.

> Is the volume name exposed by this library yet? (the name in the title)

it is now. ;-) https://github.com/kdomanski/iso9660/releases/tag/v0.3.5

davidnewhall commented 1 year ago

amazing!

blackwind commented 1 year ago

Any further progress on this? Or are we blocked indefinitely?

davidnewhall commented 1 year ago

No one has ever extracted or created these 'advanced' format ISO images with Go apps. This is all new. kdomanski is the only person that's put together a comprehensive library that will one day provide these features. Today, it does not. I haven't had time to visit this. I have dozens of projects, and this feature is a lot of work, so it will be a while before I'm intrigued enough to spend the time required.

There has been no further progress at this time.

blackwind commented 1 year ago

If you detected impatience in my tone, none was intended. I appreciate the update and all the work done on this so far.

kdomanski commented 1 year ago

Sup. Release v0.4.0 can read Rock Ridge filenames. Looking forward to your feedback (and bug reports 😉 ).

davidnewhall commented 1 year ago

There's probably more I can do here, but I updated the library and pushed some updates. You can download it here https://unstable.golift.io - thanks Kamil!

EDIT: Docker is ready.

blackwind commented 1 year ago

Unable to test until the Docker image is available, but it sounds like no more filename truncation, no more incorrect filename casing, but the other issues persist. I've marked the completed tasks in the first post.

blackwind commented 1 year ago

Currently just getting "UDF volumes are not supported", which I guess is an improvement over the old behavior.

davidnewhall commented 1 year ago

What is the image you're testing? UDF is probably another problem that needs a solution.

blackwind commented 1 year ago

Tried a few, but Stray.v1.5-Razor1911 is a well-sized one for testing the 4GB issue as well.