ids1024 / iso9660-rs

Rust library for reading iso9660 filesystems
Apache License 2.0

Don't fail if filename doesn't have a version number #1

Closed (losynix closed 2 years ago)

losynix commented 2 years ago

Hello, thank you for your work on this crate.

I used it to read an old ISO of Windows XP and it failed because the filenames don't have a version number at the end. Maybe Microsoft only implemented a subset of the standard(?). Anyway, here is a patch that assumes '1' as the version number if it's not present, instead of returning an error. (I could read the ISO successfully with this.)
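A minimal sketch of the tolerant behaviour described above, for illustration only. The function name `split_file_identifier` is hypothetical and is not the crate's actual API; the only assumption is that the version, when present, follows the last `;` in the identifier.

```rust
/// Split an ISO 9660 file identifier into (name, version).
/// Hypothetical helper for illustration, not the crate's real parser.
fn split_file_identifier(ident: &str) -> Result<(&str, u16), std::num::ParseIntError> {
    match ident.rsplit_once(';') {
        // Standard form: NAME.EXTENSION;VERSION
        Some((name, version)) => Ok((name, version.parse()?)),
        // Tolerant fallback: no ";VERSION" suffix, so assume version 1
        None => Ok((ident, 1)),
    }
}
```

With this shape, a malformed version such as `SETUP.EXE;x` is still reported as an error; only a missing version is tolerated.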

Thanks again

ids1024 commented 2 years ago

Looking at the ECMA-119 spec, volume descriptors should always have a version, but "enhanced volume descriptors" do not. So perhaps the file is conformant, but this library needs to recognize and handle "enhanced volume descriptors"?

losynix commented 2 years ago

Yes, you're right. From what I understand of the standard, the Primary Volume Descriptor and the Supplementary/Enhanced Volume Descriptor are quite similar except for the file identifier format: the Primary requires NAME[SEP1]EXTENSION[SEP2]VERSION, while the Supplementary has no particular requirements but may use a different character set (G0 or G1 of ISO 2022) if escape sequences are present at BP 89-120.

But in the case of this Windows XP ISO, even though we're only parsing the Primary Volume Descriptor (Supplementary/Enhanced is not handled, see type_code '2' in src/parse/volume_descriptor.rs), the file identifiers don't include a version. I think this is because the ISO uses the Joliet specification, which lifts the requirements on file identifiers even for the Primary Volume Descriptor.

While I guess there would be more work to do to completely support the Joliet specification, I thought we could be tolerant here: a quick patch doesn't break the lib for ISOs following the vanilla standard, and it allows reading (at least some) Joliet ISOs. I agree this is not ideal and the specification should be fully implemented, but I don't have time to do it at the moment.
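For reference, here is a rough sketch of how a Joliet supplementary volume descriptor (type code 2, escape sequences at BP 89-120) could be recognized. The function name `joliet_ucs_level` is hypothetical, and the sketch assumes the escape sequence sits at the start of the field, which is the usual layout; it is not how this crate currently handles descriptors.

```rust
/// Return the Joliet UCS level (1, 2 or 3) if `descriptor` looks like a
/// Joliet supplementary volume descriptor, otherwise None.
/// Hypothetical helper; assumes the escape sequence starts the field.
fn joliet_ucs_level(descriptor: &[u8]) -> Option<u8> {
    // Byte 0 is the type code (2 = supplementary/enhanced),
    // bytes 1..6 hold the standard identifier "CD001".
    if descriptor.len() < 120 || descriptor[0] != 2 || &descriptor[1..6] != b"CD001" {
        return None;
    }
    // BP 89-120 is 1-based, so the field is bytes 88..120 when 0-based.
    match &descriptor[88..91] {
        [0x25, 0x2F, 0x40] => Some(1), // "%/@"
        [0x25, 0x2F, 0x43] => Some(2), // "%/C"
        [0x25, 0x2F, 0x45] => Some(3), // "%/E"
        _ => None,
    }
}
```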

ids1024 commented 2 years ago

Ah right, the library still lacks Joliet and Rock Ridge support, among all the other things. Though I think that should be backwards compatible... For instance, as described by Wikipedia:

Joliet accomplishes this by supplying an additional set of filenames that are encoded in UCS-2BE (UTF-16BE in practice since Windows 2000). These filenames are stored in a special supplementary volume descriptor, that is safely ignored by ISO 9660-compliant software, thus preserving backward compatibility.

That sounds different from what's going on here.
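To illustrate the encoding mentioned in the quote, the following sketch decodes a big-endian UTF-16 name into a Rust `String`. The function name `decode_joliet_name` is made up for this example, and it only covers the byte-to-string step, not locating the names on disk.

```rust
/// Decode a Joliet file name stored as UCS-2BE / UTF-16BE bytes.
/// Hypothetical helper; returns None for odd-length or invalid input.
fn decode_joliet_name(raw: &[u8]) -> Option<String> {
    // Each code unit is two bytes, so an odd length cannot be valid.
    if raw.len() % 2 != 0 {
        return None;
    }
    // Reassemble big-endian 16-bit code units.
    let units: Vec<u16> = raw
        .chunks_exact(2)
        .map(|pair| u16::from_be_bytes([pair[0], pair[1]]))
        .collect();
    // from_utf16 rejects unpaired surrogates.
    String::from_utf16(&units).ok()
}
```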

ids1024 commented 2 years ago

Perhaps this is the non-backwards compatible "Romeo" extension, rather than "Joliet", which apparently is also supported by Windows 95+.

losynix commented 2 years ago

I agree this is weird, because I'm only reading the Primary Volume Descriptor. I got more information with the isoinfo command from cdrtools:

$ isoinfo -i Windows_XP_Pro_SP3_FR.iso -d
CD-ROM is in ISO 9660 format
System id: LINUX
Volume id: GRTMPVOL_FR
Volume set id:
Publisher id:
Data preparer id:
Application id: GENISOIMAGE ISO 9660/HFS FILESYSTEM CREATOR (C) 1993 E.YOUNGDALE (C) 1997-2006 J.PEARSON/J.SCHILLING (C) 2006-2007 CDRKIT TEAM
Copyright File id:
Abstract File id:
Bibliographic File id:
Volume set size is: 1
Volume set sequence number is: 1
Logical block size is: 2048
Volume size is: 315686
El Torito VD version 1 found, boot catalog is in sector 685

Joliet with UCS level 3 found.
No SUSP/Rock Ridge present
Eltorito validation header:
    Hid 1
    Arch 0 (x86)
    ID ''
    Cksum AA 55 OK
    Key 55 AA
    Eltorito defaultboot header:
        Bootid 88 (bootable)
        Boot media 0 (No Emulation Boot)
        Load segment 7C0
        Sys type 0
        Nsect 4
        Bootoff 2AE 686

No sign of the Romeo extension, but Joliet seems to be used. The file was created with genisoimage, and according to the man page it has an option to omit the version in filenames:

-N
    Omit version numbers from ISO9660 filenames.
    This violates the ISO9660 standard, but no one really uses the version numbers anyway. Use with caution. 

Maybe this ISO was explicitly generated without versions in filenames (but it does respect the other rules of the standard Primary Volume Descriptor, e.g. filenames are restricted to uppercase letters, numbers and underscores).
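As a rough illustration of that observation, the check below accepts identifiers made of uppercase letters, digits and underscores, with one optional `.` separator and an optional `;VERSION` suffix. Both function names are hypothetical and the rules are simplified (no length limits, for instance); this is not taken from the crate.

```rust
/// True for the characters allowed in primary-descriptor file names:
/// uppercase ASCII letters, digits and underscore.
fn is_d_character(c: char) -> bool {
    c.is_ascii_uppercase() || c.is_ascii_digit() || c == '_'
}

/// Hypothetical, simplified check: NAME[.EXTENSION][;VERSION],
/// where the version suffix may be missing entirely.
fn is_primary_style_identifier(ident: &str) -> bool {
    // Separate an optional ";VERSION" suffix from the rest.
    let (base, version) = match ident.rsplit_once(';') {
        Some((b, v)) => (b, Some(v)),
        None => (ident, None),
    };
    let version_ok =
        version.map_or(true, |v| !v.is_empty() && v.chars().all(|c| c.is_ascii_digit()));
    // Split at the first '.', so a second '.' ends up in `ext` and is rejected.
    let mut parts = base.splitn(2, '.');
    let name = parts.next().unwrap_or("");
    let ext = parts.next().unwrap_or("");
    version_ok
        && !(name.is_empty() && ext.is_empty())
        && name.chars().all(is_d_character)
        && ext.chars().all(is_d_character)
}
```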

ids1024 commented 2 years ago

Okay, so no particular extension, just a popular way to violate the standard. Then I guess this change is reasonable as is.

Ah, standards. Too bad no one follows them (though I have no idea why they thought files on CDs needed "versions" anyway).