Open vadimkantorov opened 4 weeks ago
Trying another library, pycdlib (https://clalancette.github.io/pycdlib/), I am getting the following (the README gets printed correctly!):
For the introductory information to TeX Live, see the directories
readme-txt.dir (plain text files) or readme-html.dir/ (HTML files).
The material is available in several languages.
import io
import pycdlib

iso = pycdlib.PyCdlib()
iso.open('texlive2024-20240312.iso')
# List the root directory through the Joliet tree: name, byte offset, size.
for child in iso.list_children(joliet_path='/'):
    print(child.file_identifier(), child.orig_extent_loc * iso.logical_block_size, child.data_length)
# Extract README into memory and print its contents.
extracted = io.BytesIO()
iso.get_file_from_iso_fp(extracted, joliet_path='/README')
print(extracted.getvalue().decode('utf-8'))
iso.close()
I compared the block index from orig_extent_loc with the one from tb9660, and they appear to be the same (assuming block size 2048), so it is quite unclear why tb9660 prints gibberish :(
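One way to narrow this down is to read the bytes straight from the image at the computed offset, without going through either library. The sketch below assumes a 2048-byte block size, as in the comparison above; the helper names are made up for illustration. The last helper shows what a byte offset would become if it were truncated to 32 bits. That is only a hypothesis worth ruling out for offsets past 4 GiB and is not confirmed anywhere in this thread.

```python
BLOCK_SIZE = 2048  # assumed logical block size


def extent_to_offset(extent_loc, block_size=BLOCK_SIZE):
    """Byte offset of an extent: block index times block size."""
    return extent_loc * block_size


def read_extent(path, extent_loc, length, block_size=BLOCK_SIZE):
    """Read `length` bytes at the given extent directly from the image file."""
    with open(path, "rb") as f:
        f.seek(extent_to_offset(extent_loc, block_size))
        return f.read(length)


def truncate_to_uint32(offset):
    """What the offset would be if stored in a 32-bit unsigned integer (hypothesis only)."""
    return offset & 0xFFFFFFFF


if __name__ == "__main__":
    offset = extent_to_offset(2722717)
    print(offset)                      # byte offset for the README extent
    print(offset >= 2**32)             # True means the offset does not fit in 32 bits
    print(truncate_to_uint32(offset))  # where a 32-bit seek would land instead
```

Calling read_extent('texlive2024-20240312.iso', 2722717, 182) and comparing the result with the README text would show whether the extent and size are right and the problem lies in how the bytes are fetched.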
I just tried out this library and ran into the same issue. It turns out that, per the ISO 9660 spec, filenames can end in a semicolon followed by a version number, so README.TXT;1 is to be expected. If you try tb9660 <iso_name> cat README.TXT, I believe you should get the expected result.
When using this library, you would probably want to strip the semicolon and everything after it from the filename.
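A minimal sketch of that truncation, in Python for illustration. The function name is hypothetical and not part of lib9660. Dropping the trailing dot is an extra assumption, based on the extension-less README showing up as 'README.;1' in this thread.

```python
def strip_version(identifier):
    """Remove a trailing ';<version>' from an ISO 9660 file identifier.

    Also drops a dot left dangling at the end, as in 'README.;1'
    (an assumption about how extension-less names are stored).
    """
    name, sep, version = identifier.rpartition(";")
    if sep and version.isdigit():
        identifier = name
    if identifier.endswith("."):
        identifier = identifier[:-1]
    return identifier


if __name__ == "__main__":
    for raw in ("README.TXT;1", "README.;1", "README"):
        print(raw, "->", strip_version(raw))
```

Identifiers without a version suffix pass through unchanged, so the helper can be applied to every directory entry before comparing names.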
Well, for filenames it's the story of Joliet/RockRidge extensions and so forth.
L9660_DEBUG=1 ./tb9660 ../texlive2024-20240312.iso cat 'README.TXT'
does not print anything useful.
L9660_DEBUG=1 ./tb9660 ../texlive2024-20240312.iso cat 'README.;1'
does something non-trivial and prints the offset/size correctly, but doesn't print the contents correctly - instead prints something scrambled :(
Hi!
First, thanks for sharing your ISO format library. I'm a big fan of this approach and of single-file projects, e.g. https://github.com/richgel999/miniz/.
I'm trying to use tb9660 to work with the large TeX Live distribution in ISO form (being able to mmap/open files from these large ISOs would be great, and would save the time and space of extracting them when one does not have mount permissions): https://tug.ctan.org/systems/texlive/Images/texlive2024-20240312.iso
Calling
./tb9660 ../texlive2024-20240312.iso ls .
gives:
It seems that tb9660 does not auto-detect Joliet/RockRidge?
I've also tried printing the README file as
./tb9660 ../texlive2024-20240312.iso cat 'README.;1'
and this prints gibberish:
while in fact the README file contains the following:
Calling it as
L9660_DEBUG=1 ./tb9660 ../texlive2024-20240312.iso cat 'README.;1'
gives:
So tb9660 correctly computes the README file size as 182, but probably (?) calculates the offset incorrectly and prints the contents of some other file? (In a follow-up comment, I found that the sector index is indeed 2722717, so with block size 2048 it would give the correct byte offset 5576124416, so it is unclear why tb9660 prints gibberish :( )
Can one use nice Joliet file names as input to cat, or at least have them decoded by lib9660 structures (essentially I'll need to get file offsets by a proper UTF-8 Joliet name/path)? Or is Joliet not supported at all?
Thank you!
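For reference, Joliet stores file identifiers as big-endian UCS-2, so the decoding step itself is small. The sketch below only illustrates that step and a name-to-offset lookup built on it. Whether lib9660 exposes the Joliet directory records at all is not established in this thread, and the record tuples here are sample data in a made-up layout, not a lib9660 structure.

```python
def decode_joliet_identifier(raw):
    """Decode a Joliet file identifier (big-endian UCS-2 bytes) into a str."""
    return raw.decode("utf-16-be")


def build_offset_index(records, block_size=2048):
    """Map decoded names to (byte offset, size).

    `records` is an iterable of (raw_identifier, extent_loc, data_length)
    tuples, a hypothetical layout chosen for this example.
    """
    index = {}
    for raw_name, extent_loc, data_length in records:
        name = decode_joliet_identifier(raw_name)
        index[name] = (extent_loc * block_size, data_length)
    return index


if __name__ == "__main__":
    # Sample record using the README extent and size reported in this thread.
    sample = [("README".encode("utf-16-be"), 2722717, 182)]
    index = build_offset_index(sample)
    print(index["README"])
    # The decoded str can be turned into UTF-8 bytes when needed.
    print(decode_joliet_identifier(sample[0][0]).encode("utf-8"))
```

With such an index, a file can be located by its readable name and then read or mmapped at the stored offset.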