kaitai-io / kaitai_struct_formats

Kaitai Struct: library of binary file formats (.ksy)
http://formats.kaitai.io
Dir Entries limit 50 #679

Open MardanovaA opened 1 year ago

MardanovaA commented 1 year ago

Hello. I am using Kaitai Struct to parse an ISO file from JavaScript, following this article. But I have a problem with a folder that contains more than 50 files. If I use code like this:

var parsedIso = new Iso9660(new KaitaiStream(arrayBuffer));
const myDir = parsedIso.primaryVolDesc.volDescPrimary.rootDir.body.extentAsDir.entries;
console.log(myDir.length)

it returns only the first 50 files, although the directory contains more. Other directories are fine.

Please, any help?

generalmimon commented 1 year ago

@MardanovaA:

It returns only the first 50 files, although the directory contains more. Other directories are fine.

I can confirm that I can reproduce the issue - I generated a sample .iso with 100 files in the root directory like this:

pp@DESKTOP-89OPGF3:/mnt/c/temp/ksf-samples/iso9660/issue-1040$ mkdir contents
pp@DESKTOP-89OPGF3:/mnt/c/temp/ksf-samples/iso9660/issue-1040$ cd contents
pp@DESKTOP-89OPGF3:/mnt/c/temp/ksf-samples/iso9660/issue-1040/contents$ for n in {00..99}; do echo "$n" > "$n".txt; done
pp@DESKTOP-89OPGF3:/mnt/c/temp/ksf-samples/iso9660/issue-1040/contents$ ls
00.txt  10.txt  20.txt  30.txt  40.txt  50.txt  60.txt  70.txt  80.txt  90.txt
01.txt  11.txt  21.txt  31.txt  41.txt  51.txt  61.txt  71.txt  81.txt  91.txt
02.txt  12.txt  22.txt  32.txt  42.txt  52.txt  62.txt  72.txt  82.txt  92.txt
03.txt  13.txt  23.txt  33.txt  43.txt  53.txt  63.txt  73.txt  83.txt  93.txt
04.txt  14.txt  24.txt  34.txt  44.txt  54.txt  64.txt  74.txt  84.txt  94.txt
05.txt  15.txt  25.txt  35.txt  45.txt  55.txt  65.txt  75.txt  85.txt  95.txt
06.txt  16.txt  26.txt  36.txt  46.txt  56.txt  66.txt  76.txt  86.txt  96.txt
07.txt  17.txt  27.txt  37.txt  47.txt  57.txt  67.txt  77.txt  87.txt  97.txt
08.txt  18.txt  28.txt  38.txt  48.txt  58.txt  68.txt  78.txt  88.txt  98.txt
09.txt  19.txt  29.txt  39.txt  49.txt  59.txt  69.txt  79.txt  89.txt  99.txt
pp@DESKTOP-89OPGF3:/mnt/c/temp/ksf-samples/iso9660/issue-1040/contents$ cd ..
pp@DESKTOP-89OPGF3:/mnt/c/temp/ksf-samples/iso9660/issue-1040$ genisoimage -o test.iso contents/
I: -input-charset not specified, using utf-8 (detected in locale settings)
Total translation table size: 0
Total rockridge attributes bytes: 0
Total directory bytes: 0
Path table size(bytes): 10
Max brk space used 0
276 extents written (0 MB)
pp@DESKTOP-89OPGF3:/mnt/c/temp/ksf-samples/iso9660/issue-1040$ genisoimage --version
genisoimage 1.1.11 (Linux)
pp@DESKTOP-89OPGF3:/mnt/c/temp/ksf-samples/iso9660/issue-1040$ uname -a
Linux DESKTOP-89OPGF3 5.15.90.1-microsoft-standard-WSL2 #1 SMP Fri Jan 27 02:56:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Result (gzipped): test.iso.gz

Then I loaded this .iso file to the Web IDE with iso9660.ksy, but unfortunately the YAML parser embedded in the Web IDE screwed up again (the bug looks similar to this) and I had to patch the .ksy spec because of that:

 instances:
   sector_size:
     value: 2048
   primary_vol_desc:
-    pos: 0x010 * sector_size
+    pos: '0x010 * sector_size'
     type: vol_desc

After that, primaryVolDesc.volDescPrimary.rootDir.body.extentAsDir.entries indeed contains only 50 entries as you describe. The last entry (at index 49) has len = 0, which matches the terminating condition for parsing dir_entries:

[Screenshot: Web IDE with the iso9660_modified.ksy spec and the test.iso sample file, which has 100 file entries in the root directory]

iso9660.ksy:131-136

  dir_entries:
    seq:
      - id: entries
        type: dir_entry
        repeat: until
        repeat-until: _.len == 0

I'm not really familiar with the ISO 9660 format, but in my test file the directory entries start at 0xb800 and end at 0xbffa (as you can see in the screenshot). The size of one sector that ISO 9660 works with happens to be 0x800 (2048), so the last byte of the sector that starts at 0xb800 is at position 0xb800 + 0x800 - 1 = 0xbfff. In other words, the parsed entries stop just a few bytes before the end of that sector.

So it appears to me that the format tries to fit as many directory entries as possible into one sector, and when there isn't enough space in the sector to start another entry, it fills the rest of the sector with zeros (00 bytes) and continues in the next sector. Unfortunately, this apparently hasn't been implemented in the iso9660.ksy spec yet (it only reads the entries in the first sector, then it reaches the padding and stops).
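The sector arithmetic above can be checked with a small JavaScript sketch. The helper names `lastByteOfSector` and `bytesLeftInSector` are hypothetical and exist only for this illustration; the sketch also assumes that 0xbffa is counted as the first byte of the zero padding. For comparison, an ISO 9660 directory record needs at least 34 bytes (33 fixed bytes plus a file identifier of at least one byte), so the few bytes left over cannot hold another entry.

```javascript
// Sector size that ISO 9660 works with in the sample image (2048 bytes).
const SECTOR_SIZE = 0x800;

// Hypothetical helper: offset of the last byte of the sector containing `offset`.
function lastByteOfSector(offset, sectorSize = SECTOR_SIZE) {
  const sectorStart = Math.floor(offset / sectorSize) * sectorSize;
  return sectorStart + sectorSize - 1;
}

// Hypothetical helper: number of bytes from `offset` (inclusive)
// up to the end of the sector containing it.
function bytesLeftInSector(offset, sectorSize = SECTOR_SIZE) {
  return lastByteOfSector(offset, sectorSize) - offset + 1;
}

console.log(lastByteOfSector(0xb800).toString(16)); // bfff
console.log(bytesLeftInSector(0xbffa));             // 6
```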

generalmimon commented 1 year ago

https://github.com/kaitai-io/kaitai_struct_formats/pull/82 should improve the ISO 9660 support, but according to this comment by @armijnhemel, it doesn't seem to address this particular issue yet:

          - id: directory_record
            type: directory_record
            repeat: until
            repeat-until: _.len_dr == 0

This is actually not correct, as there can be additional padding bytes. Section 6.8.1.1 of the ISO9660 spec says:

Each Directory Record shall end in the Logical Sector in which it begins. Unused byte positions after the last Directory Record in a Logical Sector shall be set to (00).

So this means that you also need to look at the remaining bytes in the logical sector as well as the total length of the directory records.
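As a rough illustration of that rule, the following JavaScript sketch walks a raw directory extent and skips zero padding up to the next sector boundary instead of stopping at the first zero length byte. It is not part of Kaitai Struct or of iso9660.ksy: the function name `findDirRecordOffsets` is hypothetical, and it assumes the caller already has the whole directory extent (with its total length taken from the directory's own record) as a `Uint8Array`, and that the first byte of each record is the record length.

```javascript
// Sketch: collect the start offsets of all directory records in an extent.
// `extent` holds the bytes of the whole directory extent (assumption: the
// caller sliced it using the extent location and size from the parent record).
function findDirRecordOffsets(extent, sectorSize = 2048) {
  const offsets = [];
  let pos = 0;
  while (pos < extent.length) {
    const len = extent[pos]; // first byte of a record is its length
    if (len === 0) {
      // Zero padding: no more records in this sector, jump to the next one.
      pos = (Math.floor(pos / sectorSize) + 1) * sectorSize;
      continue;
    }
    if (pos + len > extent.length) {
      throw new RangeError(`record at offset ${pos} runs past the extent`);
    }
    offsets.push(pos);
    pos += len;
  }
  return offsets;
}

// Tiny synthetic example with 16-byte "sectors" and 6-byte records:
// two records, 4 bytes of padding, then one record in the second sector.
const demo = new Uint8Array(32);
demo[0] = 6;
demo[6] = 6;
demo[16] = 6;
console.log(findDirRecordOffsets(demo, 16)); // [0, 6, 16]
```

A loop that stops at the first zero length, like the `repeat-until` condition quoted above, would report only the first two records of this synthetic extent.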