kaitai-io / kaitai_struct

Kaitai Struct: declarative language to generate binary data parsers in C++ / C# / Go / Java / JavaScript / Lua / Nim / Perl / PHP / Python / Ruby
https://kaitai.io

Error while parsing: attempted to read 1 bytes, got only 0 #1071

Status: Closed (bersbersbers closed this issue 1 year ago)

bersbersbers commented 2 years ago

The following format breaks parsing on byte0:

[-] [root]        00000000: 00 00 00 00 00 00 00
  [-] data        00000010:
    [?] byte0     00000020:
    [-] byte1     00000030:
      [.] ╔═════════ Error while parsing ═════════╗
          ║attempted to read 1 bytes, got only 0  ║
          ║                                       ║
          ║                [ OK ]                 ║
          ╚═══════════════════════════════════════╝

meta:
  id: bug
seq:
  - id: data
    type: twobytes
types:
  twobytes:
    instances:
      byte0:
        type:
          switch-on: byte1.bit0
          cases:
            true: u1
            false: u1

      byte1:
        pos: 1
        type: byte1
  byte1:
    seq:
      - id: bit0
        type: b1

This one works, by contrast:

meta:
  id: bug
seq:
  - id: data
    type: twobytes
types:
  twobytes:
    seq:
      - id: byte0
        type:
          switch-on: byte1.bit0
          cases:
            true: u1
            false: u1
    instances:
      byte1:
        pos: 1
        type: byte1
  byte1:
    seq:
      - id: bit0
        type: b1

And most interestingly, both work in https://ide.kaitai.io/

generalmimon commented 1 year ago

Duplicate of https://github.com/kaitai-io/kaitai_struct/issues/544

generalmimon commented 1 year ago

See also https://github.com/kaitai-io/kaitai_struct/issues/1030#issuecomment-1517704699:

I guess the general consensus is that "parse" instances without pos must not do directly any parsing (although the compiler again doesn't implement any checks to enforce that, but hopefully it will in the future). So they shouldn't be allowed to use any types that permanently change the state of the current stream.

In this case, since you didn't specify any pos for the byte0 parse instance:

    instances:
      byte0:
        # no `pos` here!
        type:
          switch-on: byte1.bit0
          cases:
            true: u1
            false: u1

... it will be parsed from an arbitrary "current" position, i.e. wherever the stream happens to be at that moment. In ksv, this happens to be 2 (i.e. at the end of the stream); in the Web IDE it's 0, which is why it works there and not in ksv.
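
If byte0 is meant to be the first byte of the stream, one possible fix is to give the instance an explicit pos so that it no longer depends on the current stream position. The following is only a sketch and has not been tested against ksv or the Web IDE; the value `pos: 0` is an assumption about where byte0 is supposed to live:

```yaml
meta:
  id: bug
seq:
  - id: data
    type: twobytes
types:
  twobytes:
    instances:
      byte0:
        pos: 0  # assumption: byte0 is the first byte of the stream
        type:
          switch-on: byte1.bit0
          cases:
            true: u1
            false: u1
      byte1:
        pos: 1
        type: byte1
  byte1:
    seq:
      - id: bit0
        type: b1
```

With an explicit pos on both instances, the result should not depend on where a particular tool happens to leave the stream before byte0 is evaluated.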

Also, even though it's not related to the error that you're getting, avoid combinations of instances and bit-sized integers (like b1):

    instances:
      byte1:
        pos: 1
        type: byte1

  byte1:
    seq:
      - id: bit0
        type: b1

See https://github.com/kaitai-io/kaitai_struct/issues/1030#issuecomment-1517826333 for explanation:

(...) at the moment, unaligned bit positions and instances do not work well together at all. See #564: parse instances (with pos) save and restore only byte position, but not bit position.

As a result, to avoid undefined behavior when writing .ksy specs, parsing of an instance shouldn't be triggered when the stream from which the instance is parsed is at an unaligned bit position, and an instance should not leave the stream behind at an unaligned bit position after it has done its parsing. Otherwise weird things may happen.

Of course, this will be eventually fixed in Kaitai Struct to support these situations properly too (there's no reason it shouldn't work, it's simply a bug/oversight that unintentionally limits expressiveness of the KS language).
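
Until that is fixed, one way to keep bit-sized integers out of instances is to read a whole byte and derive the flag with a value instance. This is a sketch under assumptions, not a tested spec: the field name `raw` is hypothetical, and the mask `0x80` assumes the default big-endian bit order, in which b1 reads the most significant bit of the byte:

```yaml
types:
  twobytes:
    instances:
      byte1:
        pos: 1
        type: byte1
  byte1:
    seq:
      - id: raw   # hypothetical name: the whole byte, read as an aligned u1
        type: u1
    instances:
      bit0:
        # value instance: computed from raw, does not touch the stream
        value: (raw & 0x80) != 0
```

Because byte1 now consumes exactly one aligned byte, the stream is never left at an unaligned bit position, and `byte1.bit0` can still be used as a boolean in switch-on.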