kaitai-io / kaitai_struct

Kaitai Struct: declarative language to generate binary data parsers in C++ / C# / Go / Java / JavaScript / Lua / Nim / Perl / PHP / Python / Ruby
https://kaitai.io

`_root` / `_parent` chain broken by recursive use of the top-level type #1089

Closed: generalmimon closed this issue 8 months ago

generalmimon commented 8 months ago

If the top-level type is used recursively, `_root` (and `_parent` in some target languages) refers to the closest object in the tree that has the top-level type (which may be the object itself), not to the actual root of the object tree.

Consider the following .ksy snippet:

```yaml
meta:
  id: nav_root_recursive
seq:
  - id: value
    type: u1
  - id: next
    type: nav_root_recursive
    if: value != 0x01
instances:
  root_value:
    value: _root.value
```

Parsing the bytes `80 01` with this spec, for example, yields the following object tree (as of KSC at commit 6aef6fae):

```
├── value = 0x80 (= 128)
├── next [NavRootRecursive]
│   ├── value = 0x1 (= 1)
│   └── rootValue = 0x1 (= 1)
└── rootValue = 0x80 (= 128)
```

As you can see, the value of `next.rootValue` is 1 rather than the expected 128, which indicates that `_root` in the context of `next` refers to `next` itself, not to the root node of the object tree. `_root` is not passed from the root object to `next`, so `next` falls back to using `this` as its `_root`.

This is visible in the generated code; see the JavaScript output, for example:

```javascript
NavRootRecursive.prototype._read = function() {
  this.value = this._io.readU1();
  if (this.value != 1) {
    this.next = new NavRootRecursive_.NavRootRecursive(this._io, this, null);
  }
}
```

Notice that `null` is passed as the `_root` argument (instead of `this._root` as usual).
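The fallback can be reproduced outside the compiler with a small hand-written model. This is only a sketch, not the generated code: `ByteReader`, `Node`, and the `passRoot` switch are hypothetical names introduced for illustration, and the `root || this` fallback in the constructor is an assumption based on the behavior described above.

```javascript
// Hypothetical stand-in for a stream class: reads unsigned bytes from an array.
class ByteReader {
  constructor(bytes) { this.bytes = bytes; this.pos = 0; }
  readU1() { return this.bytes[this.pos++]; }
}

// Simplified model of the recursive top-level type.
// `passRoot` is an illustrative switch, not something the compiler emits.
class Node {
  constructor(io, parent, root, passRoot) {
    this._io = io;
    this._parent = parent;
    this._root = root || this; // assumed fallback: a missing root becomes the object itself
    this.value = io.readU1();
    if (this.value !== 1) {
      // passRoot = false mimics the reported behavior (null as the root argument);
      // passRoot = true forwards the real root down the chain.
      this.next = new Node(io, this, passRoot ? this._root : null, passRoot);
    }
  }
  // Mirrors the `root_value` instance from the .ksy snippet.
  get rootValue() { return this._root.value; }
}

const buggy = new Node(new ByteReader([0x80, 0x01]), null, null, false);
const fixed = new Node(new ByteReader([0x80, 0x01]), null, null, true);

console.log(buggy.next.rootValue); // 1   (next treats itself as the root)
console.log(fixed.next.rootValue); // 128 (next sees the actual root)
```

In this model, forwarding `this._root` instead of `null` is enough to make `next.rootValue` match the top-level `rootValue`.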

In some languages, the same problem affects not only `_root` but also `_parent`; see the generated Java code, for example:

```java
private void _read() {
    this.value = this._io.readU1();
    if (value() != 1) {
        this.next = new NavRootRecursive(this._io);
    }
}
```

The `_parent` problem can be demonstrated, for example, as follows (again, this is meant to work on bytes like `80 01`, but it can easily be adjusted for pretty much any binary file):

```yaml
meta:
  id: nav_parent_recursive
seq:
  - id: value
    type: u1
  - id: next
    type: nav_parent_recursive
    if: value == 0x80
instances:
  parent_value:
    value: _parent.as<nav_parent_recursive>.value
    if: value != 0x80
```

Accessing the `next.parent_value` instance would fail in Java, since the generated Java code doesn't pass `_parent` to invocations of the top-level type (so `next._parent` would be `null`).
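The `_parent` failure can be modeled the same way. The sketch below is written in JavaScript for consistency with the earlier excerpt, so it only approximates the Java behavior: `makeReader`, `ParentNode`, and `passParent` are hypothetical names, and a `TypeError` on `null` stands in for whatever exception the Java code would raise (presumably a `NullPointerException`).

```javascript
// Hypothetical byte source standing in for a real stream class.
function makeReader(bytes) {
  let pos = 0;
  return { readU1: () => bytes[pos++] };
}

// Simplified model of nav_parent_recursive.
// `passParent = false` mimics a child constructed from the stream alone.
class ParentNode {
  constructor(io, parent, passParent) {
    this._parent = parent;
    this.value = io.readU1();
    if (this.value === 0x80) {
      this.next = new ParentNode(io, passParent ? this : null, passParent);
    }
  }
  // Mirrors the `parent_value` instance, which is only defined when value != 0x80.
  get parentValue() {
    if (this.value === 0x80) return undefined;
    return this._parent.value; // throws TypeError when _parent is null
  }
}

const withParent = new ParentNode(makeReader([0x80, 0x01]), null, true);
console.log(withParent.next.parentValue); // 128

const withoutParent = new ParentNode(makeReader([0x80, 0x01]), null, false);
let failure = null;
try {
  withoutParent.next.parentValue;
} catch (e) {
  failure = e;
}
console.log(failure instanceof TypeError); // true
```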