kaitai-io / kaitai_struct

Kaitai Struct: declarative language to generate binary data parsers in C++ / C# / Go / Java / JavaScript / Lua / Nim / Perl / PHP / Python / Ruby
https://kaitai.io

_parent breaks when nested below a switch statement #48

Closed kouak closed 7 years ago

kouak commented 7 years ago

This .ksy crashes the compiler:

meta:
  id: asterix
seq:
  - id: asterix
    type: asterix_block
    repeat: eos
types:
  asterix_block:
    seq:
      - id: category
        type: u1
      - id: asterix_len
        type: u2be
      - id: content
        size: asterix_len - 3
        type:
          switch-on: category
          cases:
            '30': asterix_cat030
  asterix_cat030:
    seq:
      - id: fspec
        type: u1
        repeat: until
        repeat-until: _ & 0x1 == 0
      - id: cat030_content
        type: asterix_cat030_content
  asterix_cat030_content:
    seq:
      - id: item_010
        size: 2
        if: _parent.fspec[0] & 0x80 != 0

Exception in thread "main" scala.MatchError: UnknownClassSpec (of class io.kaitai.struct.format.UnknownClassSpec$)
    at io.kaitai.struct.ClassTypeProvider.makeUserType(ClassTypeProvider.scala:51)
    at io.kaitai.struct.ClassTypeProvider.determineType(ClassTypeProvider.scala:24)
    at io.kaitai.struct.ClassTypeProvider.determineType(ClassTypeProvider.scala:16)
    at io.kaitai.struct.translators.BaseTranslator.detectType(BaseTranslator.scala:228)
    at io.kaitai.struct.translators.BaseTranslator.translate(BaseTranslator.scala:69)
    at io.kaitai.struct.languages.components.ObjectOrientedLanguage$class.expression(ObjectOrientedLanguage.scala:9)
    at io.kaitai.struct.languages.RubyCompiler.expression(RubyCompiler.scala:11)
    at io.kaitai.struct.languages.RubyCompiler.condIfHeader(RubyCompiler.scala:172)
    at io.kaitai.struct.languages.components.EveryReadIsExpression$class.attrParse(EveryReadIsExpression.scala:23)
    at io.kaitai.struct.languages.RubyCompiler.attrParse(RubyCompiler.scala:11)
    at io.kaitai.struct.ClassCompiler$$anonfun$compileClass$3.apply(ClassCompiler.scala:44)
    at io.kaitai.struct.ClassCompiler$$anonfun$compileClass$3.apply(ClassCompiler.scala:44)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at io.kaitai.struct.ClassCompiler.compileClass(ClassCompiler.scala:44)
    at io.kaitai.struct.ClassCompiler$$anonfun$compileSubclasses$1.apply(ClassCompiler.scala:86)
    at io.kaitai.struct.ClassCompiler$$anonfun$compileSubclasses$1.apply(ClassCompiler.scala:86)
    at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:221)
    at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:428)
    at io.kaitai.struct.ClassCompiler.compileSubclasses(ClassCompiler.scala:86)
    at io.kaitai.struct.ClassCompiler.compileClass(ClassCompiler.scala:59)
    at io.kaitai.struct.ClassCompiler.compile(ClassCompiler.scala:19)
    at io.kaitai.struct.Main$.compileOne(Main.scala:92)
    at io.kaitai.struct.Main$.compileOne(Main.scala:88)
    at io.kaitai.struct.Main$$anonfun$main$1.apply(Main.scala:124)
    at io.kaitai.struct.Main$$anonfun$main$1.apply(Main.scala:119)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at io.kaitai.struct.Main$.main(Main.scala:119)
    at io.kaitai.struct.Main.main(Main.scala)

This variant, with the switch replaced by a fixed type and an if condition, works fine:

meta:
  id: asterix
seq:
  - id: asterix
    type: asterix_block
    repeat: eos
types:
  asterix_block:
    seq:
      - id: category
        type: u1
      - id: asterix_len
        type: u2be
      - id: content
        size: asterix_len - 3
        type: asterix_cat030
        if: category == 30
  asterix_cat030:
    seq:
      - id: fspec
        type: u1
        repeat: until
        repeat-until: _ & 0x1 == 0
      - id: cat030_content
        type: asterix_cat030_content
  asterix_cat030_content:
    seq:
      - id: item_010
        size: 2
        if: _parent.fspec[0] & 0x80 != 0

I don't know whether I'm misusing Kaitai Struct or it's a bug.

For reference, I'm trying to use Kaitai-struct to parse ASTERIX data.

The specification of the binary format is here: https://www.eurocontrol.int/sites/default/files/field_tabs/content/documents/single-sky/specifications/20120401-asterix-spec-v2.0.pdf

GreyCat commented 7 years ago

Thanks! From what I see so far, you're right: it's a legit .ksy, and this is probably a bug in parent type propagation in ksc. I'll look into it.
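
For readers who want to see what the failing spec is asking for, here is a hand-written Python sketch of the parse logic that the .ksy above describes. It is not ksc output: the class names, the `parent` constructor argument (standing in for `_parent`) and the `CASES` table (standing in for `switch-on` / `cases`) are assumptions made for illustration only.

```python
import struct
from io import BytesIO


class Cat030Content:
    def __init__(self, stream, parent):
        # `parent` plays the role of `_parent` in the .ksy
        self.parent = parent
        # item_010 is present only when the top bit of the first FSPEC byte is set
        self.item_010 = stream.read(2) if (parent.fspec[0] & 0x80) != 0 else None


class Cat030:
    def __init__(self, stream, parent):
        self.parent = parent
        self.fspec = []
        # repeat-until: keep reading bytes until one has its lowest bit clear
        while True:
            byte = stream.read(1)[0]
            self.fspec.append(byte)
            if (byte & 0x1) == 0:
                break
        self.cat030_content = Cat030Content(stream, self)


class AsterixBlock:
    # Hypothetical dispatch table mirroring the switch-on cases
    CASES = {30: Cat030}

    def __init__(self, stream):
        # category: u1, asterix_len: u2be
        self.category, self.asterix_len = struct.unpack(">BH", stream.read(3))
        raw = stream.read(self.asterix_len - 3)
        handler = self.CASES.get(self.category)
        # Unknown categories are kept as raw bytes
        self.content = handler(BytesIO(raw), self) if handler else raw


def parse_blocks(data):
    """Parse blocks until the end of the input (repeat: eos)."""
    stream = BytesIO(data)
    blocks = []
    while stream.tell() < len(data):
        blocks.append(AsterixBlock(stream))
    return blocks
```

The point of interest is that `Cat030Content` receives its parent explicitly, so the type of the parent must be known wherever the child type is instantiated, including when the instantiation happens through a switch.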

GreyCat commented 7 years ago

I've added a new test (nav_parent_switch) and fixed the compiler so that it passes. It should be OK now: http://kaitai.io/ci/ shows the test passing, and I'll add the C# spec soon too.

I've updated http://kaitai.io/repl as well. Please check if it works for you :)

kouak commented 7 years ago

Looks good!

Here's my (ongoing) implementation of ASTERIX: https://gist.github.com/kouak/1c0317fa1293cfbe8195386820df5ec9

This file compiles fine on the repl.

GreyCat commented 7 years ago

Thanks! By any chance, would you want to contribute this implementation to https://github.com/kaitai-io/kaitai_struct_formats ?

kouak commented 7 years ago

I'd be more than happy to do so!

However, the format I'm working on relies heavily on non-standard structures: some integers are coded on 3 bits (since their value is in the 0-7 range), and multiple subfields are single bits. Once non-byte fields are baked into Kaitai, I guess I'll consider moving beyond the "pet project" status and possibly use Kaitai to generate a production-grade parser.

Another issue is that this format is very broad. The gist I pasted above only covers a single category (CAT030), and there are over 30 categories in ASTERIX. This leads to a side question: is there any way to split a format definition across multiple files?

If you happen to be interested in this format, I could provide detailed specifications and help with interpreting them, since they make extensive use of domain-specific vocabulary (mostly air traffic jargon).

GreyCat commented 7 years ago

One of the big goals (and certainly a very ambitious one) envisioned for Kaitai Struct is to gather a free/open source library of file format and protocol specifications. It would differ fundamentally from other projects such as https://www.fileformat.info/, http://www.zamzar.com/fileformats/ and http://fileformats.archiveteam.org/ in one simple way: we won't just gather human-readable text specs and links to various ad-hoc implementations, but will strive to have a formal spec in the .ksy language, understandable by both humans and machines (compilable into parsers in a variety of target programming languages).

In that spirit, of course any format descriptions are very welcome.

> Some integers are coded on 3 bits (since their value is in the 0-7 range), multiple subfields are single bits. Once non-byte fields are baked into kaitai, I guess I'll consider moving beyond the "pet project" status and possibly use kaitai to generate a production grade parser.

Technically, if these values are always packed into bytes at the same bit positions, you can use value instances right now to do that:

seq:
  - id: b1
    type: u1
instances:
  # Bits are numbered from the most significant bit (bit 0) downwards.
  integer_in_bits_0_to_2:
    value: 'b1 >> 5'
  bit_3_as_int:
    value: '(b1 & 0b00010000) >> 4'
  bit_3_as_boolean:
    value: '(b1 & 0b00010000) != 0'
  integer_in_bits_4_to_7:
    value: 'b1 & 0b00001111'
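
To make the arithmetic concrete, here is a small Python sketch that evaluates the same four expressions on a single byte. The function name `split_bits` is made up for this illustration and is not part of Kaitai Struct; the dictionary keys simply reuse the instance names from the snippet above.

```python
def split_bits(b1):
    """Evaluate the value-instance expressions on one unsigned byte (0-255)."""
    return {
        # Top three bits, shifted down to the 0-7 range
        "integer_in_bits_0_to_2": b1 >> 5,
        # Fourth bit from the top, as 0 or 1
        "bit_3_as_int": (b1 & 0b00010000) >> 4,
        # Same bit, as a boolean
        "bit_3_as_boolean": (b1 & 0b00010000) != 0,
        # Low nibble, in the 0-15 range
        "integer_in_bits_4_to_7": b1 & 0b00001111,
    }


fields = split_bits(0b10110110)
print(fields)
```

For the byte 0b10110110, the top three bits are 101 (5), the fourth bit is set, and the low nibble is 0110 (6).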

> is there any way to split a format definition across multiple files?

Yes, it's already possible. Actually, you don't need to do anything special for that. You can just have several files with several top-level types, and you can reference one from the other:

format_a.ksy

meta:
  id: format_a
seq:
  - id: code
    type: u1
  - id: body
    type: format_b

format_b.ksy

meta:
  id: format_b
seq:
  - id: foo
    type: u1

For a working production example, take a look at how files in https://github.com/kaitai-io/kaitai_struct_formats/tree/master/network reference each other (pcap → ethernet_frame → ipv4_packet → tcp_segment).