kaitai-io / kaitai_struct

Kaitai Struct: declarative language to generate binary data parsers in C++ / C# / Go / Java / JavaScript / Lua / Nim / Perl / PHP / Python / Ruby
https://kaitai.io

Using instance in repeat-until to peek the next X bits #580

Open kalidasya opened 5 years ago

kalidasya commented 5 years ago

Hey, I have the following use case. In MPEG-2 streams the start indicator is a 24-bit value of 1. I have to read the stream byte-aligned, so I thought I could do something like this:

meta:
  id: mpeg2_video
  file-extension: mp2
  endian: le
seq:
  - id: prefix
    type: u1
    repeat: until
    repeat-until: prefix_code != 1
  - id: ignore
    type: b24
instances:
  prefix_code:
    io: _root._io
    pos: _root._io.pos
    size: b24

This does not seem to work, as I get an error during compilation (to Python): mpeg2_video: /seq/0/repeat-until: can't compare BytesLimitType(Name(identifier(b24)),None,false,None,None) and Int1Type(true). I am probably doing something wrong, but I saw no other way to peek at the upcoming bits/bytes. I tried calling .to_i on prefix_code, but it seems BytesLimitType has no methods. Is there any way to achieve this currently?

kalidasya commented 5 years ago

@ams-tschoening b24 should be 1. Can I compare a byte array with something hardcoded, like [0,0,1]? I will try changing the instance so that it calculates that value.

ams-tschoening commented 5 years ago

Deleted the former comment as I simply missed the new b...-syntax.

Your problem most likely is that prefix_code results in a byte array instead of an integer you can compare to 1 because of using size. Try changing size to type in your instance.

https://doc.kaitai.io/ksy_reference.html#_byte_array_keys

Instance specification is very close to Attribute spec (and inherits all its properties)[...]

https://doc.kaitai.io/ksy_reference.html#spec-instance
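
As a rough analogy for why the compiler rejected the comparison, here is a small Python sketch (plain Python, not Kaitai code; the variable names are made up for illustration). A size-based field yields raw bytes, while an integer-typed field yields a number that can be compared with 1. It assumes the 24 bits are read big-endian.

```python
# Three raw bytes, as a size-based (byte array) field would yield them.
raw = b"\x00\x00\x01"

# The same three bytes decoded as one 24-bit big-endian integer,
# roughly what an integer-typed field such as b24 yields (assumption).
value = int.from_bytes(raw, "big")

bytes_equal_one = (raw == 1)   # a bytes object never compares equal to an int
int_equals_one = (value == 1)  # the decoded integer does
print(bytes_equal_one, int_equals_one)
```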

kalidasya commented 5 years ago

@ams-tschoening you were right: if I change it to type in the instance, it works:

  start_code_sync:
    seq:
      - id: prefix
        type: u1
        repeat: until
        repeat-until: prefix_code != 1
      - id: ignore
        type: b16
      - id: start_code
        type: u1
    instances:
      prefix_code:
        io: _parent._io
        pos: _parent._io.pos
        type: b24

but something is still weird. If I use this type in a root sequence it is all good, but if I use it in a sub-type like:


meta:
  id: mpeg2_video
  file-extension: mp2
  endian: le
seq:
  - id: start_code
    type: start_code_sync
  - id: data
    type:
      switch-on: start_code.as<start_code_sync>.start_code
      cases:
        0xb3: sequence_header
        0x00: picture

types:
  start_code_sync:
    seq:
      - id: prefix
        type: u1
        repeat: until
        repeat-until: prefix_code != 1
      - id: ignore
        type: b16
      - id: start_code
        type: u1
    instances:
      prefix_code:
        io: _parent._io
        pos: _parent._io.pos
        type: b24
  sequence_header:
    seq:
      - id: horiz_size
        type: b12
      - id: vert_size
        type: b12
      - id: aspect_ratio_information
        type: b4
      - id: framerate_code
        type: b4
      - id: bitrate
        type: b18
      - id: marker_bit
        # contents: [true]
        type: b1
      - id: vbv_buffer_size
        type: b10
      - id: constrained_param_flag
        # contents: false
        type: b1
      - id: load_intra_quantiser_matrix
        type: b1
      - id: intra_quantiser_matrix
        type: b8
        repeat: expr
        repeat-expr: 64
        if: load_intra_quantiser_matrix
      - id: load_non_intra_quantiser_matrix
        type: b1
      - id: non_intra_quantiser_matrix
        type: b8
        repeat: expr
        repeat-expr: 64
        if: load_non_intra_quantiser_matrix
      - id: start_code                     # if you comment these out, the compilation error is gone
        type: start_code_sync
    instances:
      framerate_fraction:
        value: >-
          framerate_code == 0 ? 0 / 1.0 :
          framerate_code == 1 ? 24000 / 1001.0 :
          framerate_code == 2 ? 24 / 1.0 :
          framerate_code == 3 ? 25 / 1.0 :
          framerate_code == 4 ? 30000 / 1001.0 :
          framerate_code == 5 ? 30 / 1.0 :
          framerate_code == 6 ? 50 / 1.0 :
          framerate_code == 7 ? 60000 / 1001.0 :
          framerate_code == 8 ? 60 / 1.0 : 0
  picture:
    seq:
      - id: temporal_reference
        type: b10
      - id: picture_coding_type
        type: b3
      - id: vbv_delay
        type: b16
      # picture_coding_type cannot be 0
      - id: full_pel_forward_vector
        type: b1
        if: picture_coding_type == 2 or picture_coding_type == 3
      - id: forward_f_code
        type: b1
        if: picture_coding_type == 2 or picture_coding_type == 3
      - id: full_pel_backward_vector
        type: b1
        if: picture_coding_type == 3
      - id: backward_f_code
        type: b1
        if: picture_coding_type == 3
The compile (to Python) gives me the following error:
`KaitaiStructType (of class io.kaitai.struct.datatype.DataType$KaitaiStructType$)`
or in the web ide:
```Call stack: Error
    at $c_s_MatchError.fillInStackTrace__jl_Throwable (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:20102:14)
    at $c_s_MatchError.init___T__jl_Throwable (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:20124:8)
    at $c_s_MatchError.init___O (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:35982:52)
    at $f_Lio_kaitai_struct_translators_CommonMethods__translateAttribute__Lio_kaitai_struct_exprlang_Ast$expr$Attribute__O (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:1914:35)
    at $c_Lio_kaitai_struct_translators_JavaScriptTranslator.translate__Lio_kaitai_struct_exprlang_Ast$expr__T (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:44484:18)
    at $f_Lio_kaitai_struct_languages_components_ObjectOrientedLanguage__expression__Lio_kaitai_struct_exprlang_Ast$expr__T (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:1605:140)
    at $c_Lio_kaitai_struct_languages_JavaScriptCompiler.useIO__Lio_kaitai_struct_exprlang_Ast$expr__T (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:61542:197)
    at $f_Lio_kaitai_struct_languages_components_CommonReads__attrParse__Lio_kaitai_struct_format_AttrLikeSpec__Lio_kaitai_struct_format_Identifier__scm_ListBuffer__s_Option__V (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:1415:83)
    at $c_Lio_kaitai_struct_ClassCompiler.compileInstance__sci_List__Lio_kaitai_struct_format_InstanceIdentifier__Lio_kaitai_struct_format_InstanceSpec__scm_ListBuffer__s_Option__V (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:18283:5)
    at https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:18203:15
    at $c_sjsr_AnonFunction1.apply__O__O (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:28035:23)
    at $c_sci_Map$Map1.foreach__F1__V (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:74657:5)
    at $c_Lio_kaitai_struct_ClassCompiler.compileInstances__Lio_kaitai_struct_format_ClassSpec__scm_ListBuffer__V (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:18197:24)
    at $c_Lio_kaitai_struct_ClassCompiler.compileClass__Lio_kaitai_struct_format_ClassSpec__V (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:18179:8)
    at https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:18034:15
    at $c_sjsr_AnonFunction1.apply__O__O (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:28035:23)
    at $c_sci_Map$Map3.foreach__F1__V (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:74861:5)
    at $c_Lio_kaitai_struct_ClassCompiler.compileSubclasses__Lio_kaitai_struct_format_ClassSpec__V (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:18029:20)
    at $c_Lio_kaitai_struct_ClassCompiler.compileClass__Lio_kaitai_struct_format_ClassSpec__V (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:18176:10)
    at $c_Lio_kaitai_struct_ClassCompiler.compile__Lio_kaitai_struct_CompileLog$SpecSuccess (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:18081:8)
    at $c_Lio_kaitai_struct_Main$.compile__Lio_kaitai_struct_format_ClassSpecs__Lio_kaitai_struct_format_ClassSpec__Lio_kaitai_struct_languages_components_LanguageCompilerStatic__Lio_kaitai_struct_RuntimeConfig__Lio_kaitai_struct_CompileLog$SpecSuccess (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:3569:13)
    at https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:3690:52
    at $c_sjsr_AnonFunction1.apply__O__O (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:28035:23)
    at $c_s_util_Success.map__F1__s_util_Try (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:46244:47)
    at https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:13405:18
    at $c_sjsr_AnonFunction1.apply__O__O (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:28035:23)
    at $f_s_concurrent_impl_Promise__liftedTree1$1__ps_concurrent_impl_Promise__F1__s_util_Try__s_util_Try (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:26941:31)
    at https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:26933:22
    at $c_sjsr_AnonFunction1.apply__O__O (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:28035:23)
    at $c_s_concurrent_impl_CallbackRunnable.run__V (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:26813:23)
    at $c_sjs_concurrent_QueueExecutionContext$PromisesExecutionContext.scala$scalajs$concurrent$QueueExecutionContext$PromisesExecutionContext$$$anonfun$execute$2__sr_BoxedUnit__jl_Runnable__sjs_js_$bar (https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:33041:16)
    at https://ide.kaitai.io/lib/_npm/kaitai-struct-compiler/kaitai-struct-compiler.js:33058:24 scala.MatchError: KaitaiStructType (of class io.kaitai.struct.datatype.DataType$KaitaiStructType$)```
Not sure what I am doing wrong.

ams-tschoening commented 5 years ago

switch-on: start_code.as<start_code_sync>.start_code

Is the additional cast using as really necessary or only for readability?

the compile (to python) gives me the following error: KaitaiStructType (of class io.kaitai.struct.datatype.DataType$KaitaiStructType$)

Because start_code_sync is used at two places with different parent types, root vs. sequence_header, some of the implicitly available objects to all types like _parent, _root etc. fall back to the most generic KaitaiStruct and you are unable to access some properties you might think are available. I'm somewhat sure this is what the exception is trying to tell you: You assume some specific type somewhere using start_code_sync, but only have KaitaiStruct instead. It might not even be start_code_sync at all, but some other type as well.

  1. enable verbose compiler output using the following:

    "--ksc-exceptions" "--verbose" "all"

  2. check your types for accessing properties which might not be available depending on where a type is used. Especially reconsider things like the following:

    io: _parent._io
    pos: _parent._io.pos

You are explicitly accessing the IO-stream of the parent, but why? If you want to preserve parse position, pos with your own IO-stream is sufficient already. You are assuming too many complicated things, like that your parent didn't invoke you with a substream.

https://doc.kaitai.io/user_guide.html#_instances_data_beyond_the_sequence https://doc.kaitai.io/user_guide.html#_streams_and_substreams

kalidasya commented 5 years ago

@ams-tschoening the cast was an oversight on my end; I moved code around and it's indeed not needed.

I see your point. The problem I am trying to solve is that an MPEG-2 video stream consists of sequences. Each sequence is separated by 00 00 01, followed by the start code, which defines what kind of data structure comes next. With that instance I wanted to mimic a peek: check whether the next 24 bits from the current position are 00 00 01; if so, consume them and return the byte after them; if not, read one byte and try again. Because this can be part of different types, I needed to use _parent. I will try to flatten things out more; maybe that helps. I really appreciate the help, and I am sorry for the lame questions; this is my first Kaitai project.
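
For reference, the scan described here can be sketched in plain Python on an in-memory buffer. This is only an illustration of the intended behaviour, not Kaitai code; the function name `iter_start_codes` and the sample bytes are made up.

```python
START_CODE_PREFIX = b"\x00\x00\x01"

def iter_start_codes(data):
    """Yield (offset, start_code) for each 00 00 01 prefix that is followed by a byte."""
    pos = 0
    while True:
        idx = data.find(START_CODE_PREFIX, pos)
        # Stop when no prefix is left, or the prefix sits at the very end
        # of the buffer with no start code byte after it.
        if idx == -1 or idx + 3 >= len(data):
            return
        yield idx, data[idx + 3]
        # Consume the prefix and the start code byte before scanning again.
        pos = idx + 4

# Made-up sample: junk, prefix + 0xB3, two payload bytes, prefix + 0x00, payload.
sample = bytes([0xFF, 0x00, 0x00, 0x00, 0x01, 0xB3, 0x12, 0x34,
                0x00, 0x00, 0x01, 0x00, 0x55])
found = list(iter_start_codes(sample))
```

Note that the prefix does not have to start at the first zero byte: in the sample the match begins at offset 2, after one extra leading 00.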

kalidasya commented 5 years ago

@ams-tschoening I think I am slowly getting what you mean. I also realised that instances move the stream pointer too; they are just evaluated lazily, so my workaround for peek functionality cannot work.

kalidasya commented 5 years ago

@ams-tschoening a long shot, but would it be possible for an instance in Kaitai to have a peek attribute? It would reset the position in the IO after reading, so that:

        @property
        def prefix_code(self):
            if hasattr(self, '_m_prefix_code'):
                return self._m_prefix_code if hasattr(self, '_m_prefix_code') else None

            if (self._io.size() - self._io.pos()) > 2:
                self._m_prefix_code = self._io.read_bits_int(24)

            return self._m_prefix_code if hasattr(self, '_m_prefix_code') else None

will become

        @property
        def prefix_code(self):
            if hasattr(self, '_m_prefix_code'):
                return self._m_prefix_code if hasattr(self, '_m_prefix_code') else None

            if (self._io.size() - self._io.pos()) > 2:
                pos = self._io.pos()
                self._m_prefix_code = self._io.read_bits_int(24)
                self._io.seek(pos)

            return self._m_prefix_code if hasattr(self, '_m_prefix_code') else None
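
The save, read, seek-back pattern can be tried outside generated code. Below is a minimal sketch on a plain `io.BytesIO`; the class `PeekingReader` is a hypothetical stand-in for a generated parser class and does not use the Kaitai runtime. It also keeps the caching seen in the snippets above, which has a consequence worth noting.

```python
import io

class PeekingReader:
    """Hypothetical stand-in for a generated parser class."""

    def __init__(self, stream):
        self._io = stream

    @property
    def prefix_code(self):
        # Cached after the first access, like _m_prefix_code above.
        if hasattr(self, "_m_prefix_code"):
            return self._m_prefix_code
        pos = self._io.tell()
        raw = self._io.read(3)
        self._io.seek(pos)  # restore the position so the read acts as a peek
        self._m_prefix_code = int.from_bytes(raw, "big") if len(raw) == 3 else None
        return self._m_prefix_code

stream = io.BytesIO(b"\x00\x00\x01\xb3")
reader = PeekingReader(stream)
first = reader.prefix_code      # peeks 00 00 01
pos_after_peek = stream.tell()  # the peek left the position unchanged
stream.seek(1)
second = reader.prefix_code     # cached value, not a fresh read of 00 01 b3
```

Because the value is cached per object, one object can only peek once; repeated peeking at different positions would need a fresh object each time.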

kalidasya commented 5 years ago

I did some hackity hack, and I think that would solve my problem. I also finally understood @ams-tschoening's position-related comments; indeed, I do not need access to any other IO:

  start_code_sync:
    seq:
      - id: prefix
        type: magic
        repeat: until
        repeat-until: _.prefix_code != 1
      - id: ignore
        type: b16
      - id: start_code
        type: u1
  magic:
    seq:
      - id: stuff
        type: u1
    instances:
      prefix_code:
        type: b24
        if: '_io.size - _io.pos > 2'