kaitai-io / kaitai_struct

Kaitai Struct: declarative language to generate binary data parsers in C++ / C# / Go / Java / JavaScript / Lua / Nim / Perl / PHP / Python / Ruby
https://kaitai.io

How to handle self-describing datatypes, e.g. ISO 8211 #1080

Open mlthlschr opened 8 months ago

mlthlschr commented 8 months ago

Hi,

After an unsuccessful attempt a few years ago to write a ksy description for ISO 8211 files, I have started trying again. I managed to describe the general file layout, but there is still some way to go.

What do these files look like? There is one DDR (data descriptive record) as the first entry, followed by n DRs (data records).

The entries in the DDR describe how to handle the data stored in the following DRs. These entries consist of mappings of tags like SG2D to descriptions like 2-D coordinate field\u001F*YCOO!XCOO\u001F(2b24). In this example the tag SG2D describes a coordinate with y and x values, which are both of the type b24 (actually it is a bit more complex than that but should suffice for this description).
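To make the structure of such a description concrete, here is a minimal Python sketch that splits the SG2D example into its parts. It assumes the three parts are separated by the unit separator (\u001F), that a leading `*` marks a repeating group, and that a leading number in the format controls is a repeat count. The function name `parse_field_description` is hypothetical, and nested format controls are not handled.

```python
import re

UNIT_SEP = "\x1f"  # unit separator, written as \u001F in the example above


def parse_field_description(desc):
    """Split a DDR description into its name, subfield labels and formats."""
    name, labels, controls = desc.split(UNIT_SEP)
    repeating = labels.startswith("*")  # assumption: '*' marks a repeating group
    subfields = labels.lstrip("*").split("!")
    # Drop the outer parentheses, then expand repeat counts: "2b24" -> b24, b24.
    if controls.startswith("(") and controls.endswith(")"):
        controls = controls[1:-1]
    formats = []
    for item in controls.split(","):  # simplification: no nested groups
        count, fmt = re.fullmatch(r"(\d*)(.+)", item).groups()
        formats.extend([fmt] * int(count or 1))
    return {
        "name": name,
        "repeating": repeating,
        "subfields": list(zip(subfields, formats)),
    }


field = parse_field_description("2-D coordinate field\x1f*YCOO!XCOO\x1f(2b24)")
print(field["subfields"])  # [('YCOO', 'b24'), ('XCOO', 'b24')]
```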

An entry in the DR contains a reference to a tag defined in the DDR as well as data for that tag. To interpret the data, I theoretically have to find the tag in the DDR, interpret the type description there, and parse the data in the DR according to that description.

The first problem I have is how to search the array of tags in the DDR for the tag I have just encountered in the DR entry. The next problem is how to interpret the type description taken from the DDR and apply it to the data in the DR. It feels like some kind of dynamic typing.
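For illustration, the two steps (find the tag, then apply its description) can be sketched in plain Python as a post-processing step. This is not ksy and says nothing about what Kaitai Struct itself can express. The names `STRUCT_CODES`, `decode_field` and `ddr_fields` are hypothetical, the DDR is assumed to be already reduced to a dict of tag to (subfield, format) pairs, and the formats are assumed to be little-endian binary integers (b1w unsigned, b2w signed, w bytes wide).

```python
import struct

# Assumed mapping from binary format controls to struct codes (subset only).
STRUCT_CODES = {"b11": "B", "b12": "H", "b14": "I",
                "b21": "b", "b22": "h", "b24": "i"}


def decode_field(tag, data, ddr_fields):
    """Look up `tag` in the DDR table and decode `data` accordingly."""
    subfields = ddr_fields[tag]  # step 1: find the tag's description
    names = [name for name, _ in subfields]
    # step 2: build a struct layout from the description ('<' = little-endian)
    layout = "<" + "".join(STRUCT_CODES[fmt] for _, fmt in subfields)
    size = struct.calcsize(layout)
    # Decode as many whole groups as fit; trailing bytes are ignored.
    return [
        dict(zip(names, struct.unpack_from(layout, data, offset)))
        for offset in range(0, len(data) - size + 1, size)
    ]


ddr_fields = {"SG2D": [("YCOO", "b24"), ("XCOO", "b24")]}
payload = struct.pack("<iiii", 10, -20, 30, 40)
print(decode_field("SG2D", payload, ddr_fields))
# [{'YCOO': 10, 'XCOO': -20}, {'YCOO': 30, 'XCOO': 40}]
```

A dict lookup replaces the linear search through the tag array, which is the usual choice once the DDR has been read completely.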

Is such a thing possible with ksy?

generalmimon commented 8 months ago

It appears that there have been some attempts to describe the ISO 8211 format in Kaitai Struct already, so it might be beneficial to first become familiar with them to see if they have some good parts that can be adopted:

  1. https://github.com/kaitai-io/kaitai_struct_formats/issues/139
  2. Searching for iso 8211 "ksy" description using Google also finds https://qspace.library.queensu.ca/server/api/core/bitstreams/ddf8da12-9d1c-48b0-9add-a31b8fde26b3/content, which also discusses the draft .ksy specification in the previously mentioned issue.

mlthlschr commented 8 months ago

Thanks for the hint! The code in the open ticket looks like a great place to start, but it seems to only scratch the surface, just as I am currently doing. The dynamic type issue is not handled there. The master's thesis also appears to dodge that bullet by running a custom format controls parser after Kaitai, instead of using Kaitai for that.