Open mnakamura1337 opened 7 years ago
The first major problem would be that the "file" concept does not exist on some platforms, e.g. JavaScript. What do we do with that?
Parsing from multiple files
I have thought about the same for a .cab format parser. I guess we don't need this built-in; we shouldn't put everything in the world into KS. We need some way for users to extend KS with their own plugins. That discussion is worth a separate issue.
The first major problem would be that the "file" concept does not exist on some platforms, e.g. JavaScript
Node.js includes tons of file reading functions. We can just skip implementation of browser-compatible code for now. As far as I can tell, you're not even testing it in browsers.
Generally, it all boils down to creating a new KaitaiStream instance for an instance (akin to io: XXX) that will open and start reading from a new file. I'm not really sure about filename: ... though. I'm pondering the idea of some factory method to create new stream instances from files, so that we could reuse it everywhere, e.g. io: local_file("body.dat") (name and syntax chosen totally randomly, of course).
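A rough sketch of what such a factory might do, written in plain Python rather than against any Kaitai Struct runtime. The helper name local_file is taken from the placeholder syntax above and is purely hypothetical, as is the choice to resolve the name relative to the file currently being parsed:

```python
import io
import os


def local_file(base_path, name):
    """Hypothetical factory: open `name` next to `base_path` as a new stream.

    The file is read fully into memory so the caller gets an independent,
    seekable stream that does not disturb the stream already being parsed.
    """
    base_dir = os.path.dirname(os.path.abspath(base_path))
    with open(os.path.join(base_dir, name), "rb") as f:
        return io.BytesIO(f.read())
```

On platforms without a file system (browsers, for instance) a helper like this would have to be replaced by something that receives the related streams from the caller.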
I would second Nakamura-san's request: Adabas databases rely on multiple files, some being text that describes the fields, others binary (index and data). I still need to get a local instance of Kaitai working and give it a try without this feature, but those files seem to be closely related.
I'm also interested in being able to parse multiple files in Kaitai Struct for the following formats:
Is this still the case? No way to define and parse linked/connected file formats together?
@burner1024:
No way to define and parse linked/connected file formats together?
Actually, there is a way. The main idea is to pass the streams of related files via top-level type: io parameters and then you typically use instances with the io key to parse anything you want from that stream (see Absolute positioning in the User Guide; here it's demonstrated on the typical use case of parsing from the root stream, but it'll work with an arbitrary stream just as well).
For reference, see https://github.com/kaitai-io/coreldraw_cdr.ksy, which uses this trick. As explained in the README there, unfortunately it won't be possible to use a .ksy spec with top-level type: io params in visualizers, because they don't provide any way to pass the streams. To work around that in coreldraw_cdr.ksy, I wrote a simple Bash script bin/cdr-unpk, which dumps the contents of all required files in a custom "archive" format described in cdr_unpk.ksy, which becomes the new entrypoint. This enables full use of coreldraw_cdr.ksy in visualizers even though it depends on external files. However, when using the generated parser in an application, there's no problem specifying values for the top-level parameters, and it's of course easier than going through an auxiliary dump format like cdr_unpk.ksy, so it's better to use the variant of coreldraw_cdr.ksy that has the top-level type: io[] parameter (and avoid cdr_unpk.ksy altogether) - see Standalone use of coreldraw_cdr.ksy from your application.
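To make the stream-parameter idea concrete, here is a minimal hand-written sketch in plain Python. It is not generated Kaitai Struct code and uses no Kaitai runtime API; the class name, the index layout (a u4le entry count followed by u4le offset/length pairs) and the sample data are all assumptions made up for illustration. It only mirrors the pattern: the application opens the related file itself, passes its stream in as a parameter, and the parser seeks into that stream on demand.

```python
import io
import struct


class IndexFile:
    """Stand-in for a parser that takes a related stream as a parameter.

    Assumed layout (hypothetical): u4le count, then `count` pairs of
    (u4le offset, u4le length) pointing into a separate data stream.
    """

    def __init__(self, data_io, index_io):
        self._data_io = data_io  # stream of the related file, supplied by the caller
        (count,) = struct.unpack("<I", index_io.read(4))
        self.entries = [struct.unpack("<II", index_io.read(8)) for _ in range(count)]

    def body(self, i):
        """Read entry i from the data stream, then restore its position."""
        ofs, size = self.entries[i]
        saved = self._data_io.tell()
        self._data_io.seek(ofs)
        try:
            return self._data_io.read(size)
        finally:
            self._data_io.seek(saved)


# In-memory stand-ins for the two files; a real application would open them itself.
data = io.BytesIO(b"helloworld")
index = io.BytesIO(struct.pack("<IIIII", 2, 0, 5, 5, 5))
parsed = IndexFile(data, index)
print(parsed.body(0), parsed.body(1))
```

Because the caller owns both streams, the same pattern works whether they come from local files, archive members or network buffers.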
Thanks! I'll try that.
I'm working on reverse engineering a container that essentially consists of 2 files:
The only way to access data in the second file is to use the first file. However, if I'm not mistaken, Kaitai Struct does not allow accessing extra files while parsing. So, I want to propose something like:
In fact, I've encountered quite a few such multi-file formats in the last few months or so. I believe it would be a very useful addition to Kaitai Struct.