Open KOLANICH opened 4 years ago
Hello there and thank you for your message! That's a curious idea - I did not know about Kaitai. Glancing over it, I'm a bit hesitant: for one, because the last official release is from 2018 (and isn't even a v1.x), and second, because I don't know to what degree it is capable of mapping the resource format. Most likely it would not be able to cover the resource file compression, I figure.
Do you have more experience with it? Could you provide an example of how you would expect this to look?
> Glancing over it, I'm a bit hesitant; for one about the last official release being from 2018
We mostly use nightly versions; releases don't have the latest tasty features.
> Most likely it would not be able to cover the resource file compression
It depends a lot. There is a `process` interface; your own code can be plugged in there.
> Do you have more experience with it?
> Could you provide an example of how you would expect this to look?
We have a repo of ready-to-use specs. Everything is very straightforward: if you have a uint32_t field and a string whose size is specified by that field, you just write
    - id: str_size
      #type: u4le # endianness is usually put into meta
      type: u4
    - id: super_str
      type: str
      size: str_size
      #encoding: utf-8 # usually put into meta
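For comparison, here is a rough hand-written sketch of what a parser generated from this snippet would have to do. It assumes little-endian byte order and UTF-8 encoding (the two settings the comments above say usually live in meta). The function name `parse_sized_str` is hypothetical and not part of Kaitai.

```python
import struct
from typing import Tuple


def parse_sized_str(buf: bytes, offset: int = 0) -> Tuple[int, str, int]:
    """Read a u4 length followed by a string of that many bytes.

    Returns (str_size, super_str, offset just past the string).
    """
    # str_size: unsigned 32-bit integer, little-endian (assumed endianness)
    (str_size,) = struct.unpack_from("<I", buf, offset)
    start = offset + 4
    end = start + str_size
    if end > len(buf):
        raise ValueError("buffer too short for the declared string size")
    # super_str: str_size bytes, decoded as UTF-8 (assumed encoding)
    super_str = buf[start:end].decode("utf-8")
    return str_size, super_str, end


if __name__ == "__main__":
    data = struct.pack("<I", 5) + b"hello" + b"trailing bytes"
    print(parse_sized_str(data))  # (5, 'hello', 9)
```

The declarative version expresses the same dependency (the string size comes from the preceding field) in two attributes, without the bounds check and offset bookkeeping written out by hand.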
While Kaitai has an excellent declarative syntax, it still doesn't have a serializer from *.ksy into native code, so only the parsing features are available. The LZW compression has its own implementation (obviously based on Mark Nelson's classic article "LZW Data Compression" from Dr. Dobb's Journal, but with its own code for opcodes and a slightly modified compression algorithm), so a custom plugin would need to be written.
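To give an idea of the shape such a plugin could take, here is a minimal Python sketch. To my understanding, Kaitai's Python target expects a custom processing class to expose a `decode(data)` method; treat that as an assumption to verify against the Kaitai user guide. The decoder below is plain textbook LZW, not the modified variant with custom opcodes used by the resource files. The class name `ResourceLzw` and the fixed 16-bit little-endian code width are made up for illustration.

```python
import struct


def lzw_decode_codes(codes):
    """Textbook LZW: turn an iterable of integer codes back into bytes."""
    table = {i: bytes([i]) for i in range(256)}  # single-byte entries
    next_code = 256
    it = iter(codes)
    try:
        first = next(it)
    except StopIteration:
        return b""
    if first not in table:
        raise ValueError(f"invalid first LZW code {first}")
    prev = table[first]
    out = bytearray(prev)
    for code in it:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Code refers to the entry that is about to be created.
            entry = prev + prev[:1]
        else:
            raise ValueError(f"invalid LZW code {code}")
        out += entry
        table[next_code] = prev + entry[:1]
        next_code += 1
        prev = entry
    return bytes(out)


class ResourceLzw:
    """Hypothetical custom processor with a decode(data) -> bytes method."""

    def decode(self, data: bytes) -> bytes:
        # Assumption: codes are stored as fixed 16-bit little-endian values.
        if len(data) % 2:
            raise ValueError("expected an even number of bytes")
        codes = struct.unpack(f"<{len(data) // 2}H", data)
        return lzw_decode_codes(codes)


if __name__ == "__main__":
    packed = struct.pack("<4H", 65, 66, 256, 258)
    print(ResourceLzw().decode(packed))  # b'ABABABA'
```

A real plugin would replace `lzw_decode_codes` with the game-specific variant; the surrounding class is the only part that matters for plugging into a spec.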
Well, it seems applying this would need some work. The easier, though less complete, approach would be bottom-up: document all the internal structures and leave only the resource extraction to the reader.
With less private time available for this project, I have to stay within a budget of how many unknowns I handle at once. So, all in all, it's still a curious idea, and I would be interested if anyone picked this up as a proof of concept. I'll keep it open with a mental note of "help accepted".
If it is possible, this would allow them to be transpiled into a ready-to-use parser for multiple languages, including Python, Rust and C++. Human readability is not expected to be harmed, since the language is declarative.