inkyblackness / ss-specs

Unofficial System Shock 1 Specifications

Consider writing specs directly in Kaitai Struct language #66

Open KOLANICH opened 3 years ago

KOLANICH commented 3 years ago

If it is possible, this would allow the specs to be transpiled into ready-to-use parsers for multiple languages, including Python, Rust and C++. Human-readability is not expected to suffer, since the language is declarative.

dertseha commented 3 years ago

Hello there and thank you for your message! That's a curious idea - I did not know about Kaitai. Glancing over it, I'm a bit hesitant; for one about the last official release being from 2018 (which isn't even a v1.x), and second because I don't know to which degree it is fully capable of mapping the resource format. Most likely it would not be able to cover the resource file compression, I figure.

Do you have more experience with it? Could you provide an example on how you would expect this to look like?

KOLANICH commented 3 years ago

> Glancing over it, I'm a bit hesitant; for one about the last official release being from 2018

We mostly use nightly versions; releases don't have the latest tasty features.

> Most likely it would not be able to cover the resource file compression

It depends a lot.

  1. There are standardized compressors built in, mostly zlib.
  2. There is a process interface. Custom code can be plugged in there.
  3. There is an official library of compressors. For the Python part of it, I have passed through some compression algorithms.
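
As a rough illustration of point 2: in the Python target, a custom processing routine is, as far as I understand it, a class whose instance exposes a `decode(data)` method that takes the raw bytes and returns the processed bytes. The sketch below only shows that shape. The class name `ZlibInflate`, the `max_size` parameter and the use of zlib are assumptions made for illustration; this is not the System Shock compression.

```python
import zlib


class ZlibInflate:
    """Hypothetical custom processor: raw field bytes in, decoded bytes out."""

    def __init__(self, max_size=None):
        # Optional safety limit on the decompressed size (illustrative parameter).
        self.max_size = max_size

    def decode(self, data):
        out = zlib.decompress(data)
        if self.max_size is not None and len(out) > self.max_size:
            raise ValueError("decompressed data exceeds max_size")
        return out


# Stand-in for a compressed field read from a file.
packed = zlib.compress(b"resource payload" * 4)
unpacked = ZlibInflate(max_size=1024).decode(packed)
```

A format-specific decompressor would replace the body of `decode` while keeping the same bytes-in, bytes-out contract.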

> Do you have more experience with it?

I do.

> Could you provide an example on how you would expect this to look like?

We have a repo for ready-to-use specs. Everything is very straightforward - if you have a uint32_t field and a string whose size is specified by that field, you just write:

- id: str_size
  #type: u4le # endianness is usually put into meta
  type: u4
- id: super_str
  type: str
  size: str_size
  #encoding: utf-8 # usually put into meta
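
For readers unfamiliar with the notation, the fragment above describes roughly what the following hand-written Python does. This is only a sketch: little-endian byte order and UTF-8 encoding are assumptions taken from the commented-out lines, and the function name `read_sized_string` is made up for this example.

```python
import io
import struct


def read_sized_string(stream, encoding="utf-8"):
    """Read a little-endian u4 length, then that many bytes as a string."""
    raw_size = stream.read(4)
    if len(raw_size) != 4:
        raise EOFError("stream ended inside the str_size field")
    (str_size,) = struct.unpack("<I", raw_size)
    raw_str = stream.read(str_size)
    if len(raw_str) != str_size:
        raise EOFError("stream ended inside the super_str field")
    return raw_str.decode(encoding)


# A 4-byte length of 5, the string itself, then unrelated trailing bytes.
sample = struct.pack("<I", 5) + b"hello" + b"trailing"
parsed = read_sized_string(io.BytesIO(sample))
```

The declarative version carries the same information in fewer lines and leaves the reading code to the generator.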

winterheart commented 3 years ago

While Kaitai has an excellent declarative syntax, it still doesn't have a serializer generated from *.ksy into native code, so only parsing is available. The LZW compression has its own implementation (obviously it is based on Mark Nelson's classic article "LZW Data Compression" from Dr. Dobb's Journal, but it has its own opcodes and a slightly modified compression algorithm), so a custom plugin would need to be written.
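
For orientation, here is a minimal sketch of textbook LZW decoding over a list of integer codes. It is explicitly not the game's variant: the custom opcodes, the bit packing of codes and the modified algorithm mentioned above are all left out, and the function names are made up for this example. The encoder is included only to produce input for the decoder.

```python
def lzw_decode(codes):
    """Textbook LZW decoding of a list of integer codes into bytes."""
    if not codes:
        return b""
    # Start with one dictionary entry per possible byte value.
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Special case: the code refers to the entry being built right now.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code {code}")
        out += entry
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return bytes(out)


def lzw_encode(data):
    """Matching textbook encoder, used here only to produce test input."""
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = next_code
            next_code += 1
            current = candidate[-1:]
    if current:
        codes.append(table[current])
    return codes


sample = b"TOBEORNOTTOBEORTOBEORNOT"
round_trip = lzw_decode(lzw_encode(sample))
```

A plugin for the actual resource format would additionally have to unpack the codes from the byte stream and handle the format-specific opcodes.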

dertseha commented 3 years ago

Well, it seems applying this would need some work. The easier, though less complete, approach would be bottom-up: document all the internal structures and leave only the resource extraction to the reader.

With less private time available for this project, I have to stay within a budget of how many unknowns I handle at once. So, all in all, it's still a curious idea, and I would be interested if anyone picked this up as a proof of concept. I'll keep it open with a mental note of "help accepted".