kaitai-io / kaitai_struct

Kaitai Struct: declarative language to generate binary data parsers in C++ / C# / Go / Java / JavaScript / Lua / Nim / Perl / PHP / Python / Ruby
https://kaitai.io

List of supported encodings #116

Open GreyCat opened 7 years ago

GreyCat commented 7 years ago

Transferring question from https://github.com/kaitai-io/kaitai_struct_formats/issues/5:

https://github.com/kaitai-io/kaitai_struct/wiki/Expressions has this example:

seq:
  - id: filename_len
    type: u4
  - id: filename
    type: str
    size: filename_len
    encoding: UTF-8

Other than UTF-8, which encodings are supported? Would be useful to have this list documented somewhere :)

My original reply:

Unfortunately, there's no list of encodings per se in Kaitai Struct. Most of the time, the encoding ID is passed as-is to the target language, so support ultimately depends on the target language.

For example:
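As one illustration of the pass-through behaviour described above, here is a minimal Python sketch. It assumes the encoding string from the .ksy file reaches the runtime's decoder unchanged. The helper names `decode_field` and `is_known_to_python` are hypothetical and not part of any Kaitai Struct runtime.

```python
import codecs


def decode_field(raw: bytes, encoding: str) -> str:
    # Hypothetical helper: hand the encoding name from the .ksy file
    # straight to Python's decoder, with no translation in between.
    return raw.decode(encoding)


def is_known_to_python(encoding: str) -> bool:
    # Python raises LookupError for names its codec registry does not know,
    # so whether a given name "is supported" is decided by the target language.
    try:
        codecs.lookup(encoding)
    except LookupError:
        return False
    return True
```

With this approach, a name such as `UTF-8` works because Python happens to recognise it. A name that only another target language accepts would fail here at decode time.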

It would be a good idea to define a KS "standard" set of encoding names and a hard mapping table for every supported language, but that's a fair amount of work and, as of now, passing the name through still works in 99% of cases, I guess.
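The mapping table idea could look roughly like the sketch below. This is only an assumption about its shape, not something that exists in Kaitai Struct. The canonical names, the table contents, and the function `resolve_encoding` are all hypothetical, and the per-target spellings are illustrative and have not been checked against the Kaitai runtimes.

```python
# Hypothetical table: canonical KS-level name -> spelling used by each target.
# The entries are illustrative, not an agreed standard.
ENCODING_TABLE = {
    "UTF-8": {"python": "utf-8", "java": "UTF-8", "ruby": "UTF-8"},
    "UTF-16LE": {"python": "utf-16-le", "java": "UTF-16LE", "ruby": "UTF-16LE"},
    "SJIS": {"python": "shift_jis", "java": "Shift_JIS", "ruby": "Shift_JIS"},
}


def resolve_encoding(canonical: str, target: str) -> str:
    # Look up the target-specific name and fail loudly at compile time
    # instead of letting an unknown name reach the generated parser.
    try:
        return ENCODING_TABLE[canonical.upper()][target]
    except KeyError:
        raise ValueError(
            f"no mapping for encoding {canonical!r} on target {target!r}"
        ) from None
```

A compiler using such a table could reject an unsupported encoding up front, for example `resolve_encoding("SJIS", "java")` would give the Java spelling while an unlisted name raises `ValueError`.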

GreyCat commented 7 years ago

Just for the record: I've started collecting information on the current situation with encodings, and, well, it's a mess and it's huge.

Here it is: https://docs.google.com/spreadsheets/d/1l87kGi9_U4Xrgaw2CGaTc9-_f5UEf1nf-68Dk_e1_iA/edit?usp=drive_web

If anyone wants to continue this work, please let me know.

arekbulski commented 6 years ago

There is a related issue about UTF-16 and UTF-32 specifically: #187