nodejs / llparse

Generating parsers in LLVM IR
http://llparse.org

Parsing binary protocols #31

Open arthurschreiber opened 4 years ago

arthurschreiber commented 4 years ago

I was looking through the code and noticed this project does not (yet) support parsing of binary protocols - is that something you want to add support for later? 🙇

indutny commented 4 years ago

Maybe...?

arthurschreiber commented 4 years ago

So I played around with this a bit more.

I added a new node type called Byte that works like a mix between Invoke and Match: it matches any byte and then passes it on to a given code function.

It works something like this:

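import { LLParse } from 'llparse';

const p = new LLParse('example');

p.property('i16', 'value');

// p.byte() is the proposed new node. First byte: store it as the high byte.
const uint16be = p.byte(p.code.store('value'));
uint16be.skipTo(
  // Second byte: value = value * 256 + byte, then loop back for the next value.
  p.byte(p.code.mulAdd('value', { base: 2 ** 8 })).skipTo(uint16be)
);

const artifacts = p.build(uint16be);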
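
As a rough illustration of the arithmetic those two chained nodes would perform, here is a plain JavaScript model that does not use llparse at all. The `store`, `mulAdd`, and `readUint16BE` helpers are hypothetical stand-ins for the generated code, assuming `mulAdd` computes `value * base + byte`:

```javascript
// Plain JavaScript model of the two chained Byte nodes (not llparse code).
// `state` stands in for the parser state that holds the `value` property.
function store(state, byte) {
  state.value = byte;
}

// Assumed semantics: value = value * base + byte.
function mulAdd(state, byte, base) {
  state.value = state.value * base + byte;
}

function readUint16BE(bytes) {
  const state = { value: 0 };
  store(state, bytes[0]);          // first node: high byte
  mulAdd(state, bytes[1], 2 ** 8); // second node: shift up and add low byte
  return state.value;
}

console.log(readUint16BE([0x12, 0x34]).toString(16)); // prints "1234"
```

Big-endian input fits this scheme naturally because the most significant byte arrives first, so each step only needs the running value and the new byte.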

I've yet to figure out the best way to expose reading of values that arrive in little-endian order. 🤔
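
One possible way to think about the little-endian case, sketched in plain JavaScript rather than llparse: each incoming byte has to be scaled by its position, so the reader needs a running scale (or byte index) in addition to the value itself. The `readUintLE` helper below is hypothetical and only models the arithmetic:

```javascript
// Plain JavaScript model of little-endian accumulation (not llparse code).
// Unlike the big-endian case, the weight of each byte depends on its index,
// so extra state (`scale`) must be carried between bytes.
function readUintLE(bytes, byteLength) {
  let value = 0;
  let scale = 1; // 256 ** index of the current byte
  for (let i = 0; i < byteLength; i++) {
    value += bytes[i] * scale;
    scale *= 2 ** 8;
  }
  return value;
}

console.log(readUintLE([0x34, 0x12], 2).toString(16)); // prints "1234"
```

Multiplication is used instead of bit shifts so the model stays correct above 32 bits, up to the safe integer range of JavaScript numbers.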

Anyway, is this how you'd implement binary reading in llparse? Or do you have other ideas for exposing this in a nicer, more consistent way?

arthurschreiber commented 4 years ago

Maybe it would be better to just add two new nodes for reading signed/unsigned LE- and BE-encoded bytes into a field:

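import { LLParse } from 'llparse';

const p = new LLParse('example');

p.property('i8', 'type');
// Declared as 'valueLength' so it matches the field that is read and consumed below.
p.property('i16', 'valueLength');

// Proposed nodes: read an unsigned little-endian integer of N bytes into a field.
const type = p.uIntLE('type', 1);
const valueLength = p.uIntLE('valueLength', 2);

// Read the type, then the length, skip `valueLength` payload bytes, and start over.
type.skipTo(valueLength.skipTo(p.consume('valueLength').skipTo(type)));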

This could automatically check that the field is large enough to read the required number of bytes into it.
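
A minimal sketch of what such a build-time check might look like, in plain JavaScript. The `checkFieldWidth` helper and the `FIELD_BYTES` table are hypothetical, and the sketch assumes the fixed-width property types `i8`, `i16`, `i32`, and `i64`:

```javascript
// Hypothetical build-time check (not part of llparse): verify that a
// property is wide enough to hold the requested number of bytes.
const FIELD_BYTES = new Map([
  ['i8', 1],
  ['i16', 2],
  ['i32', 4],
  ['i64', 8],
]);

function checkFieldWidth(fieldType, byteLength) {
  const capacity = FIELD_BYTES.get(fieldType);
  if (capacity === undefined) {
    throw new Error(`Unknown field type: ${fieldType}`);
  }
  if (byteLength > capacity) {
    throw new Error(
      `Cannot read ${byteLength} byte(s) into ${fieldType} (${capacity} byte(s) wide)`
    );
  }
  return true;
}

checkFieldWidth('i16', 2); // fits
// checkFieldWidth('i8', 2) would throw, since 2 bytes do not fit into i8.
```

Failing at build time rather than at parse time would surface a mismatched field declaration before any C code is generated.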

This would already cover most of my needs, and should be pretty straightforward to implement, I think. @indutny What do you think?