yeslogic / doodle

[lang-model]: Format primitive for working with buffer-offsets (`simple`, `narrow`, `expr`) #195

Closed archaephyrryx closed 4 months ago

archaephyrryx commented 4 months ago

In formats like ELF that use absolute (or rather, start-of-buffer-relative) offsets to designate where certain blocks of data appear within a buffer, we need some way of parsing something at an offset relative to a location other than 'where we are right now'. The easiest way of doing so would be to allow for a Format primitive that parses zero bytes of input but returns an appropriately wide unsigned integer indicating how far into the buffer we are at the point where it is parsed.

Prospectively, we could call this Format::Pos, and use it as follows, with the accompanying expected results of parse-evaluation:

decode_buffer_as_format(tuple(vec![Format::Pos, ...]), buf); // Pos should yield 0
decode_buffer_as_format(tuple(vec![Format::Byte(...), Format::Pos, ...]), buf); // Pos should yield 1

Because Format::Pos is specifically designed to consume zero bytes, it is idempotent: calling it repeatedly without reading any data in between should always yield the same answer. Calling it with no preceding formats at the start of the buffer should yield 0; otherwise it should yield the number of bytes consumed by all preceding formats up to that point.
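To make the expected semantics concrete, here is a minimal sketch of a toy decoder with a zero-width `Pos` primitive. The `Format`, `Value`, and `decode` definitions below are hypothetical stand-ins written for illustration; they are not doodle's actual types or its `decode_buffer_as_format` entry point, and the choice of `u64` for the offset is an assumption.

```rust
/// Toy stand-in for the format language (hypothetical, not doodle's `Format`).
#[derive(Debug, Clone)]
enum Format {
    /// Consumes exactly one byte.
    Byte,
    /// Consumes nothing; yields the current offset from the start of the buffer.
    Pos,
    /// Decodes each sub-format in sequence.
    Tuple(Vec<Format>),
}

/// Toy decoded values (hypothetical). Offsets are assumed to be `u64`.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    U8(u8),
    U64(u64),
    Tuple(Vec<Value>),
}

/// Decodes `format` from `buf` starting at `offset`.
/// Returns the decoded value and the offset after decoding,
/// or `None` if the input ends too early.
fn decode(format: &Format, buf: &[u8], offset: usize) -> Option<(Value, usize)> {
    match format {
        Format::Byte => {
            let b = *buf.get(offset)?;
            Some((Value::U8(b), offset + 1))
        }
        // Zero-width: report the offset and leave it unchanged.
        Format::Pos => Some((Value::U64(offset as u64), offset)),
        Format::Tuple(items) => {
            let mut values = Vec::with_capacity(items.len());
            let mut cur = offset;
            for item in items {
                let (v, next) = decode(item, buf, cur)?;
                values.push(v);
                cur = next;
            }
            Some((Value::Tuple(values), cur))
        }
    }
}
```

In this sketch, decoding `Tuple(vec![Pos, Pos])` from offset 0 gives the same offset for both elements and advances nothing, while `Tuple(vec![Byte, Pos])` reports an offset of 1 for `Pos`.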

Alternatively, we could approach the design with a model of

Format::BindOffset(Label, Box<Format>)

where we expect the following behavior:

Decode(BindOffset(varname, inner))
==
Decode(
    map(
        record(
            vec![(varname, Format::Pos), ("evaluated", inner)]
        ),
        lambda("x", tuple_proj(var("x"), "evaluated"))
    )
)

This would eschew the need for an explicit Pos token, either by fusing its binding and usage, or by creating a persistent binding when applied in the following fashion:

Format::BindOffset("x", Compute(var("x")))

which would emulate the behavior of the Format::Pos defined in the previous half of this proposal.
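As a rough illustration of the second design, the sketch below models `BindOffset` in a toy decoder that carries a variable scope, and recovers `Pos` as `BindOffset("x", Compute(var("x")))`. All types and helpers here (`Expr`, `Format`, `Value`, `Scope`, `eval`, `decode`, `pos`) are hypothetical and simplified; in particular, the scoping rule (the binding is visible only while decoding the inner format) is an assumption, not something the proposal specifies.

```rust
/// Toy expression language (hypothetical): only variable lookup.
#[derive(Debug, Clone)]
enum Expr {
    Var(String),
}

/// Toy format language (hypothetical, not doodle's `Format`).
#[derive(Debug, Clone)]
enum Format {
    /// Consumes exactly one byte.
    Byte,
    /// Consumes nothing; yields the value of an expression.
    Compute(Expr),
    /// Binds the current offset to a name, then decodes the inner format.
    BindOffset(String, Box<Format>),
    /// Decodes each sub-format in sequence.
    Tuple(Vec<Format>),
}

#[derive(Debug, Clone, PartialEq)]
enum Value {
    U8(u8),
    U64(u64),
    Tuple(Vec<Value>),
}

/// Innermost binding is last; lookups search from the end.
type Scope = Vec<(String, Value)>;

fn eval(expr: &Expr, scope: &Scope) -> Option<Value> {
    match expr {
        Expr::Var(name) => scope
            .iter()
            .rev()
            .find(|(n, _)| n == name)
            .map(|(_, v)| v.clone()),
    }
}

fn decode(format: &Format, buf: &[u8], offset: usize, scope: &mut Scope) -> Option<(Value, usize)> {
    match format {
        Format::Byte => {
            let b = *buf.get(offset)?;
            Some((Value::U8(b), offset + 1))
        }
        Format::Compute(expr) => Some((eval(expr, scope)?, offset)),
        Format::BindOffset(name, inner) => {
            // Assumption: the binding lives only for the inner format.
            scope.push((name.clone(), Value::U64(offset as u64)));
            let result = decode(inner, buf, offset, scope);
            scope.pop();
            result
        }
        Format::Tuple(items) => {
            let mut values = Vec::with_capacity(items.len());
            let mut cur = offset;
            for item in items {
                let (v, next) = decode(item, buf, cur, scope)?;
                values.push(v);
                cur = next;
            }
            Some((Value::Tuple(values), cur))
        }
    }
}

/// `Pos` emulated via `BindOffset`, as described above.
fn pos() -> Format {
    Format::BindOffset(
        "x".to_string(),
        Box::new(Format::Compute(Expr::Var("x".to_string()))),
    )
}
```

Under these assumptions, `pos()` consumes no input and yields the offset at which it was decoded, so it behaves like the `Pos` primitive without needing a dedicated token.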