yeslogic / doodle


[lang-model]: `const-enum` Format (`simple`, `common`, `sem`) #192

Open archaephyrryx opened 4 months ago

archaephyrryx commented 4 months ago

While achievable through a modest amount of boilerplating with our current model, it may be useful going forward to have a first-class Format-variant for const enum values: an ad-hoc type consisting of nullary (C-Style enum) variants that are each associated with a specific non-negative integer constant.

This proposal is for something of the form:

pub enum Format {
    /* ... */
    ConstEnum(Box<Format>, BTreeMap<usize, Label>)
}

where the first argument specifies the 'raw' parsing of the integer value, and the BTreeMap encapsulates the associations between each valid integer value and the name of the variant to be constructed in its place, as the Value-level stand-in for the raw magic number.

To provide an easier API, we propose the following helper-function:

pub fn const_enum(raw_format: Format, variants: impl IntoIterator<Item = (usize, Label)>) -> Format {
    Format::ConstEnum(Box::new(raw_format), variants.into_iter().collect())
}

Usage Example (from WIP ELF):

const_enum(base.u8(), [(0, "ClassNone"), (1, "Class32"), (2, "Class64")])

The intended interpretation of this primitive is that it should adhere to the rough operational semantics illustrated below:

Format::ConstEnum(F, Map) :~
   map(
        where_lambda(F, "x", <key "x" is in Map>),
        lambda("y", Expr::Variant(<value associated with key "y">))
    )
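To make the semantics above concrete, here is a minimal, self-contained sketch. It is not doodle's implementation: `Label` and `Value` are simplified stand-ins, the raw format `F` is fixed to "read one byte", and `parse_const_enum` and `elf_class` are hypothetical names introduced only for illustration.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-ins; the real doodle `Label` and `Value` types are richer.
type Label = &'static str;

#[derive(Debug, PartialEq)]
enum Value {
    Variant(Label),
}

/// Toy model of `ConstEnum(F, Map)` with `F` fixed to "read one byte".
/// Returns the constructed variant and the remaining input, or `None` if the
/// input is empty or the raw value is not a key of the map (the `where` step).
fn parse_const_enum<'a>(
    input: &'a [u8],
    variants: &BTreeMap<usize, Label>,
) -> Option<(Value, &'a [u8])> {
    // Raw parse: consume a single byte.
    let (&raw, rest) = input.split_first()?;
    // `where` step: reject values with no associated variant.
    let label = variants.get(&(raw as usize))?;
    // `map` step: replace the raw number with the named variant.
    Some((Value::Variant(*label), rest))
}

/// The ELF class table from the usage example.
fn elf_class() -> BTreeMap<usize, Label> {
    BTreeMap::from([(0, "ClassNone"), (1, "Class32"), (2, "Class64")])
}
```

Under these assumptions, parsing the byte `2` against `elf_class()` yields `Value::Variant("Class64")`, while a byte such as `7` fails to parse because it has no entry in the map.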
mikeday commented 4 months ago

Could the proposed const_enum method be implemented with Map and Match now?

archaephyrryx commented 4 months ago

It would require a fair bit of boilerplate, I should expect.

archaephyrryx commented 4 months ago

But technically achievable.

archaephyrryx commented 4 months ago

Having thought about this a bit more, it may be difficult to actually get this working using existing primitives for the following reason:

Once the Map is interpreted/evaluated, the original value of the parse is lost, so any numerically-oriented operations will fail, leaving Match as the sole avenue of value-introspection. This may be a bit annoying to deal with, especially when the enum is large.

It may be appropriate to include, in any change-set that adds Format::ConstEnum, a corresponding

pub enum Value {
    /* ... */
    NamedConst(Label, Box<Value>),
}

that we can pattern-match using either the Label or the inner value, but that is otherwise implicitly converted to the inner Value prior to any arithmetic or IntRel operations.
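A rough sketch of how such an implicit conversion might behave, using a reduced `Value` type. The `U8` variant and the `coerce_raw`, `label`, and `int_eq` helpers are hypothetical and exist only to illustrate the idea; they are not part of doodle.

```rust
// Hypothetical, simplified `Value`; only the pieces needed for the sketch.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    U8(u8),
    NamedConst(&'static str, Box<Value>),
}

impl Value {
    /// Strips any `NamedConst` wrappers to reach the underlying raw value.
    fn coerce_raw(&self) -> &Value {
        match self {
            Value::NamedConst(_, inner) => inner.coerce_raw(),
            other => other,
        }
    }

    /// Label of the outermost `NamedConst`, if there is one.
    fn label(&self) -> Option<&'static str> {
        match self {
            Value::NamedConst(label, _) => Some(*label),
            _ => None,
        }
    }
}

/// Toy stand-in for an `IntRel`-style equality that unwraps both operands first.
fn int_eq(a: &Value, b: &Value) -> bool {
    a.coerce_raw() == b.coerce_raw()
}
```

With this shape, a `NamedConst("Class64", U8(2))` compares equal to a plain `U8(2)` under `int_eq`, while its label remains available for pattern-matching.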

mikeday commented 4 months ago

I'd be reluctant to add C-style enums (that can be used interchangeably with integers) without a good reason. Do we have examples of enum values that we also need to perform arithmetic on?

archaephyrryx commented 4 months ago

Good question. A common pattern we might want is to compare a value against a given Variant using == or !=, in which case we wouldn't get the right behavior. This could, of course, be achieved using Match instead, but it might be worth providing some mechanism for it.

archaephyrryx commented 4 months ago

(Since IntRel only works on Base-typed values, and not Variants)

archaephyrryx commented 4 months ago

We could add in a notion of structural-equality and -inequality, but that might be best achieved using Expr->Pattern conversions that then do a trivial Expr::Match.

archaephyrryx commented 4 months ago

We could have a helper:

/// Expression that evaluates to `true` if the given Expr is any Variant with name `varname`, `false` otherwise.
pub fn is_variant(x: Expr, varname: impl IntoLabel) -> Expr {
    // Recursive positions are boxed, as with Box<Format> and Box<Value> above.
    Expr::Match(
        Box::new(x),
        vec![
            (Pattern::Variant(varname.into(), Box::new(Pattern::Wildcard)), Expr::Bool(true)),
            (Pattern::Wildcard, Expr::Bool(false)),
        ],
    )
}
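To check that the helper behaves as described, here is a self-contained toy evaluator. The `Expr`, `Pattern`, and `Value` types below are heavily reduced, hypothetical versions of doodle's (for example, `Expr::Const` and the `matches`/`eval` functions are assumptions made for this sketch), and `varname` takes a plain `&'static str` instead of `impl IntoLabel`.

```rust
// Hypothetical, heavily reduced versions of doodle's `Expr`, `Pattern` and `Value`.
type Label = &'static str;

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Bool(bool),
    U8(u8),
    Variant(Label, Box<Value>),
}

enum Pattern {
    Wildcard,
    Variant(Label, Box<Pattern>),
}

enum Expr {
    Const(Value),
    Bool(bool),
    Match(Box<Expr>, Vec<(Pattern, Expr)>),
}

/// Returns `true` if `val` has the shape described by `pat`.
fn matches(pat: &Pattern, val: &Value) -> bool {
    match (pat, val) {
        (Pattern::Wildcard, _) => true,
        (Pattern::Variant(want, inner_pat), Value::Variant(got, inner_val)) => {
            want == got && matches(inner_pat, inner_val)
        }
        _ => false,
    }
}

/// Evaluates an expression; `Match` takes the first arm whose pattern matches.
fn eval(expr: &Expr) -> Value {
    match expr {
        Expr::Const(v) => v.clone(),
        Expr::Bool(b) => Value::Bool(*b),
        Expr::Match(head, arms) => {
            let v = eval(head);
            let (_, body) = arms
                .iter()
                .find(|(pat, _)| matches(pat, &v))
                .expect("non-exhaustive match");
            eval(body)
        }
    }
}

/// Same shape as the helper above, adapted to the toy types.
fn is_variant(x: Expr, varname: Label) -> Expr {
    Expr::Match(
        Box::new(x),
        vec![
            (Pattern::Variant(varname, Box::new(Pattern::Wildcard)), Expr::Bool(true)),
            (Pattern::Wildcard, Expr::Bool(false)),
        ],
    )
}
```

In this model, `is_variant` evaluates to `true` only for a `Variant` carrying the requested name. Any other variant, or a non-variant value such as a bare integer, falls through to the wildcard arm and yields `false`.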
mikeday commented 4 months ago

Yep that's a good approach I think.

archaephyrryx commented 2 months ago

Even without necessarily generating const-enum definitions in generated code, we may want to start signalling the semantics of the expected values of a given numeric format-token with strings or identifiers at the output layer; similar to the proposal for a type-preserving wrapper that merely attaches content of a given type (whether string or enum) to a Value as it is processed, we could have type-erased annotations that augment the display output without disrupting the computation model.
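One possible shape for such display-only annotations, as a sketch: the `Annotations` type and its `render` method are hypothetical names, not an existing doodle API. The point is only that the numeric value stays a plain integer for computation and the name is looked up at output time.

```rust
use std::collections::BTreeMap;

/// Hypothetical display-only annotation table: raw integer -> human-readable name.
struct Annotations(BTreeMap<usize, &'static str>);

impl Annotations {
    /// Renders `raw` together with its name when one is known.
    /// The value itself is never wrapped or changed, so arithmetic and
    /// comparisons elsewhere continue to see an ordinary integer.
    fn render(&self, raw: usize) -> String {
        match self.0.get(&raw) {
            Some(name) => format!("{} ({})", raw, name),
            None => raw.to_string(),
        }
    }
}
```

For instance, with a table mapping 1 to "Class32" and 2 to "Class64", rendering the value 2 would show both the number and the name, while an unlisted value would be printed as a bare number.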