ta0kira / zeolite

Zeolite is a statically-typed, general-purpose programming language.
Apache License 2.0

Add support for enumerations #205

Open ta0kira opened 11 months ago

ta0kira commented 11 months ago

Not having them will become a usability issue when trying to support C libraries and state machines.

They don't necessarily need to be numbered, despite what the name suggests. They should be somewhat similar to enum class in C++ and enum in Java.

  1. Values should be immutable.
  2. Values should have a specific type.
  3. Functionality should be extendable.
  4. Must be easy to construct.
  5. Unlike Java and C++, shouldn't have default behavior, since Zeolite doesn't do default functionality.
  6. Unlike C++, no guarantee that all possible values are listed in the type. (For example, the type could enumerate bit flags but also allow combining them.)

This effectively means that they need to be streamlined @category variables. Also, they don't need to be called "enum"s. Maybe @static?
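
For comparison, here is a rough C++ sketch of points 1, 2, 5 and 6: named values of a specific immutable type, explicit rather than default comparison, and combinations that are valid without being listed. OpenMode, READ, WRITE and Contains are made-up names for illustration only, not anything from Zeolite or a real library.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag type: READ and WRITE are the named values, but any
// combination of them is also a valid OpenMode (point 6).
class OpenMode {
 public:
  static const OpenMode READ;
  static const OpenMode WRITE;

  // Combining produces a new value; existing values never change (point 1).
  OpenMode operator|(const OpenMode& other) const {
    return OpenMode(bits_ | other.bits_);
  }

  // True if every bit set in other is also set in this value.
  bool Contains(const OpenMode& other) const {
    return (bits_ & other.bits_) == other.bits_;
  }

  // Comparison is written out explicitly rather than generated (point 5).
  bool operator==(const OpenMode& other) const { return bits_ == other.bits_; }

 private:
  // Private so that only the type itself decides how values are built.
  explicit OpenMode(std::uint32_t bits) : bits_(bits) {}

  const std::uint32_t bits_;
};

const OpenMode OpenMode::READ(1u << 0);
const OpenMode OpenMode::WRITE(1u << 1);
```

With this, OpenMode::READ | OpenMode::WRITE is a legitimate value that has no name of its own, which is the situation point 6 is meant to allow.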


Maybe something like this for syntax:

concrete MyType<#x> {
  immutable

  // No default comparisons because some values could be synonyms.
  defines Equals<MyType<#x>>

  #x immutable

  // Type substitution required, since they're @category scope.
  // Unclear what the syntax should be, but putting params first makes grouping by type clearer.
  @static<Int> _VALUE1
  @static<String> _VALUE2
}

define MyType {
  @value #x value

  // Use a statement like with @category variables, but scoped within the @type.
  _VALUE1 <- #self{ 1 }
  _VALUE2 <- #self{ "VALUE2" }

  equals (x,y) {
    // See https://github.com/ta0kira/zeolite/issues/204.
    return instance(x) `Instance<any>.equals` instance(y)
  }
}

// ...

\ something(MyType:_VALUE1)

The leading _ is for easier parsing.

  1. UPPER_SNAKE would cause problems with unqualified calls, e.g., in VALUE.foo(), is VALUE a poorly-named type or an @enum?
  2. Should still contain UPPER_SNAKE as a convention, so it really just needs a leading character.
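
To make the parsing argument concrete, here is a small C++ sketch of classifying a token by its first character alone. It assumes that category names start with an uppercase letter and params start with #, as in the examples above; TokenKind and ClassifyToken are hypothetical names and this is not how the Zeolite parser is actually written.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical token kinds, for illustration only.
enum class TokenKind { kStaticValue, kCategoryName, kParamName, kOther };

// Decides the kind of a token from its first character alone.
TokenKind ClassifyToken(const std::string& token) {
  if (token.empty()) return TokenKind::kOther;
  const unsigned char first = static_cast<unsigned char>(token[0]);
  if (first == '_') return TokenKind::kStaticValue;   // e.g., _VALUE1
  if (first == '#') return TokenKind::kParamName;     // e.g., #x
  if (std::isupper(first)) return TokenKind::kCategoryName;  // e.g., MyType
  return TokenKind::kOther;
}
```

Without the leading character, VALUE lands in the same bucket as MyType, so the parser would need more context to tell them apart.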

Some questions:

  1. How to handle params during init? It would be cleaner to allow #self during init, but that might not work with ProcedureContext, since it assumes that in @type scope the code will live in the @type.
  2. How to specify params during declaration? It's weird to specify just the params and not the type (perhaps also misleading; @static<Int> seems like an Int value), but specifying the whole type implies that some other category can be used.
  3. Given that, maybe @static is also misleading. Maybe @enum would mean that we could enumerate the instances if we wanted to but that we're not going to.
  4. Maybe any immutable type should be allowed, but it will have the containing type by convention?

    concrete Signal {
      immutable
      @static Int _SIGTERM
      @static Signal _ALSO_SIGTERM
    }

    define Signal {
      _SIGTERM <- 15
      _ALSO_SIGTERM <- Signal{ }
    }

ta0kira commented 11 months ago

In the generated C++, it would be better to provide direct access rather than dispatching a function call.

// Category_Signal.hpp

// Function instead of extern so that it can be lazily initialized.
BoxedValue Static_Signal_SIGTERM();

// Category_Signal.cpp

struct Category_Signal : public TypeCategory {
  LazyInit<BoxedValue> Static_SIGTERM;
};

BoxedValue Static_Signal_SIGTERM() {
  return CreateCategory_Signal().Static_SIGTERM.Get();
}

It's unclear how to handle initialization in C++ extensions, though.

Maybe add virtual BoxedValue Static_SIGTERM() const = 0; to Category_Signal so that ExtCategory_Signal needs to define it, then call that in LazyInit? That would be better than just allowing definition of BoxedValue Static_Signal_SIGTERM() directly because it would ensure that the same value is always returned.
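
A self-contained sketch of that idea, under heavy assumptions: BoxedValue and LazyInit below are simplified stand-ins written for this example, not the real runtime types, and the virtual function is named Create_SIGTERM instead of Static_SIGTERM only because a C++ class can't have a member variable and a member function with the same name. The lazy initialization here is also not thread-safe.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Stand-in for the runtime's BoxedValue (assumption for this sketch).
using BoxedValue = std::shared_ptr<const int>;

// Simplified stand-in for LazyInit: calls the factory on first Get() and
// caches the result. Not thread-safe.
template <class T>
class LazyInit {
 public:
  explicit LazyInit(std::function<T()> create) : create_(std::move(create)) {}

  const T& Get() {
    if (!ready_) {
      value_ = create_();
      ready_ = true;
    }
    return value_;
  }

 private:
  std::function<T()> create_;
  bool ready_ = false;
  T value_{};
};

struct Category_Signal {
  virtual ~Category_Signal() = default;

  // The factory is only invoked from Get(), i.e., after construction, so the
  // virtual call resolves to the extension's override.
  LazyInit<BoxedValue> Static_SIGTERM{[this] { return Create_SIGTERM(); }};

 protected:
  // The extension must define this; it's called at most once.
  virtual BoxedValue Create_SIGTERM() const = 0;
};

// Plays the role of the hand-written C++ extension.
struct ExtCategory_Signal : public Category_Signal {
  mutable int calls = 0;  // Only here to show how often the factory runs.

  BoxedValue Create_SIGTERM() const override {
    ++calls;
    return std::make_shared<const int>(15);
  }
};

ExtCategory_Signal& CreateCategory_Signal() {
  static ExtCategory_Signal category;
  return category;
}

BoxedValue Static_Signal_SIGTERM() {
  return CreateCategory_Signal().Static_SIGTERM.Get();
}
```

Because the extension only supplies the factory and the accessor always goes through LazyInit, repeated calls to Static_Signal_SIGTERM() hand back the same boxed value.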

ta0kira commented 11 months ago

Also, call optimization will need to choose which way to access Static_SIGTERM.

Static_SIGTERM                 // from category
parent.Static_SIGTERM          // from type
parent->parent.Static_SIGTERM  // from value
Static_Signal_SIGTERM()        // everywhere else
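
Here is a toy C++ layout showing why the four spellings differ. It assumes that a type holds a reference to its category and a value holds a pointer to its type, which is what the expressions above suggest; Type_Signal, Value_Signal and the From* functions are made up for this sketch, and a plain int stands in for the lazily initialized BoxedValue.

```cpp
#include <cassert>

struct Category_Signal {
  int Static_SIGTERM = 15;  // Stand-in for LazyInit<BoxedValue>.

  // From category scope the member is directly visible.
  int FromCategory() const { return Static_SIGTERM; }
};

struct Type_Signal {
  Category_Signal& parent;

  // From type scope, go through the owning category.
  int FromType() const { return parent.Static_SIGTERM; }
};

struct Value_Signal {
  Type_Signal* parent;

  // From value scope, go through the type and then the category.
  int FromValue() const { return parent->parent.Static_SIGTERM; }
};

Category_Signal& CreateCategory_Signal() {
  static Category_Signal category;
  return category;
}

// Everywhere else: no parent to follow, so use the free function.
int Static_Signal_SIGTERM() {
  return CreateCategory_Signal().Static_SIGTERM;
}
```

All four paths reach the same member; the first three just skip the CreateCategory_Signal() lookup that the free function has to do.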

ta0kira commented 11 months ago

I was thinking that we could just skip the leading _ (etc.) and allow an ambiguous parse, but then we'd need to handle it in at least 4 contexts:

  1. Qualified, e.g., return Foo:BAR. (New ExpressionStart?)
  2. Unqualified without a function call, e.g., return BAR or return BAR?Baz. (2nd new ExpressionStart?)
  3. Unqualified with a function call, e.g., return BAR.foo(). (TypeCall?)
  4. Unqualified with a function call and _, e.g., return BAR_1.foo(). (2nd new ExpressionStart?)

Another option is to always require it to be qualified.

ta0kira commented 11 months ago

I guess it would also make sense to allow internal values for completeness.

concrete Foo {
  immutable

  @static Foo VALUE
}

define Foo {
  VALUE <- Foo{ }

  @static Foo DEFAULT
  DEFAULT <- Foo{ }
}