teal-language / tl

The compiler for Teal, a typed dialect of Lua
MIT License

Allow iterating all the fields of an enum with ipairs or similar #205

Open Ruin0x11 opened 3 years ago

hishamhm commented 3 years ago

That would imply reifying the enum in the generated Lua code as an array of strings, which would have the side effect of assigning a numeric index to each entry. I can see how some people would like that, but it would also logically lead to people wanting to assign arbitrary indices to it and also for enum entries to map back to numbers (as suggested in #189) in the other direction as well... this would have a shockwave of implications in the type system, since enums would become yet another table type, effectively, and then all its interactions with other table types would have to be understood and encoded.
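As a rough illustration of the reification described above, here is a sketch in plain Lua. It is not actual tl output, and the `Direction` name and its entries are hypothetical. Emitting an enum as an array of strings would give every entry a numeric index as a side effect:

```lua
-- Hypothetical reified form of a Teal enum; tl does not generate this today.
local Direction = { "north", "south", "east", "west" }

-- ipairs visits entries in declaration order, exposing the numeric index.
local seen = {}
for i, name in ipairs(Direction) do
   seen[#seen + 1] = i .. "=" .. name
end

local listing = table.concat(seen, ", ")
print(listing)
```

Once those indices are observable from user code, requests to control them or to map entries back to numbers follow naturally, which is the concern raised in the comment.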

lenscas commented 3 years ago

Maybe it's useful to have 2 enum types? One is the current one, meant for when you just need to restrict strings. The other is more powerful, but requires the user to write their own function to check the current variant.

If that exists then they have the potential to do the following (depending on how far these would be developed):

Splitting enums up like this wouldn't have big effects on the type system, as it adds a new type rather than changing an existing one from string to table. It also doesn't force users to always write their own check function if they just want to work with strings, and it doesn't suddenly emit a lot of Lua code out of nowhere, since users write the check function themselves when they make use of the new type.

euclidianAce commented 3 years ago

I'm somewhat in favor of implementing some sort of enumset where the enum just gets generated into a simple set like

```teal
local enumset Foo
   "bar"
   "baz"
end
```

turns into

```lua
local Foo = {
   ["bar"] = true,
   ["baz"] = true,
}
```

which should be real easy to generate since the strings are already checked for validity: it basically just needs ('[%s] = true,'):format(token) for each entry. The set can then be used for generating better code for is, where you can say stuff like

```teal
local function doSomething(s: string)
   if s is Foo then --> if Foo[s] then
      -- ...
   else
      -- ...
   end
end
```

(which is kind of a first step towards the custom is implementation talked about in the meetup). Often when I'm using an enum I want a way of checking membership at runtime, but I end up repeating myself: I write out the enum and then write out the same entries again as a set to check against:

```teal
local enum Foo
   "bar"
   "baz"
end
local isFoo: {Foo:boolean} = {
   ["bar"] = true,
   ["baz"] = true,
}
```

This doesn't quite turn enums into a table type. Adding a way to iterate over the fields would be nice too, but I can't really think of a way to do it that doesn't just look like a macro for pairs: for field in <something> do

I bring this up because I'm trying to implement the warning-disabling flags for tl, where warnings get tagged with an enum member, and I'd like an easy way to show the user each warning tag so they can put it in their config/makefile/whatever, without having to repeat myself as shown above.
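A minimal Lua sketch of the warning-tag use case, assuming the set table is available at runtime (today it would be written by hand; under the proposal it would be generated). The `WarningTag` name, the tag strings, and the `sorted_members` helper are all hypothetical:

```lua
-- Hand-written stand-in for the set the enumset proposal would generate.
-- The tag names are made up for illustration.
local WarningTag = {
   ["unused"] = true,
   ["redeclaration"] = true,
   ["branch"] = true,
}

-- pairs does not guarantee any order, so collect the keys and sort them
-- to get stable output for help text or documentation.
local function sorted_members(set)
   local names = {}
   for name in pairs(set) do
      names[#names + 1] = name
   end
   table.sort(names)
   return names
end

local tags = sorted_members(WarningTag)
print(table.concat(tags, ", "))
```

A set carries no declaration order, so any listing built this way has to pick its own ordering. An array-based representation would preserve order, but would bring back the numeric indices discussed earlier in the thread.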

hishamhm commented 3 years ago

Both are interesting (and very different) proposals!

@lenscas, what you are proposing is essentially a form of sum type, a.k.a. tagged union (and so many other names). I always found it weird that Rust calls that "enum" though I can squint and see why — and I think it has to do with Rust's ML origins, maybe? Adding tagged unions to Teal is still very much on the cards.

@euclidianAce, what you described could even fit nicely as an extension of the current enum type, and maybe it wouldn't even need to be a new enumset type but just make it the new version of enum?

(However, if we add literal types and turn enums into union types like local type Direction = "north" | "south" | "east" | "west", which is arguably more elegant from a language simplicity standpoint, then the ability of iterating and the simple implementation of is would be lost, if we consider more complex union types mixing literals and non-literals.)
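To make the trade-off concrete, here is a hedged Lua sketch of what runtime checks might look like in each case. How Teal would actually compile such unions is an assumption, and the `Size` union (`"auto" | "none" | number`) is hypothetical:

```lua
-- Pure union of literals: membership is a single table lookup,
-- and the members can be enumerated from the same table.
local Direction = { north = true, south = true, east = true, west = true }

local function is_direction(v)
   return Direction[v] == true
end

-- Hypothetical mixed union: "auto" | "none" | number.
-- The check needs a type test as well as a lookup, and there is no
-- finite list of members to iterate over.
local SizeLiterals = { auto = true, none = true }

local function is_size(v)
   return type(v) == "number" or SizeLiterals[v] == true
end

print(is_direction("north"), is_size(3), is_size("wide"))
```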

I've been thinking about this recently. I think the key point behind the code duplication you bring up is that you want to share information between the world of types and the world of values. One could even say that iterating over the enum definition is a simple form of reflection: a similar argument could be made about iterating over the field names of a record definition.

The options you enumerated are very much the ones on my mind too: compile-time functions is the "metaprogramming" option I mentioned at the meetup; fieldsof Foo is a TypeScript-like approach (where you have things like keyof T to construct types out of other types; and here you'd want to construct a value out of a type — sizeof in C is a similar precedent); and allowing pairs on the set object means treating the type object as a Lua table, which I mentioned in my comment above, if not the instances themselves.

Right now, I think the metaprogramming option is the most powerful, especially because it can deal with future feature requests (ordered sets? mapped types?), but I can see how it can be the scariest and also turn into another fruitless rabbit hole (metaprogramming API design! we don't even have a stable AST yet).

Your enumset suggestion, possibly allowing pairs as a special case, would be the most pragmatic one. I think the only con about it is that it would make it more difficult to eventually move away from it and into unions-of-literals. I can imagine how to deal with is, since unions are already full of special cases, but "iterating on the elements", which is your motivation, could feel odd in a union-of-literals, unless it was something like "iterate on the literals of the union".

Looking at tl.tl I see we have a bunch of {string:boolean} sets that are constant (though I don't know if that's super typical — a common mistake in language design is to end up designing a great language to write compilers in :laughing: ). But I guess it could be nice to turn these into enums!

lenscas commented 3 years ago

May I suggest that it may be beneficial to not care too much about backwards compatibility? Sure, it's nice, but Teal hasn't reached 1.0.0 yet. To me that suggests that the language is not done yet and things can and will change between versions.

So adding certain types and then removing or changing them in a non-backwards-compatible way should be expected. It wouldn't surprise me if lots of languages looked very different in their pre-1.0.0 days compared to after.

Maybe it's worth giving Teal a way to mark certain features as experimental and require opt-in. That way it's clear which features feel "done"/good enough and thus won't change, and which features still require feedback to get right, while still making it fairly easy for people to actually use them.

That should make it easier to just try features out without worrying about backwards compatibility, while the users who really don't want to deal with that have an easy way to avoid them.

> Looking at tl.tl I see we have a bunch of {string:boolean} sets that are constant (though I don't know if that's super typical; a common mistake in language design is to end up designing a great language to write compilers in :laughing:). But I guess it could be nice to turn these into enums!

Python has enums that can map to any value (https://docs.python.org/3/library/enum.html), which to me suggests that there is some demand for enums that are not just different strings. Then again, Python also has for-else loops, so...

Ruin0x11 commented 3 years ago

To elaborate on my original use case for this feature, I wanted to create a visitor for a Teal AST by assigning a name in the AST node types enum to a callback that operates on nodes of that type. When I constructed the visitor I would iterate over the table of node names and check the passed-in visitor callbacks to see if one was present under that name, to make sure it was a valid node type. I couldn't find a clean way of doing this except by copying the enum definition to a table, copying it to a set to check for validity by inclusion, or copying it to a record with fields for each enum variant.

Also, because the node types enum was located in an external library (the Teal compiler), if the "two types of enums" proposal were adopted and I wanted to use that enum for checking validity by inclusion, then I would be out of luck if it was not defined as an enumset. Maybe that should be considered also, since other library authors might want to do different things with the same definition.
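A sketch of the validity-by-inclusion check described in this comment, written in plain Lua with a hand-copied set. The node kind names and the `unknown_kinds` helper are hypothetical and are not taken from the tl source:

```lua
-- Hand-copied set of node kinds; this is the duplication the issue
-- would like to avoid. The names are illustrative only.
local NodeKind = {
   ["if"] = true,
   ["while"] = true,
   ["call"] = true,
}

-- Returns a sorted list of callback keys that are not valid node kinds.
local function unknown_kinds(callbacks)
   local bad = {}
   for kind in pairs(callbacks) do
      if not NodeKind[kind] then
         bad[#bad + 1] = kind
      end
   end
   table.sort(bad)
   return bad
end

local bad = unknown_kinds({
   ["if"] = function() end,
   ["whlie"] = function() end, -- misspelled on purpose; should be reported
})
print(#bad, bad[1])
```

If the enum definition lives in another library, the hand-copied `NodeKind` table has to be kept in sync manually, which is the maintenance cost the comment points out.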

hishamhm commented 3 years ago

> may I suggest that it may be beneficial to not care too much about backwards compatibility? Sure, its nice but teal hasn't reached 1.0.0 yet. To me that suggest that the language is not done yet and things can and will change between versions.

Ah, don't worry about that! My concern there wasn't really about not breaking compatibility, but really about retaining functionality when we break compatibility, trying to think two steps ahead. We have been unafraid of breaking compatibility (I just committed a breaking change today!), but at the same time I don't want to leave users "stranded" if we change a feature and it removes functionality. But yeah, on this topic, I'm now leaning more towards being pragmatic.

hishamhm commented 3 years ago

> I wanted to create a visitor for a Teal AST by assigning a name in the AST node types enum to a callback that operates on nodes of that type. When I constructed the visitor I would iterate over the table of node names and check the passed in visitor callbacks to see if one was present under that name, to make sure it was a valid node type.

@Ruin0x11 just asking for one more clarification: when you say "to make sure it was a valid node type", was your data structure indexed by strings? Could it be indexed by the enum type instead, so that there's no need to check if it's a valid node type? (Because the compiler would only allow valid enum values anyway.) In my visitors in the compiler, for example, I use Visitor<NodeKind, Node, T>, so only valid names can be given.

> if the "two types of enums" proposal would be used and if I wanted to use that enum for checking for validity by inclusion, then I would be out of luck if it was not defined as an enumset.

Yeah, I don't think we should have two types; if we go with @euclidianAce's proposal then the enum should be upgraded to the behavior he described for enumset.