Closed montymxb closed 1 year ago
Couldn't agree with you more, as a new user, it took me an hour to figure this out. But I still don't quite get the difference between types and rules.
@luan-xiaokun thanks for chiming in! It's of particular interest to make sure that new users are able to pick up Langium quite easily 🙂 . We will also be updating our documentation to hopefully make this clearer going forward as well.
If you haven't read it yet, we do have an entry on Semantic Model Inference that goes more into how Types can declared, or inferred from rules. That may help clear up the difference between the two.
@montymxb I don't get the problem. Isn't the type of ID
in your example above exactly string
?
@spoenemann the type of ID
is string in this case. The problem is we useSomeDataType
as a new type, and it's not the same as string
(even though internally this is a type alias to string). This effectively creates a subtype, but the wording here seems to suggests it's just a string
, which does not reflect this relationship.
Currently, the inferred type system will resolve DataTypes as they are, not the type that they're based on. So although it is technically a string, it has it's own distinct subtype.
It's also confounded by the fact the keyword returns
has a different meaning on where we use it (Parser rules vs. DataType rules). It makes sense for the former case, because we're returning a value of that type, but in the 2nd case it's not the same.
Ok so you mean something like this?
// Declared
type SomeDataType = 'a' | 'b' | 'c'
SomeDataType returns SomeDataType: 'a' | 'b' | 'c';
// Inferred
SomeDataType returns string: 'a' | 'b' | 'c';
The declared variant looks good to me. The inferred variant is indeed a bit different from the other usages of returns
. But it's not wrong: the value returned by the data type rule is a string
. The inferrer only creates a subtype of string
when the value is an alternative of string literals. This does not apply to this declaration, for example:
SomeDataType returns string: 'a' | 'b' | ID;
Of course the relationship between grammar declaration and generated TypeScript declarations is more hidden if you use inference. But that's why we introduced type declarations in the grammar. Those who prefer things to be explicit can simply choose the variant at the beginning of my post.
@montymxb Thank you for your advice, the document is very helpful to me, I'm using langium to implement a vscode extension for a modeling language of our own design
Following up, we had some internal discussions last week about this. We haven't 100% decided this, but we're leaning towards keeping the returns
syntax for DataTypes, but making sure the DataType correctly returns the declared type (removing the notion of a sub-type here). This would remove the ambiguity reported here about what is actually returned, and keeps some other parts of Langium working nicely as well.
We release the current syntax as v1.0; we should keep it unless we find good reasons to change it.
While discussing some details yesterday over another project, it came up that the form of writing a DataType rule could mislead new users into thinking it returns a primitive type, rather than the DataType itself. I believe this goes against the notion of having a low barrier to entry for newcomers, as it could confuse and frustrate them.
For example, the following produces a value of the type
SomeDataType
, but is in itself a subtype of string. To make this work, we write 'returns' like so.For somebody unfamiliar with Langium, they may come to believe that SomeDataType actually returns string, given the literal wording. This is further reinforced by our consistent usage of 'returns' elsewhere for parser rules & terminals, where this is always the case.
My suggestion is that we at least consider whether we should:
returns type
, and being able to infer the super type from utilized keywords, terminal rules, and data type rules