WebAssembly / component-model

Repository for design and specification of the Component Model
Other
971 stars 81 forks source link

Some questions on the WIT format #154

Closed huwaireb closed 1 year ago

huwaireb commented 1 year ago

Hi,

[1] Comments

is there any reason why comments aren't treated as white-space and are as things stand, included in the AST?

I know that doc-comments are used somewhere, however cant we just get it from the CST then? Or at least exclude comments in the AST.

[2] Operators

operator ::= '=' | ',' | ':' | ';' | '(' | ')' | '{' | '}' | '<' | '>' | '*' | '->'
  1. I don't understand why are these operators grouped together, and why is (), <>, ,, {}, ;, -> operators? The only things that could could qualify as an operator here are assignment operators = and possibly :. What is an operator in this context?

  2. What is *? If we have tuple<>

[3] Reserved Keywords

  1. Is there any reason we are hard-coding builtin types as reserved types in the grammar, shouldn't that be up to a validator?
  2. Why is the keyword as treated as reserved. How its used allows it to be used as an identifier

[4] Types

  1. Is there any reason we are hard-coding builtin types into the grammar? Is there any hard-coded type with a different format than simply being an id? I don't think so? Actually, I don't know if the types with generic parameters are hard-coded yet, Option, Tuple, Record, List. However other than that none? Consider it in the context of an AST, do we need to explicitly state that a type is for example u32, Any benefit of doing so in the AST?

These builtin types are fine placed elsewhere, outside the grammar.

  1. Shouldn't unit being _ be its own type? As it stands we can't define a unit type outside a result.

I was going to open PR, however I thought it would've been appropriate to raise an issue first.

I'm not an expert of any kind in (E)BNF nor do I have experience writing an (E)BNF, so do feel free to correct me if I'm wrong.

Thank you

alexcrichton commented 1 year ago

is there any reason why comments aren't treated as white-space and are as things stand, included in the AST?

To clarify, do you mean the wit-parser crate? If so doc comments are the reason as to why, but that's just an implementation detail of wit-parser and isn't particularly part of the standard here.

I don't understand why are these operators grouped together

The term "operator" may not be correct for these. Perhaps "punctuation"? Either way it's just intended to cover the lexical tokens used by WIT. The * operator is intended to be used with "star imports" in the future.

Is there any reason we are hard-coding builtin types as reserved types in the grammar, shouldn't that be up to a validator?

Nah just it just felt conservative to make everyting a keyword. One worry is that if you do something like record u32 { ... } then there's no way to actually access the u32 type in that module. What you can do, however, is record %u32 { ... } to achieve the same desired effect.

Why is the keyword as treated as reserved

Like above just being conservative and reserving keywords where possible. Always possible to relax keywords later but difficult to add new keyword.

Shouldn't unit being _ be its own type?

Historically it wasn't. Then it was. Then it wasn't again. It's intentional that it's not a type right now.

huwaireb commented 1 year ago

To clarify, do you mean the wit-parser crate? If so doc comments are the reason as to why, but that's just an implementation detail of wit-parser and isn't particularly part of the standard here.

No, in the EBNF, whitespace is declared as the following

whitespace ::= ' ' | '\n' | '\r' | '\t'

That doesn't include comments

The term "operator" may not be correct for these. Perhaps "punctuation"? Either way it's just intended to cover the lexical tokens used by WIT. The * operator is intended to be used with "star imports" in the future.

👍

Nah just it just felt conservative to make everyting a keyword. One worry is that if you do something like record u32 { ... } then there's no way to actually access the u32 type in that module. What you can do, however, is record %u32 { ... } to achieve the same desired effect.

A validator, in for example wit-component (yes i know its in wit-parser now), should deal with builtin types as according to the spec.

Why do we have these builtin-types specified in the grammar? Of course other than what I noted.

I understand concerns in-regards to shadowing of builtin-types, though these should not be handled by reserving them in the grammar, and as I mentioned leave it as to the validator.

Like above just being conservative and reserving keywords where possible. Always possible to relax keywords later but difficult to add new keyword.

Yeah, makes sense.

Historically it wasn't. Then it was. Then it wasn't again. It's intentional that it's not a type right now.

I would love to know why it isn't a type now, is there a justification for going as far as to hard-code a type into a single HKT?

alexcrichton commented 1 year ago

The whitespace vs comments is probably just a mistake somewhere, feel free to send a PR!

As for types being builtin vs in grammar vs validation, what's here is just something I thought of originally. So long as everything works it's fine and it should be ok to shuffle things around as necessary.

ribosomerocker commented 1 year ago

I personally agree about unit needing to be given it's own type, it's simple enough and for the cases it makes everything neater, it's nice to use.

huwaireb commented 1 year ago

In-regards to the Unit type, for those interested, this should be undertaken in another PR. with #156 most questions are addressed and the problems resolved, except one

operators, I don't know how to go on from here on that end so i'll just close this PR alongside #156 .