Issues with presentation grammar in `README.md`

david-bakin commented 3 years ago

I'm new to aa - starting by reading the README.md. I have the following issues/questions with the grammar described there:

Error in term = tfact bop- stmts bop- stmt - that first bop- should be bop+?
Possible error in term = tfact bop- stmts bop- (prior line): wouldn't that admit an expr (therefore stmt) where the last symbols are [:]= (per definition of bop- = ] ]= ]:=)? bop- should be factored into a sole ] and a bop-compound = ]= ]:= or actually better simply make term = tfact bop- stmts bop- [:]= stmt as the rule for arity-3 balanced/split operator.
For readability in tfun = {[[type]* ->]? type } introduce a space after the { (this grammar notation is already a bit confusing - though nicely terse! - w.r.t. terminal symbols, especially [ and ] which are significant tokens in both the grammar and in the target language). Similarly for the ( in the ttuple rule and the } in the tstruct rule.
Style suggestion? I'm uncomfortable with seeing ifex = apply [? stmt [: stmt]]. It seems to "promote" the "trinary logic" expression to a level of importance even "higher" than apply. (Not sure quite how to say this, but I'm thinking in terms of the way a user might read the grammar, rather than how an implementer would want it to produce a certain parse tree.) My suggestion would be to factor that rule into 3 rules (similarly to the way the balanced/split operator is factored), e.g.:
```
ifex = apply
ifex = apply ? stmt
ifex = apply ? stmt : stmt
```
(And if you do that perhaps there's a better name than ifex which IMO enhances the "importance" of the "trinary logic" expression in the eyes of the reader even more than the grammar rules ...)
Style approval: I especially like expr = term [binop term]* with the comment "gather all the binops and sort by prec" - exactly! (And same for unary ops.) A shame C didn't do this and C++ followed - embedding binary operator precedence (and in some cases associativity) in the presentation grammar is a mistake IMO: it just massively complicates readability of the presentation grammar, and if you have a lot of built-in operators (e.g., C, C++) you need a table for the reader anyway. And as long as you have precedence (and associativity) in a separate table, use that table as the definition, not the grammar. (I'm making a distinction between the "presentation" or "reference" grammar and whatever grammar the tooling uses internally, which can be different as long as ultimately the tooling accepts exactly the same language.)
By the way, speaking of associativity, looking ahead to the examples for short circuit operators you have a note about relative precedence of && and ||. In my long experience doing code reviews and troubleshooting production problems on code that previously (may have) gone through code review: unparenthesized expressions involving a mix of && and || frequently do not do what the programmer thought and most often aren't exhaustively tested in a way that would reveal their flaws, leaving it to the sad guy trying to solve a ticket. I would strongly suggest that the precedence rules place && and || on the same precedence level and make mixes of those operators have no associativity! Thereby forcing the programmer to either split the mixed expression (probably a good idea) or at minimum add parentheses. (I'm not convinced by arguments that suggest that "sum of products" is a well understood convention, or anything of that sort.) My opinion, FWIW, but I consider it important in a language aimed at systems programming and low-level performance code.
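
To make the last two points concrete, here is a minimal sketch in Python (not aa's implementation) of a parser that reads `expr = term [binop term]*` and resolves precedence from a separate table, with && and || on one level and no associativity between them. The `PREC` table, the `NO_MIX` set, and the `parse` function are all hypothetical names invented for this illustration; the real aa precedence table is not shown in the README.

```python
# Hypothetical precedence table; higher number binds tighter.
# && and || deliberately share one level.
PREC = {"||": 1, "&&": 1, "==": 2, "+": 3, "*": 4}
NO_MIX = {"&&", "||"}  # may not be mixed without parentheses


def parse(tokens):
    """Parse a token list into nested tuples; reject unparenthesized && / || mixes."""
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = expr(0)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise SyntaxError("expected ')'")
            pos += 1
            return ("paren", node)
        if tok in PREC or tok == ")":
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok

    def expr(min_prec):
        nonlocal pos
        left = term()
        seen = None  # which of && / || this expression has already used
        while (pos < len(tokens) and tokens[pos] in PREC
               and PREC[tokens[pos]] >= min_prec):
            op = tokens[pos]
            if op in NO_MIX:
                if seen is not None and seen != op:
                    raise SyntaxError("mixing && and || requires parentheses")
                seen = op
            pos += 1
            # Parse the right operand at a tighter level: left-associative.
            right = expr(PREC[op] + 1)
            left = (op, left, right)
        return left

    tree = expr(0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree
```

With this sketch, `parse(["a", "&&", "b", "||", "c"])` raises `SyntaxError`, while `parse(["(", "a", "&&", "b", ")", "||", "c"])` is accepted. Changing precedence means editing the table only; the grammar rule stays the same.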

cliffclick commented 3 years ago

> Error in term = tfact bop- stmts bop- stmt - that first bop- should be bop+?

Yes. Fixed.

> Possible error in...

I don't think so. A bop- is NOT a : - no leading : allowed. A bop- can be a ] or ]= or ]:=.

> For readability in tfun = {[[type]* ->]? type } ...

Done!

> Style suggestion? I'm uncomfortable with seeing...

Done!

> Style approval:...

Thanks! And yes, literally there is a table in the implementation (not presented here, but probably should be). Mostly C precedence anyways, and not enough operators implemented to bother beyond that.

> By the way, speaking of associativity, ...

I, too, have seen this before. I don't know what the right answer is either, and I kinda like the requirement for parens - maybe even a hard requirement - syntax error if ever mixing && and || in an expression.

Results pushed to master, let me know if they help! Cliff

david-bakin commented 3 years ago

Commenting again (*) to reconsider the last point:

term = tfact bop+ stmts bop-
...
bop- = ] ]= ]:=

So foo[a]= is a term (as is foo[a]:=) thus it is a valid expr ⇒ apply ⇒ ifex ⇒ stmt ⇒ stmts ⇒ prog, and thus the following is a valid program, according to the README's grammar:

foo[a]=

(*) guess you can't reopen an issue by adding a comment ...

cliffclick commented 3 years ago

Gotcha. Still not sure if this is a grammar bug or not; foo [ a ]= will parse as you say, except that _ [ _ ]= _ is a 3-argument call with only 2 arguments. The parser will kick it out for missing the last argument... which is the same error you'd get if I changed the grammar. Easy enough to change the grammar, but I think there's no code change here.

david-bakin commented 3 years ago

Well, the issue for me as I'm going through README.md is - at this stage (and perhaps forever, don't know what you have in mind) the language definition for newcomers to it or old hands alike is awfully reminiscent of the J language spec/tutorial booklets written by Iverson and Hu (and I guess others). Language documentation is terse to the extreme because the technique is to demonstrate/teach/specify the language by example with minimal use of human language (i.e., English) to clue in the reader. I kind of like it as it puts a premium on the reader actually studying/learning something by working out the examples (IOW, he's got to have some skin in the game, it's not just spoon-fed to him). But it does mean that what's there needs to be correct and familiar - in this case the reader, seeing a detailed grammar, assumes that it really describes the syntax tightly (the way it does in other languages), otherwise with the terseness and lack of explanation the reader is confused/frustrated/misled. (By "reader" I mean "me" but of course there will be other readers too, so I'm allowing myself the generalization ...)

In this case I could suggest:

bop+ = [
bop- = ]
aop  = [:]=
# then:
term = tfact bop+ stmts bop-
term = tfact bop+ stmts bop- aop stmt

(And again, it doesn't matter whether the actual parser uses these rules as factored this way or not, so, yes, probably no code change (as long as the error message is reasonable).)
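
The difference between the two grammars can be checked with a small recognizer. This is a sketch in Python under simplifying assumptions, not the aa parser: `tfact`, `stmts`, and `stmt` are each reduced to a single identifier, `bop+` is assumed to be `[` in both grammars, and the function names `term_as_written` and `term_factored` are made up for the illustration.

```python
def term_as_written(toks):
    """README rules: term = tfact bop+ stmts bop- [stmt], bop- = ] ]= ]:="""
    if len(toks) not in (4, 5):
        return False
    head_ok = toks[0].isidentifier() and toks[1] == "[" and toks[2].isidentifier()
    close_ok = toks[3] in ("]", "]=", "]:=")
    # The trailing stmt is optional because the arity-2 rule shares bop-.
    tail_ok = len(toks) == 4 or toks[4].isidentifier()
    return head_ok and close_ok and tail_ok


def term_factored(toks):
    """Suggested rules: term = tfact bop+ stmts bop- [aop stmt], bop- = ], aop = [:]="""
    if len(toks) not in (4, 6):
        return False
    head_ok = (toks[0].isidentifier() and toks[1] == "["
               and toks[2].isidentifier() and toks[3] == "]")
    if len(toks) == 4:
        return head_ok
    # An assignment operator must be followed by its right-hand side.
    return head_ok and toks[4] in ("=", ":=") and toks[5].isidentifier()
```

Under these assumptions `term_as_written(["foo", "[", "a", "]="])` returns `True`, reproducing the dangling `foo[a]=` program, while `term_factored(["foo", "[", "a", "]="])` returns `False` and `term_factored(["foo", "[", "a", "]", "=", "b"])` returns `True`. Note that the factored version tokenizes `]` and `=` separately, which the actual lexer may or may not do.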

Speaking of understanding the language documentation: I have a slew of comments/typos that hurt readability/questions on the many examples presented after the grammar. What's the best option for bringing them to your attention? "Issues" seem like a poor fit, yet it's all that you seem to be supporting at this time. Do you mind issues like this one that ask questions about the docs and suggest improvements to them? (I'm very hesitant to submit an actual pull request for docs as I don't know what you really have in mind and I prefer to err on the side of thinking I'm not understanding rather than you're wrong ...)

cliffclick commented 3 years ago

Well, the docs are what they are due to the usual time constraints. They started out as some quick notes to myself, then slightly more extended notes to myself, then some examples (of what ought to work, or might work). I was busy implementing grammar, typing, semantics, etc... then about a year ago I took a hard left turn into the Type Theory Pit of Despair which I am just now climbing out of. Haven't touched the docs in a year. So comments are welcome and expected, I don't really care how they are delivered. Notes in this issue are fine, so is a pull.

david-bakin commented 3 years ago

ok, will come back tomorrow. obvs I understand about "docs are what they are" but at this point this is the only thing I can contribute - if you consider it a contribution to cause work for you ... more substantial efforts later I'm hoping ...

cliffclick commented 3 years ago

Yes, docs are a contribution! One less thing to think about. Bring 'em on.

cliffclick / aa

Issues with presentation grammar in `README.md` #15