inkytonik / sbt-rats

sbt-rats provides a plugin that enables the Rats! parser generator to be used in Scala projects.

Rules should support generic types #6

Open inkytonik opened 4 years ago

inkytonik commented 4 years ago

Nate Nystrom reported:

A rule such as the following produces a syntax error trying to parse the type:

Program : List[Cmd] = Cmd ++ ";".

It would be nice if sbt-rats were extended to allow generic types, particularly List and Maybe.

inkytonik commented 4 years ago

Thanks for the report, Nate. Can you clarify what the issue is? The current behaviour is to wrap things: a repetition becomes a collection field and an optional construct becomes an option field inside the node for the non-terminal. I.e., if you do

Program = Cmd ++ ";".

you'll get something like

case class Program(cmds : List[Cmd]) extends ASTNode

assuming that you've selected lists for the repetition type.

Is the wrapping the issue?

Note that wrapping is needed in cases where there are multiple alternatives since you need to be able to distinguish between them, so I think an unwrapped approach could only apply to non-terminals with one alternative.
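To illustrate the point about multiple alternatives, here is a minimal Scala sketch. The `Body` non-terminal, its `Sequential` and `Parallel` alternatives, and the `describe` helper are all hypothetical names invented for illustration; they are not actual sbt-rats output. Both alternatives carry a `List[Cmd]`, so without a wrapper node per alternative there would be nothing to tell them apart.

```scala
object AlternativesSketch {
  // Hypothetical node types, assumed for this example only.
  sealed trait ASTNode
  case class Cmd(name: String) extends ASTNode

  // One wrapper case class per alternative of a hypothetical `Body` non-terminal.
  sealed trait Body extends ASTNode
  case class Sequential(cmds: List[Cmd]) extends Body
  case class Parallel(cmds: List[Cmd]) extends Body

  // The wrappers let later processing distinguish the alternatives,
  // even though both hold the same underlying list type.
  def describe(body: Body): String = body match {
    case Sequential(cs) => s"sequential block of ${cs.length} command(s)"
    case Parallel(cs)   => s"parallel block of ${cs.length} command(s)"
  }
}
```

If both alternatives were typed as a bare `List[Cmd]`, `describe` would have no way to recover which alternative was parsed.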

It might be possible to have lists etc as the types of a non-terminal, but as you note, it would come at the cost of having to add support for generic types in the syntax definition notation. Currently we get by just with simple type names.

A new approach would also have to be generalised to cases where there are multiple elements on the RHS, such as:

Program = (Decl ** ";") (Cmd ++ ";").

Currently this just makes two fields in the Program case class, but if we try to do it the other way, we'd need a tuple of lists. This approach would be possible, but the wrapping approach seems simpler and is consistent with the multiple alternative case.
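The difference between the two representations can be sketched in Scala as follows. The field names `optDecls` and `cmds`, the `ProgramTuple` alias, and the contents of `Decl` and `Cmd` are assumptions made for illustration rather than the exact code sbt-rats generates.

```scala
object TupleVersusFields {
  // Hypothetical element types, assumed for this example only.
  case class Decl(name: String)
  case class Cmd(name: String)

  // Wrapping approach: one case class with a named field per RHS element.
  case class Program(optDecls: List[Decl], cmds: List[Cmd])

  // Unwrapped approach: the non-terminal's type would be a tuple of lists.
  type ProgramTuple = (List[Decl], List[Cmd])

  val wrapped: Program = Program(List(Decl("x")), List(Cmd("print")))
  val unwrapped: ProgramTuple = (List(Decl("x")), List(Cmd("print")))

  // Named field access versus positional access.
  val wrappedCmdCount: Int = wrapped.cmds.length
  val unwrappedCmdCount: Int = unwrapped._2.length
}
```

Both forms hold the same data; the tuple form relies on position (`_1`, `_2`) where the case class offers field names.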

Also, in my experience for later processing you actually want a designated node for concepts such as "Program" rather than represent them as a list directly. It is easier to write safe attribution, for example, if Program is a designated type.
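One possible reading of this point, sketched in plain Scala without any attribution library: code that dispatches on node type can recognise a designated `Program` node directly, whereas a bare list loses its element type at runtime. The `kind` function and the type definitions here are hypothetical and only meant as an illustration.

```scala
object DesignatedNodeSketch {
  // Hypothetical types, assumed for this example only.
  case class Decl(name: String)
  case class Cmd(name: String)
  case class Program(cmds: List[Cmd])

  // Dispatch on the runtime type of a node.
  def kind(node: Any): String = node match {
    case Program(_) => "program"
    // Element types are erased on the JVM, so a List[Cmd] and a
    // List[Decl] cannot be told apart by a type test on the list itself.
    case _: List[_] => "some list"
    case _          => "other"
  }
}
```

With this sketch, `kind(Program(Nil))` identifies the program, while lists of commands and lists of declarations both fall into the same case.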

inkytonik commented 4 years ago

Nate says:

The Program/Cmd example just happened to be the first production in my code that had a list, and I would actually wrap the list in this case as you suggest.

But, yes, the wrapping is the main issue. Modifying your example somewhat, I'd like to write:

Program = Decls Cmds.
Decls : List[Decl] = Decl ** ";".
Cmds : List[Cmd] = Cmd ++ ";".

And get the AST:

case class Program(decls: List[Decl], cmds: List[Cmd]) extends ASTNode

rather than having to write:

Program = Decls Cmds.
Decls = Decl ** ";".
Cmds = Cmd ++ ";".

with the resulting AST:

case class Program (decls : Decls, cmds : Cmds) extends ASTNode
case class Decls (optDecls : List[Decl]) extends ASTNode
case class Cmds (cmds : List[Cmd]) extends ASTNode

I know I can avoid the wrapping by inlining the ++ and ** terms into the Program rule, but sometimes it's clearer or more convenient to have separate productions for the lists.
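To make the practical difference concrete, here is a small Scala sketch that places the two AST shapes from this comment side by side. The `Wrapped` and `Unwrapped` objects, the bodies of `Decl` and `Cmd`, and the sample values are assumptions for illustration; only the `Program`, `Decls` and `Cmds` class shapes come from the comment above.

```scala
object WrappingSketch {
  // Hypothetical base and element types, assumed for this example only.
  trait ASTNode
  case class Decl(name: String) extends ASTNode
  case class Cmd(name: String) extends ASTNode

  // Shape currently produced when the lists have their own productions.
  object Wrapped {
    case class Program(decls: Decls, cmds: Cmds) extends ASTNode
    case class Decls(optDecls: List[Decl]) extends ASTNode
    case class Cmds(cmds: List[Cmd]) extends ASTNode
  }

  // Shape requested in this issue.
  object Unwrapped {
    case class Program(decls: List[Decl], cmds: List[Cmd]) extends ASTNode
  }

  val ds: List[Decl] = List(Decl("x"))
  val cs: List[Cmd] = List(Cmd("print"), Cmd("halt"))

  val wrapped = Wrapped.Program(Wrapped.Decls(ds), Wrapped.Cmds(cs))
  val unwrapped = Unwrapped.Program(ds, cs)

  // The wrapped shape needs an extra hop through the wrapper node.
  val wrappedNames: List[String] = wrapped.cmds.cmds.map(_.name)
  val unwrappedNames: List[String] = unwrapped.cmds.map(_.name)
}
```

Both access paths yield the same command names; the wrapped shape just adds one more level of indirection for each list.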

inkytonik commented 4 years ago

OK, thanks. That's clear to me now. I would like to avoid wrapping in some cases, but I haven't come up with a general solution that doesn't make some other things worse. Will think further...