Open mingodad opened 1 year ago
Well, to be honest, I am not very clear on what benefit there is in having the grammar in this format. (I don't mean to say that in any aggressive way, mind you. I just don't know really.) I would say that, if it is important to you to be able to convert existing Congo grammars to this railroad diagram format, it's probably not very hard to do. The way I would do it is just to use a visitor pattern.
There is a perhaps useful example of this, but it is not in this project (or well... it actually is the same project! :-)) here: https://github.com/javacc21/javacc21/blob/master/src/java/com/javacc/output/congo/SyntaxConverter.java
Well, there may be a bit of confusion here. You see, CongoCC is a rebranding of JavaCC 21. JavaCC 21 was work on that old JavaCC thing from Sun, eventually a complete rewrite, but one difference between Congo and JavaCC 21 is that JavaCC 21 still supported the legacy JavaCC syntax, whereas Congo doesn't support it any more. So I wrote a syntax converter, which has some glitches but basically works, to convert existing grammars to the newer streamlined syntax. But you can see why this syntax converter is part of JavaCC 21, but not Congo. The Congo parser doesn't know anything about the legacy syntax! (Oh, and by the way, you could also infer (correctly) from all this that CongoCC is really a much more mature tool/codebase than the age (or number of commits) on GitHub might suggest.)
But anyway, in principle, you could use the same basic structure to convert to this railroad diagram syntax. I would guess that it's a mini-project of rather moderate scale.
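To make the visitor idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the node classes (`Terminal`, `NonTerminal`, `Sequence`, `Choice`, `ZeroOrMore`, `ZeroOrOne`, `Production`) are hypothetical stand-ins invented for this example, not CongoCC's actual tree classes, and the output aims at the W3C-style EBNF that https://www.bottlecaps.de/rr/ui accepts. Anything with no EBNF counterpart (Java code blocks, predicates) would simply have no node here and would be dropped.

```python
from dataclasses import dataclass


# Hypothetical grammar nodes for illustration only; NOT CongoCC's real classes.
@dataclass
class Terminal:
    text: str


@dataclass
class NonTerminal:
    name: str


@dataclass
class Sequence:
    items: list


@dataclass
class Choice:
    options: list


@dataclass
class ZeroOrMore:
    body: object


@dataclass
class ZeroOrOne:
    body: object


@dataclass
class Production:
    name: str
    body: object


class EbnfVisitor:
    """Walks the tree and returns EBNF text, dispatching on the node's class name."""

    def visit(self, node):
        return getattr(self, "visit_" + type(node).__name__)(node)

    def visit_Terminal(self, node):
        # Use double quotes when the literal itself contains a single quote.
        quote = '"' if "'" in node.text else "'"
        return quote + node.text + quote

    def visit_NonTerminal(self, node):
        return node.name

    def visit_Sequence(self, node):
        return " ".join(self.visit(item) for item in node.items)

    def visit_Choice(self, node):
        return "( " + " | ".join(self.visit(o) for o in node.options) + " )"

    def visit_ZeroOrMore(self, node):
        return self._group(node.body) + "*"

    def visit_ZeroOrOne(self, node):
        return self._group(node.body) + "?"

    def visit_Production(self, node):
        return node.name + " ::= " + self.visit(node.body)

    def _group(self, body):
        # Single symbols and choices (already parenthesized) need no extra parentheses.
        text = self.visit(body)
        if isinstance(body, (Terminal, NonTerminal, Choice)):
            return text
        return "( " + text + " )"


rule = Production("ArgList", Sequence([
    NonTerminal("Expression"),
    ZeroOrMore(Sequence([Terminal(","), NonTerminal("Expression")])),
]))
ebnf = EbnfVisitor().visit(rule)
print(ebnf)  # ArgList ::= Expression ( ',' Expression )*
```

A real converter would presumably walk the tree that the Congo parser builds for a grammar file instead of hand-built objects like `rule`.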
I guess, if you wanted to write such a converter, I would have no problem rolling it into the tool in case some people want that. I guess it could just be invoked something like:
java -jar congocc-full.jar railroad-diagram <grammar-filename>
And it could output the grammar in that syntax. So, that's something to consider.
In terms of things a bit along these lines, what I think could add some real value now would be to have some tooling, like maybe in Eclipse or Intellij, just typical stuff like syntax highlighting and point/click navigation, like control-click on a non-terminal and jump to where it is defined, that kind of thing. The thing is that that kind of GUI programming is not my specialty and I'm just too occupied with the core tool anyway. Well, I just throw that out there, just in case...
I'll close this message here.
Thanks for the reply! For me the railroad diagram is a tool to show the whole grammar for documentation and debugging purposes in an automated way.
It's not easy to have that global vision with the extra attributes and embedded code.
For example, it can show potential users the whole grammar accepted by this tool.
Cheers!
Actually, I wrote my last response before really exploring it that much. I have to admit that I hadn't followed your full instructions of pasting the converted grammar into the web interface and seeing the graphical representation. So, I actually understand the thing better than when I responded. At least I understand the motivation. Before that, I didn't really understand the value of converting the grammar to the other format. I guess I answered too quickly, though my comments about how to implement this, if you really want it, still stand.
Though, one problem with it is that one is throwing away information. Possibly, anyway. If there is a predicate expressed at some juncture in Java code, it isn't reflected in the generated diagram. Nor are contextual predicates. By the way, do you know of any other parser generator that has that feature? (Granted, if you use contextual predicates, you no longer have a context-free grammar!) But again, generally speaking, the generated diagram does not embody the full information to reproduce the working parser. (But if that is not its purpose....)
The graphical representation is nice. I actually do see some use for it in my own documentation efforts. So, if I sounded a bit dismissive in the previous message, I am somewhat more interested now.
I don't know if you ever used the old legacy JavaCC. That had (and has) this tool called JJDoc. It generated an HTML page with hyperlinks to show the structure of a grammar. After picking up the JavaCC code to work on, I eventually threw away that JJDoc thing. I didn't feel it was very useful or appealing. Not in that state, without some significant additional work. The idea is okay, of course, but it generated such horrendously ugly pages that were not configurable, and... Actually, you can see what the output of JJDoc looks like here. On that page, just scroll down to where it has the heading Non-Terminals and you see what I'm talking about. I finally threw that JJDoc thing away because it felt like a choice between actually making the thing presentable (which is not my forte!) or just dropping it and maybe possibly re-implementing the thing later in a better way. So I chose the latter.
But these railroad diagrams are much sexier than that.
@mingodad a converter that generates documentation using https://mermaid.js.org/syntax/flowchart.html would be interesting for me.
@revusky thanks for the reply again. The railroad diagram generator also performs some simplifications/optimizations on the grammar that sometimes help tidy up the original grammar.
And you are right, the purpose is not to translate the grammar so as to be able to produce a fully working parser from the EBNF understood by https://www.bottlecaps.de/rr/ui (although if that happens to be feasible, it doesn't hurt).
@stbischof I'm not sure I understood your request/comment. Could you provide a simple grammar manually converted to the Mermaid syntax to show your point?
It's just the idea of switching the way this is handled: not using the given rr diagram library, which expects to get EBNF, but using another chart engine like Mermaid that could render railroad-style diagrams as flowcharts.
And then generating the charts using a converter.
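As a rough illustration of what such a converter might emit, here is a small Python sketch that turns one rule into Mermaid flowchart text. The input shape (a list of steps, each step a list of alternative labels) and the function name `rule_to_mermaid` are assumptions made up for this example; loops and optional parts are deliberately left out, so this is far from a full railroad-diagram generator.

```python
def rule_to_mermaid(name, steps):
    """Build Mermaid flowchart text for a rule given as a list of steps.

    Each step is a list of alternative labels; every alternative of one step
    is connected to every alternative of the next (hypothetical input shape).
    """
    def label(text):
        # Mermaid entity code for a double quote inside a quoted label.
        return text.replace('"', "#quot;")

    lines = ["flowchart LR", f'    n_start(("{label(name)}"))']
    previous = ["n_start"]
    counter = 0
    for alternatives in steps:
        current = []
        for alt in alternatives:
            node_id = f"n{counter}"
            counter += 1
            lines.append(f'    {node_id}["{label(alt)}"]')
            for source in previous:
                lines.append(f"    {source} --> {node_id}")
            current.append(node_id)
        previous = current
    # "end" is a reserved word as a node id in Mermaid flowcharts, hence n_end.
    lines.append('    n_end(("end"))')
    for source in previous:
        lines.append(f"    {source} --> n_end")
    return "\n".join(lines)


chart = rule_to_mermaid("Greeting", [["hello", "hi"], ["Name"]])
print(chart)
```

The printed text can be pasted into any Mermaid renderer; whether the result is as readable as a true railroad diagram for larger grammars is an open question.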
I just noticed this thread. If I may jump in, as it turns out I've had a notion on my back burner regarding railroad diagrams for some time. In particular, for the COBOL compiler world (and SQL), railroad diagrams are de rigueur. So consequently I have thought that it would be nice to be able to automatically generate documentation in this form (being very lazy) directly from the grammar. Originally I looked at driving a converter off of the JavaCC/JTB source of my grammar, but it was so filled with grammar quirks, artifacts, and semantic lookahead that I decided to defer further attention until either inspiration struck or a miracle happened. It's now clear I was waiting for CongoCC to happen for this and several other things I had mentally filed away.
One of the things I was familiar with back then was the H2 database project. I, in fact, use a portion of it to implement an indexed (key/value) file organization required by COBOL. One of the things that H2 has is a really cool railroad diagram with cross links, coloring, documentation, and code examples. It also can switch to and from a BNF view of the various elements. I wondered how they produced it, and found that they had, as part of H2, a complete capability of converting a csv file of syntax elements into a BNF grammar that could be visited and produce the railroad diagram as styled HTML. That was wonderful (as it is licensed under MPL 1.0), except I didn't have my grammar in csv form, nor could I easily produce it from my then javacc grammar.
However, since you are exploring this, I thought I would provide this in case it is useful (I had also looked at the bottlecaps project, but you already know about that). This is the csv file they use for their SQL documentation. If you look around the project you will find the BNF converter, visitors, and other components (mostly in the doc packages). I would encourage you to check out the output of the process here . If I were doing this for my grammar, I would probably try to figure out how to include the out-of-band information (such as documentary and examples) via something like special comments in the .ccc file (sort of like javadocs) which would include a way to omit and/or rename productions whose nonterminal name is inappropriate (document aliases, effectively). Via coloring and the ability to document in-line, I think the problem with semantic predicates, assertions and such could be mitigated. On the other hand, having so much stuff embedded in the grammar might make it inscrutable for purposes of actual development; I don't know.
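The special-comment idea could be prototyped along the following lines. This is purely a sketch of a hypothetical convention: the `@alias` and `@hidden` tags, the requirement that a `/** ... */` comment sits directly before a `Name :` production, and the function `collect_docs` are all invented for illustration and are not existing CongoCC features.

```python
import re

# A /** ... */ comment immediately followed by "ProductionName :" (assumed layout).
DOC_BEFORE_PRODUCTION = re.compile(
    r"/\*\*(.*?)\*/\s*([A-Za-z_]\w*)\s*:", re.DOTALL
)


def collect_docs(grammar_text):
    """Map each documented production to its alias, hidden flag, and doc text."""
    docs = {}
    for match in DOC_BEFORE_PRODUCTION.finditer(grammar_text):
        comment, production = match.group(1), match.group(2)
        entry = {"alias": production, "hidden": False}
        text_lines = []
        for raw in comment.splitlines():
            line = raw.strip().lstrip("*").strip()
            if line.startswith("@alias "):       # hypothetical tag: rename in docs
                entry["alias"] = line[len("@alias "):].strip()
            elif line == "@hidden":              # hypothetical tag: omit from docs
                entry["hidden"] = True
            elif line:
                text_lines.append(line)
        entry["text"] = " ".join(text_lines)
        docs[production] = entry
    return docs


sample = """
/**
 * A comma-separated list of expressions.
 * @alias argument-list
 */
ArgList : Expression ( "," Expression )* ;

/** @hidden */
InternalHelper : ArgList ;
"""
docs = collect_docs(sample)
print(docs["ArgList"]["alias"], "|", docs["ArgList"]["text"])
```

A regex pass like this is only good enough for experimenting with the convention; a real implementation would more likely read the comments from the tokens the Congo parser already produces.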
Anyway, that is about all I know about this topic. I just thought I would dump it here in case it was useful. If somehow there does come to be a tool for this in some form, I'll definitely try to use it.
Further thought about this leads me to believe that Congo's ability to define abstract nodes gives us the ability to include both structured comments (similar to Javadocs) and actual visitable properties, which are probably crucial for including the information necessary to mechanically produce syntax diagrams that would be useful to an end user. I can see how we could actually get to the point that the grammar could become the documentation for the language (syntactically speaking, of course).
I've set up an experimental service at https://ccc2ebnf.red-dove.com/ for converting CongoCC grammars to EBNF as used by https://bottlecaps.de/rr - the underlying functionality runs successfully through all the main CongoCC example grammars, but the web interface doesn't allow processing of INCLUDE directives (you should be able to add the lexer directives inline). It's in a very early pre-alpha state, so expect a few hiccups, but you are welcome to try it and give feedback at https://github.com/vsajip/ccc2ebnf/issues/new - I look forward to hearing from you!
Actually, I'm also working with a Yacc/Lex compatible online editor/tester here: https://mingodad.github.io/parsertl-playground/playground/
I thought the point was to convert CongoCC source grammars? I don't see CongoCC in your drop-down list - is that because this functionality is not yet available to you? For the purpose mentioned in the original post of this issue (get railroad diagrams from CongoCC grammars) the site I linked to seems to fit the bill.
Would be nice if this parser generator could generate an EBNF compatible with https://www.bottlecaps.de/rr/ui to generate railroad diagrams from the grammars.
There is also an online parser generator https://www.bottlecaps.de/rex/ and an online converter https://www.bottlecaps.de/convert/ for several grammar formats.
I did some work with some parser generators here: https://github.com/mingodad/lalr-parser-test, https://github.com/mingodad/CocoR-Typescript.
For example, a quick and dirty manual transformation of the Lua grammar is shown below:
Copy and paste the EBNF shown below into https://www.bottlecaps.de/rr/ui on the tab Edit Grammar, then click on the tab View Diagram to see/download a navigable railroad diagram.
And here is an unfinished CongoCC EBNF manual transformation: