Open lucaswiman opened 7 years ago
Hi, Lucas. Thanks for getting in touch! RR diagrams are a neat idea, and I think they could be a selling point of the package, so let's see what happens if we do it internally. As long as we have good test coverage and don't double the size of the codebase, I'm pretty happy.
I took a quick glance at syntrax and see that it has a GTK dependency for which there's no proper Python package, so that could be one thing to look into. I haven't done much GTK and so don't know how much of a problem that is.
A Sphinx plugin would be sweet.
OK, thanks! I'll start looking into this more, and I'll submit a PR if/when I make progress.
syntrax makes absolutely lovely diagrams, and is a big pain to install.
sqlite has very nice looking diagrams, and uses tcl. See here for information on the script, and here for evaluating tcl source directly in python using tkinter. This is likely a good option, since tkinter is part of the standard library, and sqlite has a very permissive license. Working POC here. Note this requires ghostscript & ImageMagick.
@erikrose I'm making a fair amount of progress with this. One thing I'd like to do is make the iteration order of Grammar.items() the same as the ordering the rules appear in the original grammar. My current solution just re-parses the grammar and visits the rules to get the ordering, but it'd be nicer to make Grammar inherit from OrderedDict. Unfortunately, that would either:
ordereddict
Implementing (2) isn't horribly difficult, but would you be OK with dropping python 2.6 compatibility? Its EOL was more than 3 years ago, and many major packages like distutils and django have dropped support for it.
Sorry for the delay. Of the 3 libs you found (if you want an opinion), I lean toward railroad-diagrams: easy installation, no external binaries to require, SVG output for the at-the-moment-dominant web platform, and (to me) pretty output.
Given that the railroad-diagrams code itself is 1000 lines, and that's before anything you add, I'm reconsidering whether we should roll this into Parsimonious proper, which is only 1456 lines altogether. Shall we shoot for external and just make sure Parsimonious has nice, stable interfaces for it to hook up to? I'm still very interested in showing it off on Parsimonious's docs and using it as a selling point, but this way we have the best of both worlds: lightness for those who want it and power for those who want it.
I'm up for dropping 2.6 support and backing Grammar with an ordereddict. It's 2017, after all.
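For anyone following along, here is a minimal sketch of what "backing the grammar with an ordered dict" buys: iteration follows definition order without re-parsing. `OrderedRules` is a hypothetical stand-in, not Parsimonious's `Grammar`, and it assumes one `name = expression` rule per line.

```python
import re
from collections import OrderedDict


class OrderedRules(OrderedDict):
    """Hypothetical stand-in: map rule names to right-hand sides in definition order."""

    # Assumption: each rule sits on its own line, written as `name = expression`.
    _RULE = re.compile(r'^\s*(\w+)\s*=\s*(.+?)\s*$')

    def __init__(self, text):
        super().__init__()
        for line in text.splitlines():
            match = self._RULE.match(line)
            if match:
                # Insertion order is preserved, so items() follows the source text.
                self[match.group(1)] = match.group(2)


rules = OrderedRules('''
    greeting = hello name
    hello = "hi" / "hello"
    name = ~"[a-z]+"
''')
rule_names = list(rules)
```

Here `rule_names` comes out in the order the rules were written, which is the property the diagram generator needs.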
I've submitted a PR to railroad-diagrams adding a setup.py file and fixing python 3 compatibility: https://github.com/tabatkins/railroad-diagrams/pull/44 The author was pretty receptive to a previous PR I submitted, so hopefully that'll go smoothly.
> Shall we shoot for external and just make sure Parsimonious has nice, stable interfaces for it to hook up to?
The interfaces of both railroad-diagrams and syntrax are pretty similar to each other, and to parsimonious.expressions (some of the classes even have the same names). So an "interface" would consist of a mapping of diagram elements to parsimonious expressions, with a bit of glue code and special-casing. Putting that into the calling code could be pretty ugly.
It's actually not that much code, so having railroad-diagrams as an optional dependency for the module seems reasonable. I think it might make sense to support both railroad-diagrams (which has easy installation, but makes less appealing diagrams) and syntrax (which has a byzantine installation process, but makes top-notch diagrams).
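To make the "mapping plus glue code" idea concrete, here is a rough sketch. Every class in it is a simplified stand-in defined on the spot: the expression side only borrows names from parsimonious.expressions, and the diagram side (`Terminal`, `Track`, `Choice`, `Loop`) is hypothetical and not the API of either diagram library.

```python
from collections import namedtuple

# Simplified stand-ins for expression nodes (names borrowed, behavior not).
Literal = namedtuple('Literal', 'text')
Sequence = namedtuple('Sequence', 'members')
OneOf = namedtuple('OneOf', 'members')
ZeroOrMore = namedtuple('ZeroOrMore', 'member')

# Hypothetical diagram elements, not taken from any real library.
Terminal = namedtuple('Terminal', 'text')
Track = namedtuple('Track', 'items')      # items laid out left to right
Choice = namedtuple('Choice', 'options')  # parallel branches
Loop = namedtuple('Loop', 'item')         # item that may repeat or be skipped


def to_diagram(expr):
    """Recursively translate an expression tree into diagram elements."""
    if isinstance(expr, Literal):
        return Terminal(expr.text)
    if isinstance(expr, Sequence):
        return Track(tuple(to_diagram(m) for m in expr.members))
    if isinstance(expr, OneOf):
        return Choice(tuple(to_diagram(m) for m in expr.members))
    if isinstance(expr, ZeroOrMore):
        return Loop(to_diagram(expr.member))
    # Anything else is where the special-casing would go.
    raise TypeError('no diagram element for %r' % (expr,))


diagram = to_diagram(Sequence((Literal('a'), ZeroOrMore(Literal('b')))))
```

The point of the sketch is only that the translation is a small recursive dispatch, which is why it is awkward to leave to calling code.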
The API I'm thinking of is roughly the following:
from collections import OrderedDict
from typing import Sequence

from parsimonious import Grammar


def convert_grammar_to_diagram(grammar: Grammar,
                               collapsible: Sequence[str] = (),
                               engine: str = 'railroad_diagrams') -> 'OrderedDict[str, bytes]':
    """
    Return an OrderedDict mapping rule names to bytes objects containing a
    diagrammatic representation of the rule.
    Args:
        grammar: ...
        collapsible:
            A collection of rule names where references can be collapsed.
            This can be useful for hierarchical grammars, where the diagrams
            of most rules are just a straight line or disjunction. Including
            the diagram of the referent rather than the reference can show
            larger structural elements, or eliminate rule names which are only
            included so they can be visited.
        engine:
            The rendering engine to use for the images. Either "railroad_diagrams"
            to generate a simple, portable SVG representation of the diagram, or
            "syntrax" to generate a high-quality PNG laid out by the cairo layout
            engine. See the respective packages for installation instructions.
    """
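As a rough illustration of what `collapsible` could mean, the sketch below inlines references to the named rules. It assumes a toy representation (each rule is a list of tokens, and a token equal to a rule name is a reference); `collapse_references` is a hypothetical helper, not part of the proposed API, and it keeps the collapsed rules in the output rather than dropping them.

```python
from collections import OrderedDict


def collapse_references(rules, collapsible=()):
    """Replace references to `collapsible` rules with the referenced tokens."""

    def expand(tokens, seen):
        out = []
        for token in tokens:
            # Inline only known, collapsible rules; `seen` guards against cycles.
            if token in collapsible and token in rules and token not in seen:
                out.extend(expand(rules[token], seen | {token}))
            else:
                out.append(token)
        return out

    return OrderedDict(
        (name, expand(tokens, frozenset([name])))
        for name, tokens in rules.items()
    )


rules = OrderedDict([
    ('expr', ['term', '"+"', 'term']),
    ('term', ['number']),
    ('number', ['DIGITS']),
])
collapsed = collapse_references(rules, collapsible=('term',))
```

With `term` collapsed, the `expr` rule shows `number` directly on either side of the `"+"`, so its diagram would expose one more level of structure.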
> I think it might make sense to support both railroad-diagrams (which has easy installation, but makes less appealing diagrams) and syntrax (which has a byzantine installation process, but makes top-notch diagrams).
Having looked into how difficult it is to get this running in a virtualenv even when installed on the system python, I have thought better of including support for syntrax.
Sounds good to me. :-) Thanks for the update!
I saw a genomics library called hgvs which has really awesome documentation for its grammar using "railroad diagrams": http://hgvs.readthedocs.io/en/0.4.x/grammar.html (similar to the grammar diagrams in SQLite's documentation). They describe the mechanism for generating theirs as a "fragile hack", but it looks like there's pretty good library support now in the syntrax library.
It would be really nice to have a way of generating diagrams like this given a parsimonious grammar. I'd be interested in implementing integration with syntrax, and possibly a sphinx plugin.
@erikrose: Do you think that makes more sense as a submodule of parsimonious (with an optional requirement of syntrax), or as a separate library? Keeping it inside parsimonious means that it would stay compatible with changes, but be additional maintenance burden. If you'd prefer it as a separate library, please feel free to close this issue.