TranscryptOrg / Transcrypt

Python 3.9 to JavaScript compiler - Lean, fast, open!
https://www.transcrypt.org
Apache License 2.0
2.86k stars 215 forks source link

Compatibility with Parsimonious PEG generator? #396

Closed Michael-F-Ellis closed 6 years ago

Michael-F-Ellis commented 7 years ago

I'd love to be able to use Erik Rose's Parsimonious PEG generator with Transcrypt to do DSL processing in the browser. I've just done a musical notation entry project with it (see https://github.com/Michael-F-Ellis/tbon) and found it effective and easy to use in creating a parser for the notation language.

Parsimonious doesn't seem very dependent on a lot of Python libs so that may be a hopeful sign. I've filed an issue there to try and get some discussion going between the two projects.

(see https://github.com/erikrose/parsimonious/issues/130)

JdeH commented 7 years ago

Hi Michael,

I haven't yet taken the time to look thoroughly at this (will do as soon as I find the time), but reading the discussion at https://github.com/erikrose/parsimonious/issues/130 I thought it would help to remark that Transcrypt actually has a fairly complete re module.

Should anything specific be needed, I'd like to hear. I cannot promise that it will be in right away since I didn't write the re module myself (if I remember well a.o. @RCL-Carl did a lot of work on this), but it probably could be done, either by me or someone else.

Another question: Would it be too restrictive to turn your music notation grammar into an LL1 (each construction starting with a keyword or keysymbol (key in grammar sense, not music key)) grammar and parse it with a simple recursive descent parser (which doesn't need any libs at all)? The nice thing about LL1 grammars (like Pascal's) is that they can be parsed quick and simple and the error reports are allways spot on, since the parser knows exactly what to expect and can pinpoint where things go wrong. Given the grammar, recursive descent parsers are easy to write by hand. Designing an appropriate grammar is the hardest part, but for music notation it looks doable to me.

KR Jacques

Michael-F-Ellis commented 7 years ago

Would it be too restrictive to turn your music notation grammar into an LL1?

Good question to which I don't know the answer. Here's the grammar I'm using with Parsimonious. Are you able to determine by inspection if it's LL1?

   grammar = Grammar(
        """
        melody = (comment / bar)+ ws*
        comment = ws* ~r"/\*.*?\*/"s ws*
        bar = (ws* (meta / beat) ws)+ barline
        meta = key / tempo / relativetempo / velocity / de_emphasis
        key = "K=" keyname
        keyname = ~r"[a-gA-G](@|#)?"
        tempo = "T=" floatnum
        relativetempo = "t=" floatnum
        velocity = "V=" floatnum
        de_emphasis = "D=" floatnum
        floatnum = ~r"\d*\.?\d+"i
        beat = subbeat+
        barline = "|"
        extendable = chord / roll / ornament / pitch / rest
        pitch = octave* alteration? pitchname
        chord = chordstart pitch pitch+ rparen
        chordstart = "("
        rparen = ")"
        roll = rollstart pitch pitch+ rparen
        rollstart = "(:"
        ornament = ornamentstart pitch pitch+ rparen
        ornamentstart = "(~"
        subbeat = extendable / hold
        rest = "_" / "z"
        hold = "-"
        octave = octave_up / octave_down
        alteration = doublesharp / sharp / doubleflat / flat / natural
        doublesharp = "𝄪" / "##"
        sharp = "♯" / "#"
        doubleflat = "𝄫" / "@@"
        flat = "♭" / "@"
        natural = "♮" / "%"
        octave_up = "^"
        octave_down = "/"
        pitchname = ~"[a-g1-7]"i
        ws = ~r"\s*"i
        """
        )
JdeH commented 7 years ago

I may have overlooked something, but I think this grammar is indeed LL1, provided you use a "greedy" scanner to distinguish "(", "(:" and "(~"., prior to parsing. But that isn't any different from distinguishing e.g. "<" from "<=" in a prog language. Using "(", "[" and "{" instead would make scanning trivial, but you may not be willing to change your grammar due to invested notation work?

A LL1 recursive descent parser would turn your musical notation source into a tree that you can then generate code from by traversing it left to right, depth first using the visitor pattern (but you probably already know that).

Michael-F-Ellis commented 7 years ago

Thanks, Jacques!

I want to keep [] and {} in reserve for future use in case I find I've painted myself into a corner with respect to polyphony or lyrics or chord symbols or ... Also, there's a little bit of an aesthetic decision in my choice to use a decorated '(' for the rolls and ornaments. I think it looks better and is, perhaps, a bit more mnemonic for a new user.

With respect to this being an LL1 grammar, does it make any difference that pitch = octave* alteration? pitchname

i.e. a pitch can look like any of these: c, @c, /@c, ^@@c, ///#c, ...

and that any sequence of such variants can be run together undelimited in a beat or chord ?

There's another reason I'd like to keep using Parsimonious. Once I got the hang of it, adding new features has been an amazingly smooth process. It's mostly been

  1. Add to the grammar,
  2. Write test cases,
  3. Add a visitor and debug until the tests all pass.
JdeH commented 6 years ago

I'll close this for now. Should anything specific have to be added to Transcrypt to make Parsimonious work, let me know.

KR Jacques