ncalexan opened this issue 7 years ago
@rnewman @grigoryk worth considering as we build our libraries.
So, it appears that nom doesn't do a great job with this: see https://github.com/Geal/nom/issues/266, which is harder than our problem, but similar. I think nom could be bent to our purposes, but it's clear that bytes are the first-class thing and arbitrary tokens second-class.
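For illustration, here's a minimal sketch of what that byte/string orientation looks like with nom's function-style combinators (this assumes a recent nom release; the exact API has shifted between versions, so treat it as a sketch rather than exact code):

```rust
// Minimal sketch: nom parsers are written against byte or string slices.
// Assumes a recent nom with function-style combinators (API varies by version).
use nom::{
    character::complete::{char, digit1},
    sequence::delimited,
    IResult,
};

// Parse an integer wrapped in parentheses, e.g. "(42)".
fn paren_int(input: &str) -> IResult<&str, &str> {
    delimited(char('('), digit1, char(')'))(input)
}

fn main() {
    // The remaining input and the matched digits both come back as string slices.
    assert_eq!(paren_int("(42) rest"), Ok((" rest", "42")));
}
```

The input type is a slice of bytes or characters throughout; feeding it a stream of arbitrary tokens is possible in principle but is not the path the library is built around.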
Perhaps combine is a better option: https://docs.rs/crate/combine
Indeed, there's a JSON parser among the combine benchmarks: https://github.com/Marwes/combine/blob/master/benches/json.rs. It wouldn't be too hard to turn that into an EDN parser.
A few more notes after playing with combine a bit: generally, folks migrate from nom to combine to get better error reporting. I can't speak to the differences yet.
There are some kinks when defining higher-order parsers using combine; the obvious const p: ... = many(...) doesn't work. I think this is actually rooted in Rust's need to know the type and size of p statically, which doesn't play that nicely with combine in some cases. This may just be my Rust ignorance, though!
Following the discussion in https://github.com/mozilla/datomish/pull/139, I expect we'll be parsing EDN and then layering on additional parsers (query parser, transaction parser, perhaps others). It would be nice if we could use the same parser generation toolkit for parsing strings and EDN token streams.
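As a sketch of that layering idea: a second-stage parser (say, the query parser) would consume already-parsed EDN rather than raw text. Hedged heavily, since EdnToken below is a made-up stand-in for whatever the EDN parser actually produces, and this assumes combine's Stream trait is implemented for slices of Clone + PartialEq items:

```rust
// Minimal sketch of a second-stage parser over already-parsed EDN values.
// `EdnToken` is a hypothetical stand-in, not an existing type.
use combine::parser::token::satisfy;
use combine::Parser;

#[derive(Clone, Debug, PartialEq)]
enum EdnToken {
    Keyword(String),
    Integer(i64),
}

// Accept a single keyword token, e.g. the :find at the head of a query.
fn keyword<'a>() -> impl Parser<&'a [EdnToken], Output = EdnToken> {
    satisfy(|t: EdnToken| matches!(t, EdnToken::Keyword(_)))
}

fn main() {
    let tokens = vec![EdnToken::Keyword("find".into()), EdnToken::Integer(1)];
    let (parsed, rest) = keyword().parse(&tokens[..]).unwrap();
    assert_eq!(parsed, EdnToken::Keyword("find".into()));
    assert_eq!(rest, &[EdnToken::Integer(1)][..]);
}
```

If that works out, the string-level EDN parser and the token-level query/transaction parsers could share one toolkit, which is the property I'd like to preserve.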
From http://dragostis.github.io/pest/pest/trait.Input.html, I see that pest's intention is to parse strings (or byte streams). (e.g., http://dragostis.github.io/pest/pest/trait.Input.html#tymethod.slice accepts a general Input but requires it to effectively be a string.) I see that nom is more general. (Doubtless there are other, more general, PEG generators as well.) From https://github.com/Geal/nom#parser-combinators: