Closed — mandubian closed this issue 9 years ago
I could open up `Incomplete` without any trouble, since it is effectively defined entirely by `derive` and `complete`. However, I should warn you that context-free languages are not closed under complementation! In other words, a `not` parser of the form you indicated is not going to be sound, and will probably result in corner cases that infinitely loop (and similar).
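To make the looping risk concrete, here is a deliberately naive sketch (all names hypothetical, not this library's API): a `not` that succeeds *without consuming input* turns `repeat(not(...))` into a loop that never advances. The fuel bound below exists only so the non-termination can be observed rather than actually hanging.

```scala
object NotDemo {
  // Minimal toy parser: consume a prefix of the input, or fail.
  type P[A] = String => Option[(A, String)]

  def char(c: Char): P[Char] =
    in => if (in.nonEmpty && in.head == c) Some((c, in.tail)) else None

  // Naive language-complement `not`: succeeds, consuming NOTHING,
  // exactly when p fails. This is the problematic shape.
  def not[A](p: P[A]): P[Unit] =
    in => if (p(in).isEmpty) Some(((), in)) else None

  // `repeat` with a fuel bound: returns None if it ran out of fuel,
  // i.e. the inner parser kept succeeding without making progress.
  def repeatBounded[A](p: P[A], fuel: Int): P[List[A]] = in => {
    var rest = in
    var acc = List.empty[A]
    var steps = 0
    var stopped = false
    while (!stopped && steps < fuel) {
      p(rest) match {
        case Some((a, r)) => acc = a :: acc; rest = r; steps += 1
        case None         => stopped = true
      }
    }
    if (stopped) Some((acc.reverse, rest)) else None
  }
}
```

On input `"abc"`, `not(char('"'))` succeeds without consuming anything, so `repeatBounded` exhausts its fuel: an unbounded `repeat` would spin forever at the same position.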
What exactly is your use case for complementation? Maybe there's a different way it can be accomplished, or a new combinator that can be added to make things easier.
Agreed on the potential infinite loops!
I wanted to test writing parsers to get a feel for your API, e.g. EDN string parsing, where an EDN string is `"foo"`.
So you would have a rule like:
`'"' ~> repeat(not('"')) <~ '"'`
Any other way to do that?
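For this particular rule, character-level negation is enough and stays sound: reading `not('"')` as "any single character other than `"`" means it always consumes input, so `repeat` always makes progress. A minimal sketch under that reading (hypothetical names, not this library's combinators):

```scala
object StringLit {
  type P[A] = String => Option[(A, String)]

  def char(c: Char): P[Char] =
    in => if (in.nonEmpty && in.head == c) Some((c, in.tail)) else None

  // Character-level negation: any single char except c. Unlike a general
  // language-complement `not`, this always consumes one character on
  // success, so it is safe inside `repeat`.
  def notChar(c: Char): P[Char] =
    in => if (in.nonEmpty && in.head != c) Some((in.head, in.tail)) else None

  // Greedy repetition; always succeeds (possibly with an empty list).
  def repeat[A](p: P[A]): P[List[A]] = in => {
    var rest = in
    val out = List.newBuilder[A]
    var more = true
    while (more) p(rest) match {
      case Some((a, r)) => out += a; rest = r
      case None         => more = false
    }
    Some((out.result(), rest))
  }

  // The rule '"' ~> repeat(notChar('"')) <~ '"' spelled out by hand.
  def stringLit: P[String] = in =>
    char('"')(in).flatMap { case (_, r1) =>
      repeat(notChar('"'))(r1).flatMap { case (cs, r2) =>
        char('"')(r2).map { case (_, r3) => (cs.mkString, r3) }
      }
    }
}
```

An unterminated literal fails cleanly instead of looping, because `notChar` refuses to succeed on empty input.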
I would handle that by having a tokenizing step. So, there would be a `StrLit` token or similar that contains the contents of the string value. I have some code locally which does this. I'll push it up on a branch so you can take a look as soon as I get back home.
Here you go: https://github.com/djspiewak/sparse/blob/wip/json-example/src/main/scala/scalaz/stream/parsers/package.scala#L96 Basically, the `tokenize` function takes a set of rules (of the form `Map[Regex, PartialFunction[List[String], Token]]`) and produces a `Process1` which does the tokenization for you (similar to the `parse` function). You can see an example of it in action here: https://github.com/djspiewak/sparse/blob/wip/json-example/src/test/scala/scalaz/stream/parsers/JsonStreamSpecs.scala#L55
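A simplified, non-streaming sketch of a rule-driven tokenizer in that spirit (the driver loop and token names are hypothetical, and I am assuming the `List[String]` holds the regex match with its capture groups; the real `tokenize` produces an incremental `Process1` rather than a full list):

```scala
import scala.util.matching.Regex

object Tokenizer {
  sealed trait Token
  case class StrLit(value: String) extends Token
  case class NumLit(value: Double) extends Token
  case object LBrace extends Token
  case object RBrace extends Token

  // At each position: skip whitespace, then take the first rule whose regex
  // matches a prefix, feeding the full match plus capture groups to its
  // partial function to build the token.
  def tokenize(input: String,
               rules: List[(Regex, PartialFunction[List[String], Token])]): List[Token] = {
    def go(rest: String, acc: List[Token]): List[Token] = {
      val s = rest.dropWhile(_.isWhitespace)
      if (s.isEmpty) acc.reverse
      else {
        val hit = rules.iterator.map { case (re, pf) =>
          re.findPrefixMatchOf(s).flatMap { m =>
            pf.lift(m.matched :: m.subgroups)
              .map(tok => (tok, s.drop(m.matched.length)))
          }
        }.collectFirst { case Some(x) => x }
        hit match {
          case Some((tok, next)) => go(next, tok :: acc)
          case None              => sys.error(s"no rule matches at: $s")
        }
      }
    }
    go(input, Nil)
  }
}
```

A string-literal rule then becomes pure regex work, e.g. `"\"([^\"]*)\"".r -> { case _ :: s :: Nil => StrLit(s) }`, sidestepping the `not` combinator entirely.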
Yesterday evening I was reading your samples and was thinking about tokenization with buffering too. I was thinking a bit too much in terms of combinators, I think ;)
BTW, a detail: EDN is cool for streaming because it allows a document to be a succession of EDN values. In JSON, a root {} or [] is mandatory. JSON wasn't designed for streaming at all; it's actually a crappy format, yet almost universal now ;)
Thanks for your help, I'm going to investigate tokens now!
EDN is far superior to JSON for streaming. :-) I used JSON as an example mostly because I think it's probably the most common use case for something like this, at least in modern hipster servers. When streaming JSON, btw, it's pretty common to define a special "de facto non-standard" format that uses whitespace to delimit JSON tokens at the top level (instead of having a root array), and then normal JSON rules below that. We did this for JSON streaming when I was at Precog, and generally it works out pretty well and has fairly good compatibility across parsers.
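That whitespace-delimited top-level format is easy to split without a full JSON parser: you only need to track bracket depth and string-literal state so that whitespace *inside* a value is never treated as a delimiter. A rough sketch (not Precog's actual code, and ignoring malformed input):

```scala
object JsonStream {
  // Split a whitespace-delimited stream of top-level JSON values.
  // `depth` counts open {} and [] brackets; `inStr`/`escaped` track
  // string-literal state so quoted braces and spaces pass through.
  def splitTopLevel(input: String): List[String] = {
    val out = List.newBuilder[String]
    val cur = new StringBuilder
    var depth = 0
    var inStr = false
    var escaped = false
    def flush(): Unit = if (cur.nonEmpty) { out += cur.result(); cur.clear() }
    for (c <- input) {
      if (inStr) {
        cur += c
        if (escaped) escaped = false
        else if (c == '\\') escaped = true
        else if (c == '"') inStr = false
      } else c match {
        case '"'            => inStr = true; cur += c
        case '{' | '['      => depth += 1; cur += c
        case '}' | ']'      => depth -= 1; cur += c
        case w if w.isWhitespace && depth == 0 => flush() // value boundary
        case other          => cur += other
      }
    }
    flush()
    out.result()
  }
}
```

Each returned chunk is then an ordinary standalone JSON value, so normal JSON rules apply below the top level, as described above.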
In any case, I'm going to close this for now. If you find you need an unsealed `Incomplete` after all, feel free to reopen!
I agree with you on everything :) JSON is to data formats what JS is to programming languages ;)
I wanted to implement a `not[Token](p: Parser[Token, Token])`, but it's not possible outside of the project itself, as everything is sealed for pattern matching. Is `sealed` mandatory for parsers (or at least for `Incomplete`)?