Open ckoparkar opened 5 years ago
Sorry about the delay in responding here!
The omission of -
is intentional, but for pedantic reasons: I wanted to choose a standard identifier syntax that wasn't Haskell's so you could use a typical kebab-case identifiers (i.e. so you could use a haskLikeParser
and still be able to write expressions like this-and-that
in it), and so I chose R5RS identifiers, which are simpler than the identifiers that appear in some later Schemes: R7RS identifiers, for example, can include strings inside vertical bars (e.g. |this and that|
is a single identifier in R7RS). Strictly speaking, ->
isn't a valid identifier according to the R5RS spec or the R6RS spec, both of which have special treatment for the standalone identifier -
and otherwise only allow hyphens as the non-initial character. (You can see that the definition of parseR5RSIdent
heavily mirrors the spec, including the choice of names like peculiar
.)
That said, I think it makes sense to allow this! I think for the sake of being pedantic, I'll define a new atom parser called something like parseR5RSIdentLax
, so it can specify that it's not a strict implementation of the R5RS identifier rule, and so it can include a comment about the specific way it differs from the spec, and then update haskLikeParser
to use that one instead. If you've got a PR on hand that makes the change, I'd be happy to build on that; otherwise, I'll happily put that in and get a new release within the week!
re:why R5Rs - Ah, it makes sense now. That didn't occur to me.
Gosh, I wasn't even looking in the correct place. What I linked to is just a list of identifiers, not the AST! I don't know what I was thinking. Sorry about that, and thanks for pointing me to the right thing.
Now that I know it isn't R5RS at all, I feel strange for requesting this change. Parsing without it isn't bad -- I had to look for [-,>]
instead of ->
and it worked nicely.
That said, I think it makes sense to allow this!
You must have already considered this, but please only add it if you think it plays nicely with other things :)
re:parseR5RSIdentLax Absolutely! Unfortunately, I don't have anything PR ready. I only messed around a bit with parseR5RSIdent
, but that's it.
Hi @aisamanra, we're using
haskLikeParser
to parse s-expressions, and I ran into a problem while trying to parse the arrow notation,->
. Specifically, the parser considers-
and>
to be two separate atoms. So the parse result is[RSAtom (HSIdent "-"), RSAtom (HSIdent ">")]
. Whereas I thought it would be[RSAtom (HSIdent "->")]
pHaskLikeAtom
usesparseR5RSIdent
to decide what constitutes an atom.https://github.com/aisamanra/s-cargot/blob/71d2ec896f50877110df8622de3efdccfa6b8d59/Data/SCargot/Common.hs#L49-L54
And sure enough,
-
is missing frominitial
. Was this omission intentional ?I don't have any experience writing parsers to tell if adding
-
there is something sensible. Is it going to break some corner case of parsing-INT
s ? As far as I understand, it's OK wrt the spec ?I made that change and tried it on a small example which uses
->
s, negative numbers andabc-defs
s and that worked fine. Also, all s-cargot tests still pass, but they're not really using-
s I think.For now, I'm going to use
~>
or=>
orA '-' : A '>'
to mean arrows, but I could whip up a PR to make this change. What are your thoughts ?[EDIT: I apologize, I did not see your GitHub status before I tagged you.]