Closed Bartvds closed 10 years ago
This is where it gets quoted: https://github.com/Bartvds/parsimmon/blob/7cd7d2833ecac312721eaaced47bb04b4c055c15/src/parsimmon.js#L252
I am okay with removing the quotes as long as ParseError still has a reasonable ::toString()
in which they are quoted.
Cool, sounds good. But there is a complication:
If we have text we try to parse for a string EOF
but it fails, then this would be the same value as the end-of-file error (also EOF
). (same goes for any character
, zero alternates
etc).
By the time the expected-value arrives in the showError()
this distinction is lost which makes quoting hard, and then how would a reporter know the value was an expected string or a symbol?
It'll have to be stored differently in the makeError()
data.
We could make the EOL value (and other symbols) a local object (eg: unique reference) and check for that. So when a string or RegExp arrive at the ParseError the are literals, and objects get tested as symbol. We'd add a .name
and a toString()
etc.
@Bartvds I'm confused why this is a problem, this is a feature not a bug, I would think the fact that parsers return a quoted string as expectation for a string parse and a non-quoted string for 'symbols' should makes it easier, not harder, to handle the error and make a nice report. In fact the whole reason it currently does that at all is to make the reported error message nicer, as formatted by parseError()
. By design, the contract on expected
is that it's a human-readable string description of the expected value.
If the problem is that this description string is too long because the expected string (or, in the case of item 2 of #19, regex) is itself too long, well, truncating it down to a reasonable length is part of making it human-readable, so that should not be handled in parseError()
(or showError()
), that should be handled in the primitive parser itself (Parsimmon.string()
or .regex()
), before even being passed to makeFailure()
. For example, https://github.com/Bartvds/parsimmon/blob/7cd7d2833ecac312721eaaced47bb04b4c055c15/src/parsimmon.js#L252 could be changed to:
var expected = "'" + str.slice(0, 8) + (str.length > 8 ? '...' : '') + "'";
Note that the code for dealing with regexes that are too long would be totally different; neither belongs in parseError()
or showError()
.
I also plan on adding some sort of tagging system for custom parse errors - "expected an expr", "expected a string literal", etc.
@laughinghan I would always prefer to also have the whole expected value available to the using code (as-is).
I don't think presentation logic like trimming and quoting belongs in the parser. It is convenient, of course, but only as an extra. The primitive parser has no business doing end-user presentation logic.
The thing with result object is that you now don't have to limit the data for just single-line string presentation. Maybe we want to throw our own AssertionError (expected/actual), of put the data to TAP or have expandable error messages (or whatever).
@jayferd A tagggin system would be very welcome. RegExp's are fun but not very communicative (these unicode ranges are killing us :)
@Bartvds Well, you could have a separate expectedValue
that has the whole string or regex value, and it could be undefined for any
/all
/eof
, so you could check if .expected === 'eof'
(as opposed to if .expected === "'eof'"
, in which case .expectedValue === 'eof'
). fail()
would take in a second argument.
A tagging system is fine and dandy for extensibility like expandable error messages or whatever, but for the common task of human-readable stringifying, that we're even suggesting branching on tags makes this a clear case of where we should be using polymorphism instead of branching on the type of the expected thing, right @jayferd?
@laughinghan Yea, we'd definitely need the real expected value (or a symbol) next to the pre-stringified quoted thing.
I don't understand what you mean with "branching on tags".
But as illustration:
For example I use a validation library Joi that feels similar to Parsimmon's style (eg: chainy-api creating immutable objects), and it has nice helpers like tags()
and description()
that create annotated versions. it is pretty cool, the report is very detailed too.
I forgot I noticed a .desc()
method landed! Very cool, now I can get rid of the workaround of that ridiculously long utf-letter RegExp message :wink:
I'll close this then.
I've been working on the ParseError as per #19, but have a interesting problem:
Some code:
https://github.com/Bartvds/parsimmon/blob/7cd7d2833ecac312721eaaced47bb04b4c055c15/src/parsimmon.js#L14-L26
Thing is: currently the parsers can return a quoted string as expectation for a string parse (eg:
'abc'
, or a non-quoted (vanilla) string for 'symbols' likeeof
etc. This makes it harder to handle the error and make a nice report (doable but a little messy).I'm not sure how to put that distinction between parse strings and 'symbols' in the
expected
value of the error.Any suggestions?