craff / pacomb

A parsing library that compiles grammars to combinators using elimination of left recursion
MIT License

Put a message in parse error #9

Closed by craff 5 years ago

craff commented 5 years ago

The simplest approach is to collect terminal names at the position of the error. One problem is that we have to do it when we test the accepted charset too.
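To make the idea concrete, here is a minimal sketch of collecting terminal names at the furthest error position. It is a toy combinator set, not pacomb's API: the `err` record and the `record`, `char_term`, `alt` and `seq` functions are all hypothetical names introduced for illustration, and the charset test is left out.

```ocaml
(* Toy sketch, NOT pacomb's API: remember the furthest failure position
   and the names of the terminals that were expected there. *)
type err = { mutable pos : int; mutable expected : string list }

let record err pos name =
  if pos > err.pos then begin
    (* A failure further in the input replaces the older ones. *)
    err.pos <- pos;
    err.expected <- [ name ]
  end
  else if pos = err.pos && not (List.mem name err.expected) then
    err.expected <- name :: err.expected

(* A named terminal matching one character; returns the next position. *)
let char_term err name c (s : string) (pos : int) : int option =
  if pos < String.length s && s.[pos] = c then Some (pos + 1)
  else begin
    record err pos name;
    None
  end

(* Alternative: try [p], then [q] at the same position. *)
let alt p q s pos =
  match p s pos with
  | Some _ as r -> r
  | None -> q s pos

(* Sequence: run [p], then [q] from where [p] stopped. *)
let seq p q s pos =
  match p s pos with
  | None -> None
  | Some pos' -> q s pos'

(* Usage: parse "1*1" with a grammar expecting ')' or '+' after the first 1. *)
let err = { pos = -1; expected = [] }
let one = char_term err "'1'" '1'
let op = alt (char_term err "')'" ')') (char_term err "'+'" '+')
let result = seq one (seq op one) "1*1" 0
```

After this run, `result` is `None`, `err.pos` is 1, and `err.expected` holds both `')'` and `'+'`, which is enough to print a message of the "waiting for ... and got ..." kind.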

rlepigre commented 5 years ago

Does the charset test really buy us anything in terms of efficiency? In any case, good error messages are more important than efficiency, I think.

craff commented 5 years ago

The charset test is crucially important for efficiency (a factor of 10? Without it we lose linear space usage in many cases). However, with the recent change, a set of terminals would be even better, and this would have two benefits:

Moreover, we can use a key at creation for identity of terminals thanks to the GADT ...

craff commented 5 years ago

The other solution, an "error constructor" that fails and adds an error message, is incompatible with the parse_all_XXX functions returning all parse trees ... unless we deactivate them in these cases, losing the error messages ... Meh.

Still, terminals give very local messages like "waiting for ')' or '+' and got '*'", which do not explain the error in terms of the grammar. Maybe reparsing with an instrumented comb library would be a better solution.

rlepigre commented 5 years ago

I think that having an error combinator would be great, and I don't think it is very important to be able to get all possible parse trees since we generally work with grammars that are non-ambiguous. As I already said, providing a way to give relevant parse error messages is important.

craff commented 5 years ago

Getting all parse trees is the only way to debug ambiguity (or trying to get at least two parse trees; actually it would be good to give a limit) ... A solution is also to return a "next" function after parsing ...

And I want to work on my French parser again soon ... so I need ambiguity.
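As a rough illustration of the "limit" and "next" ideas, parse trees can be exposed as a lazy sequence so that the caller decides how many to compute. This sketch does not use pacomb: it enumerates the trees of the ambiguous toy grammar S ::= 'a' | S S directly, and `tree`, `trees`, `take` and `next_of` are hypothetical names.

```ocaml
(* Toy sketch, NOT pacomb's API. *)
type tree = Leaf | Node of tree * tree

(* All parse trees of a string of [n] 'a's (n >= 1) under S ::= 'a' | S S,
   produced lazily: nothing is computed until the sequence is forced. *)
let rec trees (n : int) : tree Seq.t =
  if n = 1 then Seq.return Leaf
  else
    (* k is the number of characters given to the left subtree. *)
    List.to_seq (List.init (n - 1) (fun i -> i + 1))
    |> Seq.flat_map (fun k ->
           trees k
           |> Seq.flat_map (fun l ->
                  Seq.map (fun r -> Node (l, r)) (trees (n - k))))

(* Limit: force at most [n] results. *)
let rec take n (s : 'a Seq.t) : 'a list =
  if n <= 0 then []
  else
    match s () with
    | Seq.Nil -> []
    | Seq.Cons (x, rest) -> x :: take (n - 1) rest

(* "next" function: each call returns the next parse tree, if any. *)
let next_of (s : 'a Seq.t) : unit -> 'a option =
  let cur = ref s in
  fun () ->
    match !cur () with
    | Seq.Nil -> None
    | Seq.Cons (x, rest) ->
        cur := rest;
        Some x

let next = next_of (trees 3)
let first = next ()
let second = next ()
let third = next ()
```

Asking for two trees with `take 2` is enough to show that a grammar is ambiguous without enumerating every parse.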

And anyway I was wrong to say that it is incompatible ... that was stupid: we get an error only if parse_all gives 0 parse trees, and we will get the latest position. Actually, in case of error, parse and parse_all do exactly the same thing!
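A small list-of-successes sketch shows why the two entry points agree on errors: an error is raised only when no complete parse exists, and the reported position is the furthest one reached. This is a toy model, not pacomb's implementation; `Parse_error`, the type `p`, and the functions `chr`, `alt`, `seq`, `parse_all` and `parse` below are assumptions made for the example.

```ocaml
(* Toy sketch, NOT pacomb's implementation. *)
exception Parse_error of int (* furthest position reached *)

(* A parser returns every (value, next position) pair; the [int ref]
   tracks the furthest position at which something failed. *)
type 'a p = int ref -> string -> int -> ('a * int) list

let chr c : char p =
 fun far s pos ->
  if pos < String.length s && s.[pos] = c then [ (c, pos + 1) ]
  else begin
    far := max !far pos;
    []
  end

(* Keep the results of both branches, so ambiguity is preserved. *)
let alt (a : 'a p) (b : 'a p) : 'a p =
 fun far s pos -> a far s pos @ b far s pos

let seq (a : 'a p) (b : 'b p) : ('a * 'b) p =
 fun far s pos ->
  List.concat_map
    (fun (x, pos') ->
      List.map (fun (y, pos'') -> ((x, y), pos'')) (b far s pos'))
    (a far s pos)

(* All complete parses; an error only if there are none. *)
let parse_all (g : 'a p) (s : string) : 'a list =
  let far = ref 0 in
  let results = g far s 0 in
  (* A parse that stops before the end of input also counts as a failure. *)
  List.iter
    (fun (_, pos) -> if pos < String.length s then far := max !far pos)
    results;
  match List.filter (fun (_, pos) -> pos = String.length s) results with
  | [] -> raise (Parse_error !far)
  | full -> List.map fst full

(* First complete parse; fails exactly when [parse_all] fails. *)
let parse g s = List.hd (parse_all g s)

let g = seq (chr 'a') (alt (chr 'b') (chr 'c'))
```

On the input "ad", both `parse g` and `parse_all g` raise `Parse_error 1` in this sketch, since no branch gets past position 1.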

craff commented 5 years ago

The error combinator does not work at all! See the branch with_error and the comment of its last patch:

" try an error combinator, tested on calc_ext. 2 problems:

craff commented 5 years ago

Moving from charset to terminals is probably the best solution. Pros:

- better asymptotic complexity

craff commented 5 years ago

In fact, we know the position of each terminal in the grammar ... So we can say "expression with missing closing parenthesis or type with missing closing parenthesis".

Basically, nice error messages can be produced from

The only cost for the end user is to give good names to their grammar.
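A hedged sketch of how such a message could be assembled, assuming each expected terminal is reported together with the name of the grammar rule it belongs to. The `expected` record and the `message` function are hypothetical and are not taken from pacomb.

```ocaml
(* Toy sketch, NOT pacomb's API: an expected terminal paired with the
   user-chosen name of the enclosing grammar rule. *)
type expected = { rule : string; terminal : string }

(* Build one message from everything that was expected at the error
   position, removing duplicates. *)
let message (exps : expected list) : string =
  exps
  |> List.map (fun e -> Printf.sprintf "%s with missing %s" e.rule e.terminal)
  |> List.sort_uniq compare
  |> String.concat " or "

let example =
  message
    [ { rule = "expression"; terminal = "closing parenthesis" };
      { rule = "type"; terminal = "closing parenthesis" } ]
```

The quality of the result depends entirely on the rule names, which is the cost mentioned above.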

craff commented 5 years ago

This is done.

craff commented 5 years ago

Tested in calc_ext.ml