Closed by dkulchenko 1 year ago
Working around things with this:
defmodule ParserError do
  import NimbleParsec

  def fail(combinator, error) do
    post_traverse(combinator, {__MODULE__, :add_error, [error]})
  end

  def add_error(_rest, acc, context, line, offset, error) do
    annotated_error = %{message: error, line: line, offset: offset}
    {acc, Map.update(context, :errors, [annotated_error], &[annotated_error | &1])}
  end

  def throw_errors(combinator) do
    post_traverse(combinator, {__MODULE__, :throw_errors, []})
  end

  def throw_errors(_rest, _acc, %{errors: errors}, _line, _offset) when errors != [] do
    {:error, errors}
  end

  def throw_errors(_rest, acc, context, _line, _offset) do
    {acc, context}
  end
end

defmodule Parser do
  import ParserError
  import NimbleParsec

  defparsec(
    :contents,
    choice([
      string("123</abc>"),
      string("123") |> fail("missing closing </abc> tag")
    ])
  )

  defparsec(
    :parse,
    choice([
      string("<abc>") |> parsec(:contents),
      string("<abc>") |> fail("unknown syntax error inside <abc> tag")
    ])
    |> throw_errors()
    |> eos()
  )
end

IO.inspect(Parser.parse("<abc>123"))
which works great and gives the bonus of accumulating errors and providing context for each. Leaving this open in case this is a bug/unintended behavior.
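For anyone wanting to print the accumulated errors, here is a minimal sketch of a formatter. `ErrorReport` is a hypothetical helper, not part of NimbleParsec. It assumes each entry has the shape built by add_error above, with `line` being the `{line_number, line_start_offset}` tuple that post_traverse callbacks receive and `offset` a byte offset. The sample list is hand-written for illustration, not captured parser output.

```elixir
defmodule ErrorReport do
  # Hypothetical helper; assumes error maps shaped like those built by
  # ParserError.add_error/6: %{message: ..., line: {line, line_start}, offset: ...}
  def format(errors) when is_list(errors) do
    errors
    # add_error/6 prepends each error, so reverse to get recording order.
    |> Enum.reverse()
    |> Enum.map(fn %{message: message, line: {line, line_start}, offset: offset} ->
      # Column is the byte distance from the start of the current line (1-based).
      "line #{line}, column #{offset - line_start + 1}: #{message}"
    end)
  end
end

# Hand-written sample data (newest error first, as add_error/6 would store it).
errors = [
  %{message: "second problem", line: {2, 10}, offset: 12},
  %{message: "first problem", line: {1, 0}, offset: 3}
]

IO.inspect(ErrorReport.format(errors))
```

The column here is a byte count, so it will be off for lines containing multi-byte characters.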
I am looking at the code, and choice() does not actually guarantee the order in which the alternatives are processed, or which one will be marked as failed if all of them fail. I will document this behaviour, but it may also be possible to write this with a different set of combinators.
> I am looking at the code, and choice() does not actually guarantee the order in which the alternatives are processed, or which one will be marked as failed if all of them fail.
Are you sure? Playing with some scenarios with debug: true on, it looks like choice() always generates function clauses in an execution order exactly matching that in which they're provided, even for complex trees (which is inherently desirable for determinism/optimization, so I'm perfectly happy if this is unintentional!).
Across ~50 combinators in my parser that are relying on strict choice() ordering, every test that relies on this behavior succeeds consistently.
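A rough sketch of the kind of check described above, for anyone who wants to reproduce it. `OrderingCheck` and its `greeting` parser are hypothetical names. The script assumes Elixir 1.12+ and network access so `Mix.install` can fetch nimble_parsec. The two alternatives share a prefix, so which one matches depends on the order in which they are tried.

```elixir
# Assumes network access so Mix.install can fetch the dependency.
Mix.install([{:nimble_parsec, "~> 1.0"}])

defmodule OrderingCheck do
  import NimbleParsec

  # Hypothetical two-alternative parser. With debug: true, NimbleParsec
  # prints the generated function clauses at compile time, so the order
  # of the clauses can be compared with the order given to choice/1.
  defparsec(
    :greeting,
    choice([string("hello world"), string("hello")]),
    debug: true
  )
end

# Inspect which alternative matched for each input.
IO.inspect(OrderingCheck.greeting("hello world"))
IO.inspect(OrderingCheck.greeting("hello"))
```

This only shows what the current implementation happens to generate. As noted earlier in the thread, the ordering is not a documented guarantee.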
re: error handling though, makes sense - perfectly happy using context + a throw at the end for this purpose.
In this test case, I'm trying to use choice() as a way to do fallback error handling (basically, either match the happy path, or match only the beginning of the syntax and throw an error).
Running the script has the following output:
So the first error gets "swallowed". Is there anything that can be done to expose the "child" error?
(This is a super simplified scenario and in the actual project, there's a huge web of choice() calls, but this is the simplest synthetic test case that triggers the issue.)
Using function composition instead of parsec() works correctly:
returns:
I'd love to avoid parsec() and just use direct function calls but unfortunately it's a complex self-referential parser and unwinding the circular dependencies would be incredibly difficult.