foonathan / lexy

C++ parsing DSL
https://lexy.foonathan.net
Boost Software License 1.0

case_folding discards errors in validate and parse actions #149

Closed jan-kelemen closed 1 year ago

jan-kelemen commented 1 year ago

If an error occurs while parsing a rule wrapped in case folding, the error is discarded and the results returned by the validate and parse actions are incorrect.

Example of grammar on playground: https://lexy.foonathan.net/playground/?id=fccxs48G3&mode=trace

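#include <cstddef>
#include <iostream>
#include <string>

#include <lexy/action/parse.hpp>
#include <lexy/callback.hpp>
#include <lexy/dsl.hpp>
#include <lexy/input/string_input.hpp>
#include <lexy_ext/report_error.hpp>

namespace grammar
{
namespace dsl = lexy::dsl;

constexpr auto kw = dsl::identifier(dsl::ascii::alpha, dsl::ascii::alpha_underscore / LEXY_LIT("-"));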

constexpr auto kw_create = dsl::ascii::case_folding(LEXY_KEYWORD(u8"create", kw));

struct create_number
{
    static constexpr auto whitespace = dsl::ascii::space;
    static constexpr auto rule = kw_create + dsl::integer<std::size_t>(dsl::n_digits<3>) + dsl::eof;
    static constexpr auto value = lexy::as_integer<std::size_t>;
};

}

int main()
{
    using namespace std::string_literals;
    for (const auto line : { "create 123"s, "creat a"s })
    {
        lexy::string_input<lexy::utf8_char_encoding> input{line};

        if (auto result{lexy::parse<grammar::create_number>(input, lexy_ext::report_error)})
        {
            std::cout << "has_value(): " << result.has_value() << '\n';
            std::cout << "is_success(): " << result.is_success() << '\n';
            std::cout << "is_error(): " << result.is_error() << '\n';
            std::cout << "is_recovered_error(): " << result.is_recovered_error() << '\n';
            std::cout << "is_fatal_error(): " << result.is_fatal_error() << '\n';
            std::cout << "error_count(): " << result.error_count() << '\n';
            std::cout << '"' << line << "\" evaluates to " << result.value() << '\n';
        }
        else
        {
            std::cout << "fail";
        }
    }
}

The output of this program is:

has_value(): 1
is_success(): 1
is_error(): 0
is_recovered_error(): 0
is_fatal_error(): 0
error_count(): 0
"create 123" evaluates to 123

has_value(): 0
is_success(): 1
is_error(): 0
is_recovered_error(): 0
is_fatal_error(): 0
error_count(): 0

Afterwards it crashes on the following assertion:

lexy_example: /mnt/work/git/notes.txt/libraries/lexy/build/_deps/lexy-src/include/lexy/_detail/lazy_init.hpp:145: constexpr const T& lexy::_detail::lazy_init<T>::operator*() const & [with T = long unsigned int]: Assertion `*this' failed.

The string "create 123" is parsed as expected, but the result for the other string, "creat a", is strange: it reports success but has no value, and no error is counted. I would expect the result to report that the second string failed to parse.

If I remove dsl::ascii::case_folding from kw_create, the second string fails to parse and the error is reported.

EDIT: I've debugged this further and found the following. The problem is caused by the overloads of the on method in the event_handler class:

        constexpr void on(_vh&                                   handler, parse_events::error,
                          const error<Reader, expected_literal>& error)
        {
            handler._cb.literal(handler._cb.sink, get_info(), handler._cb.input, _begin, error);
        }
...
        template <typename Event, typename... Args>
        constexpr auto on(_vh&, Event, const Args&...)
        {
            return 0; // operation_chain_start must return something
        }

When parsing a case folding rule, the real Reader is wrapped in a case folding reader, so the error type no longer matches the Reader template parameter of the first overload. Overload resolution therefore picks the generic template overload, which does nothing, and the error is silently dropped. This is the instantiated method as shown by gdb: lexy::_vh<lexy::_prc>::event_handler::on<lexy::parse_events::error, lexy::error<lexy::_acfr<lexy::_prc>, lexy::expected_literal> >
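
To illustrate the overload resolution issue in isolation, here is a minimal sketch that does not use lexy at all. Every name in it (plain_reader, case_folding_reader, pinned_handler, generic_handler, count_reports) is a hypothetical stand-in rather than a lexy type. It only models the shape of the two overloads above: one pinned to the handler's own Reader, and one catch-all. The generic_handler variant shows one possible way to make such a dispatch independent of the reader type; it is not necessarily how the library was fixed.

```cpp
// Hypothetical stand-ins for illustration only; none of these are lexy types.
struct plain_reader {};
template <typename Reader>
struct case_folding_reader {}; // models a reader that wraps another reader

struct expected_literal {};
template <typename Reader, typename Tag>
struct error {}; // the error type depends on the reader it was raised with

// Same shape as the overloads in the issue: the first overload only accepts
// errors parameterized on the handler's own Reader, the second swallows
// everything else without doing anything.
template <typename Reader>
struct pinned_handler
{
    int reported = 0;

    constexpr void on(const error<Reader, expected_literal>&) { ++reported; }

    template <typename... Args>
    constexpr void on(const Args&...) {} // catch-all: silently ignores the event
};

// One possible alternative (an assumption, not the actual fix): accept the
// error for any reader type, so a wrapped reader still reaches the overload.
template <typename Reader>
struct generic_handler
{
    int reported = 0;

    template <typename AnyReader>
    constexpr void on(const error<AnyReader, expected_literal>&) { ++reported; }

    template <typename... Args>
    constexpr void on(const Args&...) {}
};

// Dispatches one error to a fresh handler and returns how many were reported.
template <typename Handler, typename Err>
constexpr int count_reports(Err err)
{
    Handler h{};
    h.on(err);
    return h.reported;
}

using plain_error  = error<plain_reader, expected_literal>;
using folded_error = error<case_folding_reader<plain_reader>, expected_literal>;

// The unwrapped error hits the pinned overload.
static_assert(count_reports<pinned_handler<plain_reader>>(plain_error{}) == 1, "");
// The wrapped error falls through to the catch-all and is lost.
static_assert(count_reports<pinned_handler<plain_reader>>(folded_error{}) == 0, "");
// With the reader type deduced, the wrapped error is reported again.
static_assert(count_reports<generic_handler<plain_reader>>(folded_error{}) == 1, "");
```

The checks run at compile time, so the snippet needs no main function. The second static_assert reproduces the symptom in this report: the call compiles and runs, but the error never reaches the reporting overload.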

foonathan commented 1 year ago

Thanks for catching that. Fixed.