joemalle / limn

A tiny parser designed to compile quickly
Boost Software License 1.0
2 stars 1 forks source link

endless loop when handling "@" with rule *(*alnum_) #9

Closed asmwarrior closed 1 year ago

asmwarrior commented 1 year ago

Hi, I first found this issue when using

assert(parse("abc def @", *(*alnum_) >> end_));

Note in the above code, I have a skipper function to remove the whitespaces, mentioned in issue #8 .

And later I minimize the issue to this code (which does not need the skipper function)

assert(parse("@", *(*alnum_) >> end_));

Here, the @ never matches any alnum_, so it becomes an endless loop.

Is there any way to avoid such issue?

Thanks.

asmwarrior commented 1 year ago

When debugging, I see that this function lm::impl::kleene_::visit get recursively called.

        template <typename Base>
        struct kleene_ final : public impl::parser_base<kleene_<Base>> {
            constexpr explicit kleene_(Base base) noexcept
                : base(std::move(base))
            {}

            constexpr inline bool visit(std::string_view& sv) const& noexcept {
                impl::skip(sv);
                while (base.visit(sv) && !sv.empty());
                return true;
            }

While the inner visit function always return true, so the outer visit function will running the while endlessly.

The question here is: how to let the outer visit function return false?

asmwarrior commented 1 year ago

Some reference page: peg/grammar-tips.md at master · xored/peg

Which mentioned:

Be careful with rules which can match empty input.

Repetition: A* can match empty input regardless of how A is defined.
Predicates: also match empty input. E.g. EOF rule defined above.
asmwarrior commented 1 year ago

I found a solution here to avoid the endless loop

            constexpr inline bool visit(std::string_view& sv) const& noexcept {
                impl::skip(sv);
                std::string_view save = sv;
                // save != sv means we does forward (base.visit(sv) consume some chars)
                while (base.visit(sv) && !sv.empty() && save != sv);
                return true;
            }

By comparing the save and sv, we know whether base.visit(sv) has consume some chars, if not, we just stop the loop.

asmwarrior commented 1 year ago

OK, a more improved implementation:

            constexpr inline bool visit(std::string_view& sv) const& noexcept {
                impl::skip(sv);
                std::string_view save = sv;
                // save != sv means we does step forward (base.visit(sv) consume some chars)
                // the asignment save = sv means the sv get updated, so try next loop
                // to see it goes forward again
                while (base.visit(sv) && !sv.empty() && save != sv)
                    save = sv;
                return true;
            }

Both the kleene_::visit and the plus_::visit should have such loop code.

asmwarrior commented 1 year ago

I think I have fixed this in the master, so I will close it.