joemalle / limn

A tiny parser designed to compile quickly
Boost Software License 1.0

Add a skip function that automatically removes whitespace #8

Closed asmwarrior closed 1 year ago

asmwarrior commented 1 year ago

I think limn needs a skipper function.

I'm not sure how best to implement it. So far I have made the following changes:

/// @namespace lm
/// @brief The namespace for all Limn types, functions, and variables
namespace lm {
    namespace impl {

        /// call the skip function when we run the visit()
        constexpr bool skip(std::string_view& sv) noexcept {
            if (!sv.empty() && sv.front() == ' ') { // skip the whitespace
                sv.remove_prefix(1);
                return true;
            }
            return false;
        }

        template <typename Base>
        struct parser_base {
            constexpr auto operator*() const noexcept;
            constexpr auto operator+() const noexcept;
            constexpr auto operator[](std::string_view& output) const noexcept;
            constexpr auto operator[](std::function<void(const std::string_view&)> callback) const noexcept;

        };
    }

Note that I have added a function named skip, which removes whitespace, and I now call it at the top of the visit function body, like this:

        template <typename Base>
        struct match_ final : public impl::parser_base<match_<Base>> {
            constexpr explicit match_(Base base, std::string_view& sv) noexcept
                : base(std::move(base))
                , out(sv)
            {}

            constexpr inline bool visit(std::string_view& sv) const& noexcept {
                impl::skip(sv);
                std::string_view save = sv;
                if (base.visit(sv)) {
                    out = save.substr(0, save.size() - sv.size());
                    return true;
                }
                return false;
            }

        private:
            Base base;
            std::string_view& out;
        };

As you can see, impl::skip(sv) is called as soon as visit is entered. I added the same call to many of the visit functions, and it seems to work.

I'm not sure whether there is a better approach. Any suggestions? Thanks.
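One detail worth noting about the skip shown above: it removes at most one space per call, so a run of several spaces would need several calls. Below is a minimal sketch of a variant that consumes the whole run. The names `skip_all` and `is_space` are hypothetical and not part of limn, and the set of whitespace characters is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

namespace sketch_skip_all {

// Hypothetical helper (not part of limn): the usual ASCII whitespace set.
constexpr bool is_space(char c) noexcept {
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

// Hypothetical variant of impl::skip: removes every leading whitespace
// character and reports whether anything was removed.
constexpr bool skip_all(std::string_view& sv) noexcept {
    const std::size_t before = sv.size();
    while (!sv.empty() && is_space(sv.front())) {
        sv.remove_prefix(1);
    }
    return sv.size() != before;
}

// Small usage check.
inline void check_skip_all() {
    std::string_view sv = "  \t x y";
    assert(skip_all(sv));    // leading whitespace was removed
    assert(sv == "x y");     // inner whitespace is untouched
    assert(!skip_all(sv));   // nothing left to skip at the front
}

} // namespace sketch_skip_all
```

Because the helper avoids `std::isspace`, the function can remain usable in constant expressions, which seems to match limn's `constexpr` style.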

asmwarrior commented 1 year ago

FYI: I have added the skip function in my own branch, see here:

https://github.com/asmwarrior/limn/tree/add_skip_function

In this branch I have also replaced the plain assert calls with the doctest C++ test framework and added some other test files.

asmwarrior commented 1 year ago

I see one big issue with this skip function.

If skip is used, then

parse("a b", char_('a') >> char_('b'));

returns true.

This is not correct.
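The behavior can be reproduced with a small stand-in, sketched below. The types `ch` and `seq` and the free function `parse` are hypothetical simplifications, not limn's actual `char_`, `operator>>`, or `parse`, and requiring the whole input to be consumed is an assumption.

```cpp
#include <cassert>
#include <string_view>

namespace sketch_problem {

// Stand-in for the patched impl::skip: drops one leading space, if any.
constexpr bool skip(std::string_view& sv) noexcept {
    if (!sv.empty() && sv.front() == ' ') {
        sv.remove_prefix(1);
        return true;
    }
    return false;
}

// Hypothetical single-character parser that always skips first,
// mirroring a visit() with impl::skip(sv) at the top.
struct ch {
    char expected;
    constexpr bool visit(std::string_view& sv) const noexcept {
        skip(sv);
        if (!sv.empty() && sv.front() == expected) {
            sv.remove_prefix(1);
            return true;
        }
        return false;
    }
};

// Hypothetical sequence parser: left, then right; restores input on failure.
template <typename L, typename R>
struct seq {
    L left;
    R right;
    constexpr bool visit(std::string_view& sv) const noexcept {
        const std::string_view save = sv;
        if (left.visit(sv) && right.visit(sv)) return true;
        sv = save;
        return false;
    }
};

// Assumption: a parse succeeds only if the whole input is consumed.
template <typename P>
constexpr bool parse(std::string_view input, const P& p) noexcept {
    return p.visit(input) && input.empty();
}

inline void check_unconditional_skip() {
    const seq<ch, ch> ab{ch{'a'}, ch{'b'}};
    assert(parse("ab", ab));   // the intended match
    assert(parse("a b", ab));  // also accepted: the space is silently skipped
}

} // namespace sketch_problem
```

Since every `visit` skips unconditionally, the grammar has no way to say that 'a' and 'b' must be adjacent.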

According to the Boost.Spirit documentation, skipping can be configured. See:

c++ - Boost spirit lexeme and its attributes - Stack Overflow

Parser Directive Inhibiting Skipping (lexeme[]) - 1.53.0

So I would like to create a lexeme-like type that has the skip function disabled. Is that possible?

Currently, I have:

    namespace impl {

        /// call the skip function when we run the visit()
        constexpr bool skip(std::string_view& sv) noexcept {
            // whitespace could be: [ \t\r\n]+, see below
            // https://en.cppreference.com/w/cpp/string/byte/isspace
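            // the cast avoids undefined behavior when char is signed and the value is negative
            if (!sv.empty() && 0 != std::isspace(static_cast<unsigned char>(sv.front()))) { // skip the whitespace chars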
                sv.remove_prefix(1);
                return true;
            }
            return false;
        }

        template <typename Base>
        struct parser_base {
            constexpr auto operator*() const noexcept;
            constexpr auto operator+() const noexcept;
            constexpr auto operator[](std::string_view& output) const noexcept;
            constexpr auto operator[](std::function<void(const std::string_view&)> callback) const noexcept;
        };
    }

Is it possible to add a member variable or member function to the class template parser_base, so that the skipper runs when the bool member is true and does not run otherwise? Or can this be made more general with a C++ template?

Any ideas?

Thanks.
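One way the template idea could look is sketched below: `visit` takes a compile-time `bool Skip` parameter, and a lexeme-like wrapper forwards `false` to everything inside it. All names here (`ch`, `seq`, `lexeme`, `parse`) are hypothetical stand-ins rather than limn's API, and whether this fits limn's `parser_base` design is an open assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

namespace sketch_flag {

// Removes all leading spaces; reports whether anything was removed.
constexpr bool skip(std::string_view& sv) noexcept {
    const std::size_t before = sv.size();
    while (!sv.empty() && sv.front() == ' ') sv.remove_prefix(1);
    return sv.size() != before;
}

// Hypothetical character parser; the caller picks Skip at compile time.
struct ch {
    char expected;
    template <bool Skip = true>
    constexpr bool visit(std::string_view& sv) const noexcept {
        const std::string_view save = sv;
        if constexpr (Skip) skip(sv);
        if (!sv.empty() && sv.front() == expected) {
            sv.remove_prefix(1);
            return true;
        }
        sv = save;  // undo the skip on failure
        return false;
    }
};

// Hypothetical sequence parser; passes its own Skip setting to both sides.
template <typename L, typename R>
struct seq {
    L left;
    R right;
    template <bool Skip = true>
    constexpr bool visit(std::string_view& sv) const noexcept {
        const std::string_view save = sv;
        if (left.template visit<Skip>(sv) && right.template visit<Skip>(sv)) return true;
        sv = save;
        return false;
    }
};

// Hypothetical lexeme-like wrapper: skips once up front (if enabled),
// then runs the inner parser with skipping switched off.
template <typename P>
struct lexeme {
    P inner;
    template <bool Skip = true>
    constexpr bool visit(std::string_view& sv) const noexcept {
        const std::string_view save = sv;
        if constexpr (Skip) skip(sv);
        if (inner.template visit<false>(sv)) return true;
        sv = save;
        return false;
    }
};

// Assumption: a parse succeeds only if the whole input is consumed.
template <typename P>
constexpr bool parse(std::string_view input, const P& p) noexcept {
    return p.visit(input) && input.empty();
}

inline void check_lexeme_flag() {
    using AB = seq<ch, ch>;
    const AB ab{ch{'a'}, ch{'b'}};
    const lexeme<AB> strict{ab};
    assert(parse("a b", ab));       // skipping enabled: space is ignored
    assert(!parse("a b", strict));  // inside lexeme: space is not ignored
    assert(parse(" ab", strict));   // leading space is still skipped once
}

} // namespace sketch_flag
```

The flag travels through the call rather than living in each parser object, so the same `seq` value can be used both with and without skipping.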

asmwarrior commented 1 year ago

Maybe a solution is to add a second parameter to the visit function. For example, take lit_:

    /// @class lit_
    /// @brief String literal parser
    /// @details An object of this type matches a character sequence (AKA
    ///     a string literal).  For example, `lm::lit_("tautological")`
    ///     would parse the string "tautological".
    struct lit_ final : public impl::parser_base<lit_> {
        /// @brief Construct a lit_ parser.
        /// @param[in] str The string literal to parse.
        constexpr lit_(std::string_view str) noexcept
            : str(str)
        {}

        constexpr inline bool visit(std::string_view& sv) const& noexcept {
            if (sv.substr(0, str.size()) == str) {
                sv.remove_prefix(str.size());
                return true;
            }
            return false;
        }

    private:
        std::string_view str;
    };

Here, the declaration of visit is changed to:

        constexpr inline bool visit(std::string_view& sv, Skipper skipper) const& noexcept {

Here Skipper is a function pointer type.

By default, constexpr bool skip(std::string_view& sv) noexcept would be used, but any other function pointer could be passed instead.

But there are still many issues.

For example, parsing creates many objects (all derived from the parser_base class). Some high-level objects should have the skipper enabled, while lower-level (lexer-level) objects should have it disabled. I still don't know how to do that.
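A rough sketch of how the two-argument visit might handle the high-level versus lexer-level split: composite parsers forward whatever skipper they receive, and a lexeme-like wrapper replaces it with a no-op for its children. The names `lit`, `seq`, `lexeme`, `no_skip`, and `parse` are hypothetical stand-ins, not the code that ended up in the branch, and the default argument is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

namespace sketch_skipper {

// Skipper as a plain function pointer, as proposed above.
using Skipper = bool (*)(std::string_view&);

// Default skipper: removes all leading spaces.
constexpr bool skip(std::string_view& sv) noexcept {
    const std::size_t before = sv.size();
    while (!sv.empty() && sv.front() == ' ') sv.remove_prefix(1);
    return sv.size() != before;
}

// Hypothetical no-op skipper used at the lexer level.
constexpr bool no_skip(std::string_view&) noexcept { return false; }

// Simplified stand-in for lit_ with the extra skipper parameter.
struct lit {
    std::string_view str;
    constexpr bool visit(std::string_view& sv, Skipper skipper = skip) const noexcept {
        const std::string_view save = sv;
        skipper(sv);
        if (sv.substr(0, str.size()) == str) {
            sv.remove_prefix(str.size());
            return true;
        }
        sv = save;  // undo the skip on failure
        return false;
    }
};

// Hypothetical sequence parser: forwards the skipper it was given.
template <typename L, typename R>
struct seq {
    L left;
    R right;
    constexpr bool visit(std::string_view& sv, Skipper skipper = skip) const noexcept {
        const std::string_view save = sv;
        if (left.visit(sv, skipper) && right.visit(sv, skipper)) return true;
        sv = save;
        return false;
    }
};

// Hypothetical lexeme-like wrapper: skips once with the caller's skipper,
// then hands no_skip to everything inside it.
template <typename P>
struct lexeme {
    P inner;
    constexpr bool visit(std::string_view& sv, Skipper skipper = skip) const noexcept {
        const std::string_view save = sv;
        skipper(sv);
        if (inner.visit(sv, no_skip)) return true;
        sv = save;
        return false;
    }
};

// Assumption: a parse succeeds only if the whole input is consumed.
template <typename P>
constexpr bool parse(std::string_view input, const P& p) noexcept {
    return p.visit(input) && input.empty();
}

inline void check_skipper_param() {
    using AB = seq<lit, lit>;
    // Grammar: lexeme("a" "b") followed by "c", with skipping between tokens.
    const seq<lexeme<AB>, lit> g{lexeme<AB>{AB{lit{"a"}, lit{"b"}}}, lit{"c"}};
    assert(parse("ab c", g));     // space between tokens is skipped
    assert(parse(" ab  c", g));   // leading and repeated spaces are skipped
    assert(!parse("a b c", g));   // space inside the lexeme is rejected
}

} // namespace sketch_skipper
```

With this arrangement no parser object has to store whether skipping is enabled: the decision is made by whichever ancestor passes the skipper down.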

asmwarrior commented 1 year ago

In the end I went with extending the visit function to accept two arguments: the string_view and the newly added skipper. These changes have now been pushed to master, so I will close this issue.