boostorg / spirit

Boost.org spirit module
http://boost.org/libs/spirit

X3 lacks boolean parser for unicode char type #569

Open saki7 opened 4 years ago

saki7 commented 4 years ago

I think we could provide simple typedefs in <boost/spirit/home/x3/numeric/bool.hpp>

EDIT: bool_policies must be updated too; however, there are some hard-coded string literals in bool_policies.hpp that simply do not match the Unicode char type.

Kojoley commented 4 years ago

Do you expect it to match synoglyphs/homoglyphs, duplicate characters in Unicode and other Unicode quirks that I am not aware of?

saki7 commented 4 years ago

No. It seems like the potentially required workarounds to handle those cases are beyond the scope of Spirit.

For the 'other Unicode quirks' you mentioned, I thought of normalization (NFKC, NFKD, etc.), which I currently apply to the script as a pre-processing step in my application before passing it to X3. I think we could leave it to the user.

I think it would be nice to have documentation for Unicode support in X3 that describes a real-world use case for building and using a Unicode AST. I'm using it because my app relies on code point counts for UTF-32 characters. By passing std::u32string::iterator to X3, I was able to reduce the number of string conversions. (I think this is getting off-topic; I'd love to hear more opinions about Unicode though)
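As a rough illustration of why working in UTF-32 can simplify code point counting, here is a minimal sketch. The function names are hypothetical and are not part of X3 or of the application described above; the UTF-8 variant assumes well-formed input.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: counts code points in UTF-8 text by skipping
// continuation bytes (those of the form 10xxxxxx).
// Assumes the input is well-formed UTF-8.
constexpr std::size_t utf8_code_points(std::string_view s)
{
    std::size_t n = 0;
    for (unsigned char c : s)
        if ((c & 0xC0) != 0x80) ++n;
    return n;
}

// Hypothetical helper: in UTF-32 every code unit is one code point,
// so the length of the string is already the code point count.
constexpr std::size_t utf32_code_points(std::u32string_view s)
{
    return s.size();
}
```

With UTF-8 input the count requires a pass over the bytes, while a std::u32string already stores one code point per element, so no conversion or scan is needed before handing iterators to a parser.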

djowel commented 4 years ago

> (I think this is getting off-topic; I'd love to hear more opinions about Unicode though)

Definitely not off-topic. I'd love to see more work done in this area.

saki7 commented 4 years ago

For my original use-case, I'll be satisfied if U"true" gets parsed into true, for instance.

I can't imagine more complex cases for now but I really like X3's recent Unicode support so I agree with @djowel. I'll try to report more issues about Unicode soon.
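To make the requested behavior concrete, here is a minimal standalone sketch of a bool matcher that accepts any character type, including char32_t. It is not X3 code: starts_with_ascii and parse_bool are hypothetical names, and the approach (comparing an ASCII keyword code unit by code unit) is only one possible way to avoid per-type literals.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper, not part of X3: checks whether `input` starts with the
// ASCII keyword `keyword`, comparing code unit by code unit so that the same
// narrow keyword works for char, char16_t and char32_t input.
template <typename CharT>
constexpr bool starts_with_ascii(std::basic_string_view<CharT> input,
                                 std::string_view keyword)
{
    if (input.size() < keyword.size()) return false;
    for (std::size_t i = 0; i < keyword.size(); ++i)
    {
        // Converting an ASCII char to CharT preserves its code point.
        if (input[i] != static_cast<CharT>(keyword[i])) return false;
    }
    return true;
}

// Hypothetical stand-in for a bool parser: returns the parsed value, or an
// empty optional if the input starts with neither keyword.
// Simplification: only a prefix is checked, so "trueish" also matches.
template <typename CharT>
constexpr std::optional<bool> parse_bool(std::basic_string_view<CharT> input)
{
    if (starts_with_ascii(input, "true"))  return true;
    if (starts_with_ascii(input, "false")) return false;
    return std::nullopt;
}
```

For example, parse_bool(std::u32string_view(U"true")) yields an optional holding true, which is the behavior asked for above. No normalization or homoglyph handling is attempted.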

saki7 commented 4 years ago

My suggestion is changing bool_policies from

template <typename T = bool>
struct bool_policies { /* ... */ };

to

template <typename CharT, typename T = bool>
struct bool_policies; // declaration only

template <typename T> // partial specialization (default template arguments are not allowed here; the default comes from the declaration above)
struct bool_policies<char32_t, T> { /* ... */ };

(A cleaner way is to use if constexpr inside the body -- IIRC it's permitted to use C++17 in X3, yes?)

If we could reach consensus then I'll try to PR.
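The if constexpr alternative mentioned above could look roughly like the following C++17 sketch. This is an assumption about the shape of such a change, not the actual X3 bool_policies: the name bool_policies_sketch, the keyword and parse members, and the string_view-based interface are all hypothetical (X3's real policies work on iterators).

```cpp
#include <optional>
#include <string_view>
#include <type_traits>

// Dependent false, so the static_assert below fires only when an unsupported
// character type is actually instantiated.
template <typename>
inline constexpr bool always_false = false;

// Hypothetical sketch: a single primary template whose keyword literals are
// chosen with `if constexpr`, instead of one specialization per char type.
template <typename CharT, typename T = bool>
struct bool_policies_sketch
{
    // Returns the keyword for `value` as a literal of the matching char type.
    static constexpr std::basic_string_view<CharT> keyword(bool value)
    {
        if constexpr (std::is_same_v<CharT, char>)
            return value ? "true" : "false";
        else if constexpr (std::is_same_v<CharT, wchar_t>)
            return value ? L"true" : L"false";
        else if constexpr (std::is_same_v<CharT, char16_t>)
            return value ? u"true" : u"false";
        else if constexpr (std::is_same_v<CharT, char32_t>)
            return value ? U"true" : U"false";
        else
            static_assert(always_false<CharT>, "unsupported character type");
    }

    // Returns T(true) or T(false) if `input` begins with the corresponding
    // keyword, otherwise an empty optional. Only a prefix is checked.
    static constexpr std::optional<T> parse(std::basic_string_view<CharT> input)
    {
        constexpr auto t = keyword(true);
        constexpr auto f = keyword(false);
        if (input.substr(0, t.size()) == t) return T(true);
        if (input.substr(0, f.size()) == f) return T(false);
        return std::nullopt;
    }
};
```

Compared with the partial specialization approach, this keeps all character types in one body at the cost of requiring C++17.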

djowel commented 4 years ago

> (A cleaner way is to use if constexpr inside the body -- IIRC it's permitted to use C++17 in X3, yes?)

I'm absolutely for a move to c++17! There will have to be some minor doc changes as I recall the requirement was c++14. It's 2020 now and we have c++20. Time to move on.

djowel commented 4 years ago

> (A cleaner way is to use if constexpr inside the body -- IIRC it's permitted to use C++17 in X3, yes?)

Oh and a lot of the infrastructure code in x3 can be simplified with c++17. Your thoughts @Kojoley ?