devatrun / sutfcpplib

Simple UTF library for C++
MIT License
12 stars 0 forks source link

Reduce SFINAE parts in favor of 'if constexpr' #3

Closed mapron closed 2 years ago

mapron commented 2 years ago

I saw you already used 'if contexpr'. So for me logic of

////////////////////////////////////////////////////////////////////////////////////////////////////
template<typename char_t, std::enable_if_t<std::is_same_v<char_t, char> || std::is_same_v<char_t, char8s_t>, int>>
constexpr uint_t code_unit_count(uint_t cp) noexcept
{
    if (cp < 0x80)
        return 1;
    if (cp < 0x800)
        return 2;
    if (cp < 0x10000)
        return 3;

    return 4;
}

////////////////////////////////////////////////////////////////////////////////////////////////////
template<typename char_t, std::enable_if_t<std::is_same_v<char_t, char16_t> || (std::is_same_v<char_t, wchar_t> && sizeof(wchar_t) == sizeof(char16_t)), int>>
constexpr uint_t code_unit_count(uint_t cp) noexcept
{
    return cp < 0x10000 ? 1 : 2;
}

////////////////////////////////////////////////////////////////////////////////////////////////////
template<typename char_t, std::enable_if_t<std::is_same_v<char_t, char32_t> || (std::is_same_v<char_t, wchar_t> && sizeof(wchar_t) == sizeof(char32_t)), int>>
constexpr uint_t code_unit_count(uint_t /*cp*/) noexcept
{
    return 1;
}

Would be much clearer in form of

template<typename char_t> // i wish some char concept here, but no C++20, oh well. 
constexpr uint_t code_unit_count(uint_t cp) noexcept
{
   if constexpr (sizeof(char_t) == 1) {
    if (cp < 0x80)
        return 1;
    if (cp < 0x800)
        return 2;
    if (cp < 0x10000)
        return 3;

    return 4;
    } else if constexpr (sizeof(char_t) == 2) {
      return cp < 0x10000 ? 1 : 2;
    } else {
      return 1;
    }
}

or something.

If you don't like methods getting too long, then at least make you own traits or something.

Overall I wish you can provide concepts version instead of sfinae (so I can clearly understand API by signature). Docs can do that too, but I am a fan of self-documented code.

mapron commented 2 years ago

I missed a line when you using code_unit_count with 1 argument as well to calculate string code_unit_count. Sorry!

I personally think is a bad overload case and I prefer to have different names for calculation over code point and a sequence of them.

it's like having std::accumulate version for both int and int, int or something. Different conceptions. Should not be and overload.

devatrun commented 2 years ago

The main approach I'm using here is to make code_unit_count() would be safe to overload in the future and not participate in calls to non-Unicode types like code_unit_count<int>() at all, rather than break internally. Please note that the library expects that char8_t, char16_t, char32_t represent the code units, just like the C++17 standard does now, and support of char, wchar_t is added only for backward compatibility. Your suggestion to make signature in form of:

template<typename char_t>
constexpr uint_t code_unit_count(uint_t cp)

would be too greedy and generic for three fixed encodings and three code unit types. I agree that concepts would make the code more self-documenting, but C++17 doesn't support them.

mapron commented 2 years ago

I probably failed to bring my intention here. Of course I don't want type system to be weaker! Not at all. Probably (uint_t) part I left was misleading.

My point was to have more clear type interface, instead of SFINAE. Maybe I am asking for too much. Again, concepts would be perfect here.

mapron commented 2 years ago

Ok, what about this:

template<class T>
struct my_char_traits 
{
// no fields here
};

template<> struct my_char_traits<char> { constexpr const size_t code_unit_limit {1}; }
template<> struct my_char_traits<char32_t> { constexpr const size_t code_unit_limit {4}; }

template<typename char_t> // i wish some char concept here, but no C++20, oh well. 
constexpr uint_t code_unit_count(char_t cp) noexcept
{
    constexpr code_unit_limit = my_char_traits<char_t>::code_unit_limit;
    if constexpr (code_unit_limit == 1) {
        if (cp < 0x80)
        return 1;
        if (cp < 0x800)
        return 2;
        if (cp < 0x10000)
        return 3;

        return 4;
    } else if constexpr (code_unit_limit == 2) {
        return cp < 0x10000 ? 1 : 2;
    } else {
        return 1;
    }
}
mapron commented 2 years ago

still c++17 compat, you can't pass arbitrary int type (compilation error due tot missing my_char_traits member), and code is nice in one function. How do you like it?

devatrun commented 2 years ago

Sorry, for me your solution looks too complicated. We add additional traits, but we still break inside function if something goes wrong, and we still will be called with unexpected types. Just think, three different functions with three different types with tree different implementations, that can be simpler ?!

mapron commented 2 years ago

Ok, NP, that's fine!

I only wished to suggest an alternative, I treat your vision with dignity, as you are the author.