fenbf / cppstories-discussions

4 stars 1 forks source link

2022/ssize-cpp20/ #97

Open utterances-bot opened 2 years ago

utterances-bot commented 2 years ago

Reducing Signed and Unsigned Mismatches with std::ssize() - C++ Stories

In this article, I’ll show another technique that avoids mixing signed and unsigned types. In my article Integer Conversions and Safe Comparisons in C++20 we learned about cmp_* integer comparison functions that allow us to compare various types of integer types. The functions are safe because they help with mixing signed and unsigned types.

https://www.cppstories.com/2022/ssize-cpp20/

2kaud commented 2 years ago

For unsigned printReverse(), there are issues with the code above. The first example assumes that i will go negative when it can't as it's unsigned and the second assumes that i 'wraps' to the max possible positive value when decremented from 0 which isn't guaranteed by the standard (but is usual but nevertheless is UB) [See Scutum13's comment below].

For printReverse() consider:

#include <iostream>
#include <vector>

void printReverse(const std::vector<int>& v) {
    for (auto i { v.size() }; i > 0; --i)
        std::cout << i - 1 << ": " << v[i - 1] << '\n';
}

which displays as required: 4: 5 3: 4 2: 3 1: 2 0: 1

At the risk of a flame war, I'm firmly in the unsigned camp! The number of elements can never be negative. Also, the max value for an unsigned is significantly greater than that for a signed. This probably doesn't have much implication for 64-bit build, but it does for 32-bit build.

Some use type int (or ptrdiff_t) because they also want to indicate an error as say -1 for element not found. This is really using one variable for 2 purposes. In modern C++ this is no longer needed. You can, for instance, use std::optional to hold the found element as size_t or not a value if not found etc.

For has_repeated_value() then:

#include <iostream>
#include <vector>
#include <iomanip>

template <typename T>
bool has_repeated_values(const T& container) {
    for (size_t i {1}; i < container.size(); ++i)
        if (container[i - 1] == container[i])
            return true;

    return false;
}

int main() {
    const std::vector<int> v {1,1};

    std::cout << std::boolalpha << has_repeated_values(v) << '\n';
}

Like with printReverse(), when working with unsigned, the index range has to set such that the index never tries to be decremented past 0.

fenbf commented 2 years ago

join the discussion at Reddit/c++: https://www.reddit.com/r/cpp/comments/xibxl0/reducing_signed_and_unsigned_mismatches_with/ more than 90 comments are available!

fenbf commented 2 years ago

Thanks @2kaud for your comment and for the examples. I think the critical part is that a developer has to be aware of the arithmetic and how basic expressions work. Then you can apply correct rules to both unsigned or signed code.

2kaud commented 2 years ago

Agreed! That should be part of C++101 but IMO unsigned tends not to be covered - hence the issues that are found...

fenbf commented 2 years ago

What's more, we have more memory to use in 2022, so having a signed size is not an issue, as it could have been in 90s. Of course, there are still some memory constraints on some embedded chips, but in that case, developers are aware (I hope! :)) of the limitations.

ghost commented 2 years ago

Another solution is to have reverse looping implemented directly in the language. Or some kind of template or metaprog thing that does it correctly. In addition, that's less a solution but compilers can mitigate the root problem by promoting operands when comparing signed and unsigned.

2kaud commented 2 years ago

"reverse looping implemented directly in the language"

It does. Consider:

#include <iostream>
#include <vector>
#include <ranges>

void printReverse(const std::vector<int>& v) {
    for (size_t i {v.size()}; const auto& e : std::ranges::reverse_view(v))
        std::cout << --i << ": " << e << '\n';
}

int main() {
    const std::vector v { 1, 2, 3, 4, 5 };
    printReverse(v);
}

which displays: 4: 5 3: 4 2: 3 1: 2 0: 1

Or as an alternative to reverse_view:

void printReverse(const std::vector<int>& v) {
    for (size_t i { v.size() }; const auto & e : v | std::views::reverse)
        std::cout << --i << ": " << e << '\n';
}

whichever floats your boat....

2kaud commented 2 years ago

If your compiler doesn't currently support std::ranges (in C++20), then you can do something like this using iterators (often a better alternative to using indexes) and a helper function (ranger) which takes two iter params and allows it to be used with range-for. These needn't be .begin(), .end(). In this usage example, .rbegin() and .rend() are used for reverse iteration.

#include <iostream>
#include <utility>
#include <vector>

template <typename Iterator>
struct ranger : public std::pair <Iterator, Iterator> {
    ranger(Iterator begin, Iterator end = Iterator()) : std::pair <Iterator, Iterator> { begin, end } { }
    Iterator begin() { return this->first; }
    Iterator end() { return this->second; }
};

void printReverse(const std::vector<int>& v) {
    for (size_t i { v.size() }; const auto& e: ranger(v.rbegin(), v.rend()))
        std::cout << --i << ": " << e << '\n';
}

int main() {
    const std::vector v { 1, 2, 3, 4, 5 };
    printReverse(v);
}

There's also the 'old' standby using iterators direct:

#include <iostream>
#include <vector>

void printReverse(const std::vector<int>& v) {
    size_t i { v.size() };

    for (auto itr { v.rbegin()}; itr != v.rend(); ++itr)
        std::cout << --i << ": " << *itr << '\n';
}

int main() {
    const std::vector v { 1, 2, 3, 4, 5 };
    printReverse(v);
}

IMO either of which are 'better' than using element indexing.

Scutum13 commented 2 years ago

@2kaud: It is guaranteed by the standard that subtracting one from ​unsigned 0​ gives maximum value (see https://en.cppreference.com/w/cpp/language/operator_arithmetic#Overflows), so example printReverseUnsigned() will work as expected. Nevertheless you have to read it twice to understand the author's intention. I always prefer the reverse iterator approach.

2kaud commented 2 years ago

Thanks for the clarification. I've added a note to that post. I always knew in practice this worked that way but I wasn't aware it was guaranteed.