atcoder / ac-library

AtCoder Library
Creative Commons Zero v1.0 Universal
1.84k stars 236 forks source link

Fenwick tree documentation, undefined behavior when signed integer overflows #126

Closed cai-lw closed 3 years ago

cai-lw commented 3 years ago

The documentation (https://atcoder.github.io/ac-library/production/document_en/fenwicktree.html) states:

If T is integer type(int, uint, ll, or ull), it returns the answer in  mod 2bit\bmod 2^{\mathrm{bit}}mod2bit, if overflowed.

However, signed integer overflow is undefined behavior in C++. Although it seems extremely unlikely for the Fenwick tree implementation to trigger any unusual behavior, the statement above cannot be guaranteed to be true for signed integers, because it depends on compiler implementation.

yosupo06 commented 3 years ago

For avoiding undefined behavior, ACL's fenwick tree use unsigned integer inside when T is signed type

yosupo06 commented 3 years ago

https://github.com/atcoder/ac-library/blob/master/atcoder/fenwicktree.hpp

    using U = internal::to_unsigned_t<T>;

We use this U instead of T

cai-lw commented 3 years ago

Oh, I missed that line. It's a great solution.

However I still find the following from the standard. https://en.cppreference.com/w/cpp/language/implicit_conversion#Integral_conversions

  • If the destination type is signed, the value does not change if the source integer can be represented in the destination type. Otherwise the result is implementation-defined (until C++20)the unique value of the destination type equal to the source value modulo 2^n where n is the number of bits used to represent the destination type. (since C++20). (Note that this is different from signed integer arithmetic overflow, which is undefined).

Overflowing when converting from unsigned integer to signed integer is not undefined behavior but still implementation-defined until C++20. Since AtCoder doesn't support C++20 yet, it applies to this library.

Specifically, the conversion from U to T in line 30 below is implementation-defined and cannot be guaranteed to be mod 2^n.

https://github.com/atcoder/ac-library/blob/89d5d0ad37b795e880efd28d13978a1f476c1654/atcoder/fenwicktree.hpp#L28-L31

Of course, there is no need to fix it if it works with all compilers (G++, Clang) and architectures (x86_64) that AtCoder supports. But if we really want to avoid implementation-defined behavior, this StackOverflow question may be helpful: https://stackoverflow.com/questions/13150449/efficient-unsigned-to-signed-cast-avoiding-implementation-defined-behavior

yosupo06 commented 3 years ago

Overflowing when converting from unsigned integer to signed integer is not undefined behavior but still implementation-defined until C++20. Since AtCoder doesn't support C++20 yet, it applies to this library.

Yes, you are right. Strictly, ACL will cause implementation-defined behavior until C++20.

Therefore, we adopted a bit dirty solution. Please see the "Environments" section of Appendix. We decided to assume the behavior around signed-integer solution regardless the C++ version.

As you mentioned, we can avoid implementation-defined behavior technically ..., but I believe this assumption doesn't cause any problems.

yosupo06 commented 3 years ago

BTW, Signed integers are two's complement is a long document. Most important part is the following.

If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2n where n is the number of bits used to represent the unsigned type).