microsoft / STL

MSVC's implementation of the C++ Standard Library.
Other
10.05k stars 1.48k forks source link

<chrono>: ceil() mishandles floating-point conversions #1479

Open StephanTLavavej opened 3 years ago

StephanTLavavej commented 3 years ago

This was found by @jwakely.

C:\Temp>type meow.cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstdio>
using namespace std;
using namespace chrono;

// N4868 [time.duration.cast]/7:
// "template<class ToDuration, class Rep, class Period>
//   constexpr ToDuration ceil(const duration<Rep, Period>& d);
// Returns: The least result t representable in ToDuration for which t >= d."

int main() {
    const duration<float, milli> ms{13421772.0f};
    printf("   ms.count(): %.1000g    %.6a\n", ms.count(), ms.count());

    const float scaled = ms.count() * 1000.0f;
    const float before = nextafterf(scaled, 0.0f);
    const float after  = nextafterf(scaled, 1e30f);

    printf("       before: %.1000g %.6a\n", before, before);
    printf("       scaled: %.1000g %.6a\n", scaled, scaled);
    printf("        after: %.1000g %.6a\n", after, after);

    const auto us = ceil<microseconds>(ms);
    printf("ceil returned: %lld\n", us.count());

    const microseconds correct{13421771265};
    printf("Correct value: %lld\n", correct.count());
    assert(correct >= ms);

    const microseconds too_small{correct.count() - 1};
    assert(!(too_small >= ms));

    assert(us == correct);
}
C:\Temp>cl /EHsc /nologo /W4 meow.cpp
meow.cpp

C:\Temp>meow
   ms.count(): 13421772    0x1.999998p+23
       before: 13421770752 0x1.8ffffcp+33
       scaled: 13421771776 0x1.8ffffep+33
        after: 13421772800 0x1.900000p+33
ceil returned: 13421771776
Correct value: 13421771265
Assertion failed: us == correct, file meow.cpp, line 35

This is an interesting case because our implementation (and libstdc++, and libc++) isn't meeting the Standard's specification for ceil(), as proven by the assertions - that's the bug. However, the specifications for ceil() and the duration comparisons are handling these floating-integral conversions strangely - that's the LWG issue needed.

First, the value 13421772 (the original number of milliseconds) is exactly representable as a float. When multiplying it by 1000.0f, the values before, scaled, and after show what's happening. We get the scaled value 13421771776 because that's closest to the mathematical answer 13421772000 (it's 224 away). The after value 13421772800 is further away (800 away) from the mathematical answer. This is an unavoidable fact of single-precision floating-point granularity.

Then, something interesting is happening in the specification of the duration comparisons. WG21-N4868 [time.duration.comparisons] says that the LHS and RHS are converted to their common_type_t before comparing their stored values. This has the unusual consequence of saying that 13421771265 microseconds (stored in long long representation) is >= duration<float, milli>{13421772.0f} despite the fact that this is not true from a mathematical units perspective! The common type is duration<float, micro>, so this performs two conversions. For the LHS, converting 13421771265 from long long to float (keeping the ratio unchanged) needs to perform rounding - this value is just above the midpoint between before and scaled so it gets rounded to scaled. (The value in too_small is the exact midpoint, so it gets tiebreak-to-even, and the hexit c is even for float, so it would be rounded to before.) Then for the RHS, the duration<float, milli> has to be converted to duration<float, micro>, which performs a duration_cast in the converting constructor, and that multiplies by 1000.0f - we already saw that the answer is scaled. So the LHS got rounded up, the RHS got rounded down, and they ended up being equal, even though they're "really" less-than. Our implementation is performing duration comparisons correctly, but the Standardese here has very unintuitive effects - perhaps it should be changed.

Finally, our implementation of ceil (like every other implementation) completely failed to anticipate this scenario. It simply does: https://github.com/microsoft/STL/blob/19c683d70647f9d89d47f5a0ad25165fc8becbf3/stl/inc/chrono#L417-L428 This performs a duration_cast and adjusts upwards by 1 if the result is too small - but here the result of duration_cast is much too large (according to the questionable specification of the comparisons) and we are supposed to return a significantly smaller value.

While we could change our implementation to meet the Standard's specification (somehow - I don't know the exact approach yet, that would handle all of the corner cases), perhaps we should wait for the Standard to be changed. If duration comparisons were specified to return mathematically correct answers, then I believe ceil would be specified to return 13421772000 here (which is totally representable in long long, just not float).

Finally, @HowardHinnant notes that (1) the performance of duration operations is critical (adding penalties that wouldn't be present in handwritten code would be bad), and (2) any attempt to improve floating-point conversions shouldn't introduce overflow where previously there was none. I don't know whether duration comparisons can be improved in the Standard; if they can't, I don't know whether ceil specifically can be specified differently; if its specification remains unchanged, then we should change our implementation (somehow) to conform, with the least perf impact possible.

(Note: I am almost certain that floor() is equally affected. round() may be affected, I'm not sure.)

StephanTLavavej commented 3 years ago

I found the following bug - related enough to be mentioned here instead of in a separate issue:

C:\Temp>type meow.cpp
#include <chrono>
#include <cstdio>
using namespace std;
using namespace chrono;

void test(const double dbl) {
    const duration<double> dur_dbl{dbl};
    const auto dur_flt = ceil<duration<float>>(dur_dbl);
    printf("dur_dbl.count(): %.1000g\n", dur_dbl.count());
    printf("dur_flt.count(): %.1000g\n", dur_flt.count());
    if (dur_flt >= dur_dbl) {
        printf("PASS\n\n");
    } else {
        printf("FAIL\n\n");
    }
}

int main() {
    test(13421771263.0);
    test(13421771264.0);
    test(13421771265.0);
}
C:\Temp>cl /EHsc /nologo /W4 meow.cpp
meow.cpp

C:\Temp>meow
dur_dbl.count(): 13421771263
dur_flt.count(): 13421770752
FAIL

dur_dbl.count(): 13421771264
dur_flt.count(): 13421770752
FAIL

dur_dbl.count(): 13421771265
dur_flt.count(): 13421771776
PASS

This appears to be "just a bug" but we may want to wait to see whether ceil's specification will change. (The bug is that, when the result is floating-point, we can't just add 1. In fact, there is no requirement for whole-numberness.)

johnsalmon commented 3 years ago

You may already know this - but FWIW, the issue is not limited to float. Doubles are just as surprising. The code below applies ceil<nanoseconds> and floor<nanoseconds> to reasonable values of duration<double, micro>. As in the OP, there are two issues: 1) current library implementations of ceil and floor (I used libstdc++, but I think STL will do the same) don't do what the standard specifies, and 2) the 'correct' results, specified by the standard, are very surprising. E.g., correct_ceil<nanoseconds>(x) < correct_floor<nanoseconds>(x)

I hope the standard can be changed to be less confusing and error-prone.

#include <cassert>
#include <chrono>
#include <cmath>
#include <cstdio>
#include <iostream>
using namespace std;
using namespace chrono;

// N4868 [time.duration.cast]/7:
// "template<class ToDuration, class Rep, class Period>
//   constexpr ToDuration ceil(const duration<Rep, Period>& d);
// Returns: The least result t representable in ToDuration for which t >= d."

int main() {
    long long micros_since_epoch = 1605893279777777;
    printf("%lld is the number of microseconds between Jan 1, 1970, and a moment on Nov 20, 2020\n", micros_since_epoch);
    printf("It is exactly representable in both double and long long.\n");
    printf("duration<double, micro> dblmicro20Nov20;\n");
    duration<double, micro> dblmicro20Nov20(micros_since_epoch);
    printf("dblmicro20Nov20.count(): %.1000g    %a\n", dblmicro20Nov20.count(), dblmicro20Nov20.count());
    printf("duration<long long, micro> llmicro20Nov20;\n");
    duration<long long, micro> llmicro20Nov20(micros_since_epoch);
    printf("llmicro20Nov20.count(): %lld\n", llmicro20Nov20.count());
    if( llmicro20Nov20 == dblmicro20Nov20 )
        printf("llmicro20Nov20 == dblmicro20Nov20.  I.e., they compare equal with C++ operator==\n");

    const auto ceildbl = ceil<nanoseconds>(dblmicro20Nov20);
    const auto ceilll  = ceil<nanoseconds>(llmicro20Nov20);
    cout << "\n";
    cout << "ceil<nanoseconds>(dblmicro20Nov20): " << ceildbl.count() << "\n";
    cout << "ceil<nanoseconds>(llmicro20Nov20) : " << ceilll.count() << "\n";

    // Search for the 'correct' answer:
    nanoseconds correct_ceil = ceildbl;
    while( correct_ceil >= dblmicro20Nov20 ){
        correct_ceil--;
    }
    correct_ceil++;
    // Let's check, just to be sure
    assert( correct_ceil >= dblmicro20Nov20 );
    assert( !(--correct_ceil >= dblmicro20Nov20 ));
    cout << "'correct' ceil has count          : " <<  correct_ceil.count() << "\n";

    const auto floordbl = floor<nanoseconds>(dblmicro20Nov20);
    const auto floorll  = floor<nanoseconds>(llmicro20Nov20);
    cout << "\n";
    cout << "floor<nanoseconds>(dblmicro20Nov20): " << floordbl.count() << "\n";
    cout << "floor<nanoseconds>(llmicro20Nov20) : " << floorll.count() << "\n";
    // Search for the 'correct' answer:
    auto correct_floor = floordbl;
    while( correct_floor <= dblmicro20Nov20 ){
        correct_floor++;
    }
    correct_floor--;
    // Let's check, just to be sure:
    assert( correct_floor <= dblmicro20Nov20 );
    assert( !(++correct_floor <= dblmicro20Nov20 ) );
    cout << "'correct' floor has count          : " <<  correct_floor.count() << "\n";

    if( correct_floor > correct_ceil )
        printf("Surprise!  correct_floor > correct_ceil\n");

    const auto rounddbl = round<nanoseconds>(dblmicro20Nov20);
    const auto roundll  = round<nanoseconds>(llmicro20Nov20);
    cout << "\n";
    cout << "round<nanoseconds>(dblmicro20Nov20): " << rounddbl.count() << "\n";
    cout << "round<nanoseconds>(llmicro20Nov20) : " << roundll.count() << "\n";
}

This is with g++ and libstdc++, but I'm pretty sure STL will do the same. And in any case, the surprising 'correct' values are compiler-independent.

dell$ g++ -std=c++20 dbp.cpp && ./a.out
1605893279777777 is the number of microseconds between Jan 1, 1970, and a moment on Nov 20, 2020
It is exactly representable in both double and long long.
duration<double, micro> dblmicro20Nov20;
dblmicro20Nov20.count(): 1605893279777777    0x1.6d234a9f40fc4p+50
duration<long long, micro> llmicro20Nov20;
llmicro20Nov20.count(): 1605893279777777
llmicro20Nov20 == dblmicro20Nov20.  I.e., they compare equal with C++ operator==

ceil<nanoseconds>(dblmicro20Nov20): 1605893279777776896
ceil<nanoseconds>(llmicro20Nov20) : 1605893279777777000
'correct' ceil has count          : 1605893279777776768

floor<nanoseconds>(dblmicro20Nov20): 1605893279777776896
floor<nanoseconds>(llmicro20Nov20) : 1605893279777777000
'correct' floor has count          : 1605893279777777024
Surprise!  correct_floor > correct_ceil

round<nanoseconds>(dblmicro20Nov20): 1605893279777776896
round<nanoseconds>(llmicro20Nov20) : 1605893279777777000
dell$
johnsalmon commented 3 years ago

The problems with ceil and floor are not even limited to floating point Reps. Defining them in terms of operator<= and operator>= leads to trouble even with integral Reps. E.g.,

    using D1 = duration<intmax_t, ratio<1,  6666666667>>;
    using D2 = duration<intmax_t, ratio<1, 13333333333>>;

    D1 d1{};
    D2 d2{};
    d2 <= d1; // Won't compile.  The denominator overflows.
    ceil<D2>(d1);  // Won't compile.  But is it even well-defined if d2<=d1 won't compile?

P.S. I apologize if this isn't the right place to discuss problems with the specification of floor and ceil. Please let me know if somewhere else would be preferable.

HowardHinnant commented 3 years ago

This is actually a feature of chrono, not a bug. And is not particular to the relational operators. Values of D1 and D2 can not be compared, added or subtracted. And the reason for this is because there exist no specialization of duration which can hold the values of the result. I.e. the common_type_t<D1, D2> can not be represented with ratio<N,D> and intmax_t (at least on this platform). Rather than silently overflow the duration::period, a compile-time error is required.

Fwiw, the denominator of the common_type of these two ratios would require 67 bits to represent, and intmax_t provides only 63.

Generally speaking, there is no general solution to this problem except using a dynamically allocating "BigNum" for the computations, which itself would be limited by available dynamic memory.

johnsalmon commented 3 years ago

Yes. I understand that D1 and D2 can't be added, subtracted or compared. But what does that imply for ceil<D2>(d1). If values of type D1 and D2 can't be compared, then how can ceil<D2>(d1) possibly return "the least result t representable in D2 for which t>=d1." There is no such t.

It seems to me that this is further evidence (together with the examples with float and double, above) that the specified behavior of ceil (and floor) is surprising and error-prone.

statementreply commented 3 years ago

I found the following bug - related enough to be mentioned here instead of in a separate issue: [...] This appears to be "just a bug" but we may want to wait to see whether ceil's specification will change. (The bug is that, when the result is floating-point, we can't just add 1. In fact, there is no requirement for whole-numberness.)

I found a much (by relative error) worse repro of this bug:

#include <chrono>
#include <iomanip>
#include <iostream>

using namespace std;
using namespace std::chrono;

int main() {
    const duration<double> dur_dbl = 1us;
    const auto dur_flt = ceil<duration<float>>(dur_dbl);
    cout << "dur_dbl.count(): " << setprecision(17) << dur_dbl.count() << "\n";
    cout << "dur_flt.count(): " << setprecision(9) << dur_flt.count() << "\n";
    return 0;
}
C:\Users\He\source\test>cl /EHsc /W4 /WX /std:c++17 chrono_ceil.cpp
用于 x64 的 Microsoft (R) C/C++ 优化编译器 19.28.29828 版
版权所有(C) Microsoft Corporation。保留所有权利。

chrono_ceil.cpp
Microsoft (R) Incremental Linker Version 14.28.29828.0
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:chrono_ceil.exe
chrono_ceil.obj

C:\Users\He\source\test>.\chrono_ceil.exe
dur_dbl.count(): 9.9999999999999995e-07
dur_flt.count(): 1.00000095
jwakely commented 1 week ago

I found a much (by relative error) worse repro of this bug:

Yes, adding/subtracting 1 from a floating-point result is almost always wrong.

using DD = std::chrono::duration<double>;
using FD = std::chrono::duration<float>;
static_assert( std::chrono::floor<FD>(DD(0.1)) >= FD(0) ); // nope

floor returns FD(-0.9f) here, because 0.1f < 0.1 is false so it does 0.1f - 1.0f

Howard suggested that ceil and floor should not try to adjust anything up or down if treat_as_floating_point is true, e.g. ceil would be

  auto to = duration_cast<ToDuration>(from);
  if constexpr (!treat_as_floating_point_v<ToDuration>)
    if (to < from)
      ++to;
  return to

Maybe we should just specify exactly this code in the standard, instead of the not-practically-implementable "least result t representable as ToDuration for which t >= d" prose.

This still means ceil might return something lower than the mathematically correct value, but that's just how floating-point works. The result will be the closest value, as determined by the program's rounding mode.

P.S. we can't use std::nexttoward(to, HUGE_VAL) for ceil (and nexttoward -inf for floor) because those only work for float, double and long double, not for extended floating-point types, and not for user-defined types for which treat_as_floating_point_v<rep> is true. std::nextafter works for extended floating-point types, but still not for user-defined types.

HowardHinnant commented 1 week ago

I agree with Jonathan's comments and will only add: Whatever we do, we should do the same for all three of floor, ceil and round.