python / cpython

The Python programming language

https://www.python.org/

Other

61.4k stars 29.57k forks source link

Possible undefined behavior division by zero in complex's `_Py_c_pow()` #113841

Open gpshead opened 6 months ago

gpshead commented 6 months ago

Bug report

Bug description:

https://github.com/python/cpython/blob/main/Objects/complexobject.c#L150 _Py_c_pow() contains:

        at = atan2(a.imag, a.real);
        phase = at*b.real;
        if (b.imag != 0.0) {
            len /= exp(at*b.imag);

An oss-fuzz pycompiler fuzzer identified a problem compiling the code "9J**33J**3" within the optimizer folding the constant by doing the math in place. We haven't been able to reproduce this ourselves - but code inspection reveals a potential problem:

C exp(x) can return 0 in a couple of cases. Which would lead to the /= executing an undefined division by zero operation.

I believe the correct _Py_c_pow() code should look something like:

        if (b.imag != 0.0) {
            double tmp_exp = exp(at*b.imag);
            if (exp == 0.0 || errno == ERANGE) {
                // Detected as OverflowError by our caller.
                r.real = Py_HUGE_VAL;
                r.imag = Py_HUGE_VAL;
                return r;
            }
            len /= tmp_exp;

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

serhiy-storchaka commented 6 months ago

Why not use multiplication instead of division?

gpshead commented 6 months ago

The complex object code has been this way since it was added 28 years ago. https://github.com/python/cpython/commit/f9fca9252f64a93b4598615c504011fa385333ae 😅 and may have come from @khinsen.

For floating point things, it is often worth figuring out how error accumulates along the way in the computation to minimize loss. I haven't given the algorithm here any thought or looked up alternatives. My proposal is a mere edit that'd keep it the same with an added check in the middle.

gpshead commented 6 months ago

Also of note since complexobject.c was written... C99 added a complex double type and among the functions available for use with that is cpow()...

gpshead commented 6 months ago

which may be less trivial to use because visual studio support appears to be partial? https://developercommunity.visualstudio.com/t/support-for-c99-complex-numbers/1409049 -> https://learn.microsoft.com/en-us/cpp/c-runtime-library/complex-math-support?view=msvc-170

khinsen commented 6 months ago

Yes, that was me who contributed the complex object to CPython. All the algorithms were taken from C code snippets floating around in my environment - version control hadn't arrived in computational science yet! Since then, not only the C language has evolved, but also the optimization habits of C compilers. I wouldn't waste much effort figuring out why pow was implemented this way. It's more productive to consider how best to implement pow in today's C ecosystem.

serhiy-storchaka commented 6 months ago

I think that the result should be NaN if exp == 0.0 && len == 0.0.

Also, should not errno == ERANGE give zero or NaN, depending on len, as result?

serhiy-storchaka commented 6 months ago

cc @mdickinson