COMPAT/BLD: rolling failed on Arm64 and ppc64le Linux

fangchenli commented 3 years ago

The test pandas/tests/window/test_rolling.py::test_rolling_var_numerical_issues has failed on arm64 build.

_______________ test_rolling_var_numerical_issues[var-1-values0] _______________
[gw2] linux -- Python 3.7.9 /home/travis/miniconda3/envs/pandas-dev/bin/python
func = 'var', third_value = 1, values = [5e+33, 0, 0.5, 0.5, 2, 0]
    @pytest.mark.parametrize(
        ("func", "third_value", "values"),
        [
            ("var", 1, [5e33, 0, 0.5, 0.5, 2, 0]),
            ("std", 1, [7.071068e16, 0, 0.7071068, 0.7071068, 1.414214, 0]),
            ("var", 2, [5e33, 0.5, 0, 0.5, 2, 0]),
            ("std", 2, [7.071068e16, 0.7071068, 0, 0.7071068, 1.414214, 0]),
        ],
    )
    def test_rolling_var_numerical_issues(func, third_value, values):
        # GH: 37051
        ds = Series([99999999999999999, 1, third_value, 2, 3, 1, 1])
        result = getattr(ds.rolling(2), func)()
        expected = Series([np.nan] + values)
>       tm.assert_series_equal(result, expected)
pandas/tests/window/test_rolling.py:908: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pandas/_libs/testing.pyx:46: in pandas._libs.testing.assert_almost_equal
    cpdef assert_almost_equal(a, b,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
>   raise_assert_detail(obj, msg, lobj, robj, index_values=index_values)
E   AssertionError: Series are different
E   
E   Series values are different (42.85714 %)
E   [index]: [0, 1, 2, 3, 4, 5, 6]
E   [left]:  [nan, 5e+33, 0.0, 0.0, 0.0, 0.0, 0.0]
E   [right]: [nan, 5e+33, 0.0, 0.5, 0.5, 2.0, 0.0]
pandas/_libs/testing.pyx:161: AssertionError
_______________ test_rolling_var_numerical_issues[std-1-values1] _______________
[gw2] linux -- Python 3.7.9 /home/travis/miniconda3/envs/pandas-dev/bin/python
func = 'std', third_value = 1
values = [7.071068e+16, 0, 0.7071068, 0.7071068, 1.414214, 0]
    @pytest.mark.parametrize(
        ("func", "third_value", "values"),
        [
            ("var", 1, [5e33, 0, 0.5, 0.5, 2, 0]),
            ("std", 1, [7.071068e16, 0, 0.7071068, 0.7071068, 1.414214, 0]),
            ("var", 2, [5e33, 0.5, 0, 0.5, 2, 0]),
            ("std", 2, [7.071068e16, 0.7071068, 0, 0.7071068, 1.414214, 0]),
        ],
    )
    def test_rolling_var_numerical_issues(func, third_value, values):
        # GH: 37051
        ds = Series([99999999999999999, 1, third_value, 2, 3, 1, 1])
        result = getattr(ds.rolling(2), func)()
        expected = Series([np.nan] + values)
>       tm.assert_series_equal(result, expected)
pandas/tests/window/test_rolling.py:908: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pandas/_libs/testing.pyx:46: in pandas._libs.testing.assert_almost_equal
    cpdef assert_almost_equal(a, b,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
>   raise_assert_detail(obj, msg, lobj, robj, index_values=index_values)
E   AssertionError: Series are different
E   
E   Series values are different (42.85714 %)
E   [index]: [0, 1, 2, 3, 4, 5, 6]
E   [left]:  [nan, 7.071067811865475e+16, 0.0, 0.0, 0.0, 0.0, 0.0]
E   [right]: [nan, 7.071068e+16, 0.0, 0.7071068, 0.7071068, 1.414214, 0.0]
pandas/_libs/testing.pyx:161: AssertionError
_______________ test_rolling_var_numerical_issues[var-2-values2] _______________
[gw2] linux -- Python 3.7.9 /home/travis/miniconda3/envs/pandas-dev/bin/python
func = 'var', third_value = 2, values = [5e+33, 0.5, 0, 0.5, 2, 0]
    @pytest.mark.parametrize(
        ("func", "third_value", "values"),
        [
            ("var", 1, [5e33, 0, 0.5, 0.5, 2, 0]),
            ("std", 1, [7.071068e16, 0, 0.7071068, 0.7071068, 1.414214, 0]),
            ("var", 2, [5e33, 0.5, 0, 0.5, 2, 0]),
            ("std", 2, [7.071068e16, 0.7071068, 0, 0.7071068, 1.414214, 0]),
        ],
    )
    def test_rolling_var_numerical_issues(func, third_value, values):
        # GH: 37051
        ds = Series([99999999999999999, 1, third_value, 2, 3, 1, 1])
        result = getattr(ds.rolling(2), func)()
        expected = Series([np.nan] + values)
>       tm.assert_series_equal(result, expected)
pandas/tests/window/test_rolling.py:908: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pandas/_libs/testing.pyx:46: in pandas._libs.testing.assert_almost_equal
    cpdef assert_almost_equal(a, b,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
>   raise_assert_detail(obj, msg, lobj, robj, index_values=index_values)
E   AssertionError: Series are different
E   
E   Series values are different (42.85714 %)
E   [index]: [0, 1, 2, 3, 4, 5, 6]
E   [left]:  [nan, 5e+33, 0.0, 0.0, 0.0, 0.0, 0.0]
E   [right]: [nan, 5e+33, 0.5, 0.0, 0.5, 2.0, 0.0]
pandas/_libs/testing.pyx:161: AssertionError
_______________ test_rolling_var_numerical_issues[std-2-values3] _______________
[gw2] linux -- Python 3.7.9 /home/travis/miniconda3/envs/pandas-dev/bin/python
func = 'std', third_value = 2
values = [7.071068e+16, 0.7071068, 0, 0.7071068, 1.414214, 0]
    @pytest.mark.parametrize(
        ("func", "third_value", "values"),
        [
            ("var", 1, [5e33, 0, 0.5, 0.5, 2, 0]),
            ("std", 1, [7.071068e16, 0, 0.7071068, 0.7071068, 1.414214, 0]),
            ("var", 2, [5e33, 0.5, 0, 0.5, 2, 0]),
            ("std", 2, [7.071068e16, 0.7071068, 0, 0.7071068, 1.414214, 0]),
        ],
    )
    def test_rolling_var_numerical_issues(func, third_value, values):
        # GH: 37051
        ds = Series([99999999999999999, 1, third_value, 2, 3, 1, 1])
        result = getattr(ds.rolling(2), func)()
        expected = Series([np.nan] + values)
>       tm.assert_series_equal(result, expected)
pandas/tests/window/test_rolling.py:908: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pandas/_libs/testing.pyx:46: in pandas._libs.testing.assert_almost_equal
    cpdef assert_almost_equal(a, b,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
>   raise_assert_detail(obj, msg, lobj, robj, index_values=index_values)
E   AssertionError: Series are different
E   
E   Series values are different (42.85714 %)
E   [index]: [0, 1, 2, 3, 4, 5, 6]
E   [left]:  [nan, 7.071067811865475e+16, 0.0, 0.0, 0.0, 0.0, 0.0]
E   [right]: [nan, 7.071068e+16, 0.7071068, 0.0, 0.7071068, 1.414214, 0.0]
pandas/_libs/testing.pyx:161: AssertionError

mgorny commented 3 years ago

@fangchenli, I can reproduce the exact same problem on amd64. It seems to be dependent on CXXFLAGS -- it happens with -march=znver2 but not with whatever default gcc has.

fangchenli commented 3 years ago

@mgorny could you give more info about your OS and CPU? I couldn't reproduce this on Intel Mac with the -march=znver2 flag. I can only reproduce it on Raspberry Pi (Ubuntu 64bit), but I don't see this on mbp M1. So it might not be an ARM issue.

mgorny commented 3 years ago

It's Gentoo Linux.

$ lscpu
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   43 bits physical, 48 bits virtual
CPU(s):                          12
On-line CPU(s) list:             0-11
Thread(s) per core:              2
Core(s) per socket:              6
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       AuthenticAMD
CPU family:                      23
Model:                           113
Model name:                      AMD Ryzen 5 3600 6-Core Processor
Stepping:                        0
Frequency boost:                 enabled
CPU MHz:                         2200.000
CPU max MHz:                     4714,4531
CPU min MHz:                     2200,0000
BogoMIPS:                        7189.53
Virtualization:                  AMD-V
L1d cache:                       192 KiB
L1i cache:                       192 KiB
L2 cache:                        3 MiB
L3 cache:                        32 MiB
NUMA node0 CPU(s):               0-11
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Full AMD retpoline, IBPB conditional, STIBP conditional, RSB filling
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx m
                                 mxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqd
                                 q monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_
                                 legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_
                                 llc mwaitx cpb cat_l3 cdp_l3 hw_pstate sme ssbd mba sev ibpb stibp vmmcall sev_es fsgsbase bmi1 avx2 smep bmi2 cq
                                 m rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total
                                  cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flus
                                 hbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip rdpid overflow_recov succor smca

mgorny commented 3 years ago

~~I'm sorry, I wasn't correct. -march= is insufficient to reproduce it. I'll have the right flag in a few minutes.~~

mgorny commented 3 years ago

Scratch that. I didn't notice the file's in C++ and I didn't update CXXFLAGS. Apparently you need to have -march=znver2 and -O2 or -Os in CXXFLAGS. Did you include both on your amd64 system?

In any case, I've been able to pin it down to -mfma, i.e. can reproduce it with -march=x86-64 -O2 -mfma.

My results:

FAIL -march=znver2 -O2 -pipe
FAIL -march=znver2 -O2
PASS -march=znver2 -O1
PASS -march=znver2 -Og
# -O0 fails to build with irrelevant linking error
FAIL -march=znver2 -Os
FAIL -march=znver1 -O2
PASS -march=btver2 -O2
# -march=bdver* SIGILLs on me
FAIL -march=btver2 -O2 -mbmi -mbmi2 -mfma -mfsgsbase -mavx2 -mrdseed -mmwaitx -msha -mclzero -mmovbe -mclflushopt -mpopcnt
FAIL -march=btver2 -O2 -mbmi -mbmi2 -mfma -mfsgsbase -mavx2 -mrdseed -mmwaitx -msha -mclzero -mmovbe -mclflushopt -mpopcnt
FAIL -march=btver2 -O2 -mbmi -mbmi2 -mfma -mfsgsbase -mavx2 -mrdseed -mmwaitx -msha -mclzero -mmovbe -mclflushopt
FAIL -march=btver2 -O2 -mbmi -mbmi2 -mfma -mfsgsbase -mavx2 -mrdseed -mmwaitx -msha -mclzero -mmovbe -mpopcnt
FAIL -march=btver2 -O2 -mbmi -mbmi2 -mfma -mfsgsbase -mavx2 -mrdseed -mmwaitx -msha -mclzero -mclflushopt -mpopcnt
FAIL -march=btver2 -O2 -mbmi -mbmi2 -mfma -mfsgsbase -mavx2 -mrdseed -mmwaitx -msha -mmovbe -mclflushopt -mpopcnt
FAIL -march=btver2 -O2 -mbmi -mbmi2 -mfma -mfsgsbase -mavx2 -mrdseed -mmwaitx -mclzero -mmovbe -mclflushopt -mpopcnt
FAIL -march=btver2 -O2 -mbmi -mbmi2 -mfma -mfsgsbase -mavx2 -mrdseed -msha -mclzero -mmovbe -mclflushopt -mpopcnt
FAIL -march=btver2 -O2 -mbmi -mbmi2 -mfma -mfsgsbase -mavx2 -mmwaitx -msha -mclzero -mmovbe -mclflushopt -mpopcnt
FAIL -march=btver2 -O2 -mbmi -mbmi2 -mfma -mfsgsbase -mrdseed -mmwaitx -msha -mclzero -mmovbe -mclflushopt -mpopcnt
FAIL -march=btver2 -O2 -mbmi -mbmi2 -mfma -mavx2 -mrdseed -mmwaitx -msha -mclzero -mmovbe -mclflushopt -mpopcnt
PASS -march=btver2 -O2 -mbmi -mbmi2 -mfsgsbase -mavx2 -mrdseed -mmwaitx -msha -mclzero -mmovbe -mclflushopt -mpopcnt
PASS -march=znver2 -O2 -mno-fma
FAIL -march=x86-64 -O2 -mfma

mgorny commented 3 years ago

Not sure how much help is that but I was able to make pandas/tests/window/test_rolling.py::test_rolling_var_numerical_issues[var-1-values0] by adding __attribute__((target("no-fma"))) to the following three Cython-generated functions:

__pyx_f_6pandas_5_libs_6window_12aggregations_remove_var
__pyx_pw_6pandas_5_libs_6window_12aggregations_5roll_var
__pyx_pf_6pandas_5_libs_6window_12aggregations_4roll_var

So I guess the remove_var and roll_var functions are affected.

fangchenli commented 3 years ago

@mgorny Thank you so much for this input.

@pandas-dev/pandas-core Any thought on this?

bashtage commented 3 years ago

Does it always fail? Wondering if these expect a particular alignment.

mgorny commented 3 years ago

Yes, it fails reliably. I don't think it's alignment-related, as it produces the wrong result rather than crashing. I'm wondering if it could be a GCC optimization bug.

bashtage commented 3 years ago

It definitely sounds like a compiler bug. It is unfortunately really hard to do things like add pragmas or ifdefs in Cython from what I can tell. It might be possible to tell compiler to disable optimizations in these functions using C99 Pragma_ but these are heavy handed. Probably simpler to just pass specific compiler flags for this file on ARM so that it is possible less aggressively optimized (e.g. ,-O1).

bashtage commented 3 years ago

It definitely sounds like a compiler bug. It is unfortunately really hard to do things like add pragmas or ifdefs in Cython from what I can tell. It might be possible to tell compiler to disable optimizations in these functions using C99 Pragma_ but these are heavy handed. Probably easiest to just pass specific compiler flags for this file on ARM so that it is possible less aggressively optimized (e.g. ,O1).

bashtage commented 3 years ago

Anyone want to build with clang with the same flags rather than gcc to see if it reproduces?

mgorny commented 3 years ago

Will do in a minute. I can also try gcc-9, in case version matters (I'm running gcc 10.2.0).

mgorny commented 3 years ago

gcc-9.3.0 suffers the same problem, clang++ too. However, note that on clang I had to pass -march=znver2, -march=x86-64 -mfma did not reproduce it.

Note that I'm only testing on AMD64. I don't know if AArch64 implements FMA equivalent (as part of NEON, maybe?). But it smells a bit suspicious that two compilers would have the same bug on two different architectures.

mgorny commented 3 years ago

Heh, I've just looked at bug #37051 and now I feel stupid. It's not a bug but actually a 'bugfix'. FWICS the test is supposed to test for artifacts due to precision loss, so it obviously fails when the calculations are done with a better precision which is probably what's happening here.

Independently of compiler flags for pandas, I can reproduce it with:

diff --git a/pandas/_libs/window/aggregations.pyx b/pandas/_libs/window/aggregations.pyx
index efacfad..184e666 100644
--- a/pandas/_libs/window/aggregations.pyx
+++ b/pandas/_libs/window/aggregations.pyx
@@ -2,7 +2,7 @@

 import cython

-from libc.math cimport round
+from libc.math cimport round, fma
 from libcpp.deque cimport deque

 import numpy as np
@@ -333,7 +333,8 @@ cdef inline void remove_var(float64_t val, float64_t *nobs, float64_t *mean_x,
             compensation[0] = t + mean_x[0] - y
             delta = t
             mean_x[0] = mean_x[0] - delta / nobs[0]
-            ssqdm_x[0] = ssqdm_x[0] - (val - prev_mean) * (val - mean_x[0])
+            # ssqdm_x[0] = ssqdm_x[0] - (val - prev_mean) * (val - mean_x[0])
+            ssqdm_x[0] = fma(prev_mean - val, val - mean_x[0], ssqdm_x[0])
         else:
             mean_x[0] = 0
             ssqdm_x[0] = 0

i.e. this is clearly due to FMA application here, and the compiler is applying it correctly.

fangchenli commented 3 years ago

So I guess it's better to use FMA explicitly in the code rather than relying on optimization.

This test was added in #37055. @phofl

jakirkham commented 1 year ago

We also have seen issues with this test on Linux ppc64le. Would it make sense to update the title here to reflect both architectures have issues?

xref: https://github.com/conda-forge/pandas-feedstock/issues/149 xref: https://github.com/pandas-dev/pandas/pull/50349

jakirkham commented 1 year ago

FWIW just reproduced this on macOS M1. So think it may be a general ARM issue.

Jython1415 commented 1 year ago

Tests failed on an M2 MacBook Air on latest commit to main: 5cf5c73

System Info

Model Name: MacBook Air Model Identifier: Mac14,2 Chip: Apple M2 Total Number of Cores: 8 (4 performance and 4 efficiency) Memory: 16 GB System Firmware Version: 8422.121.1 OS Loader Version: 8422.121.1 System Version: macOS 13.4 (22F66) Kernel Version: Darwin 22.5.0 Secure Virtual Memory: Enabled System Integrity Protection: Enabled

Pytest Output

```text _______________________________________________________ test_rolling_var_numerical_issues[var-1-values0] _______________________________________________________ func = 'var', third_value = 1, values = [5e+33, 0, 0.5, 0.5, 2, 0] @pytest.mark.xfail( (is_platform_arm() and not is_platform_mac()) or is_platform_power(), reason="GH 38921", ) @pytest.mark.parametrize( ("func", "third_value", "values"), [ ("var", 1, [5e33, 0, 0.5, 0.5, 2, 0]), ("std", 1, [7.071068e16, 0, 0.7071068, 0.7071068, 1.414214, 0]), ("var", 2, [5e33, 0.5, 0, 0.5, 2, 0]), ("std", 2, [7.071068e16, 0.7071068, 0, 0.7071068, 1.414214, 0]), ], ) def test_rolling_var_numerical_issues(func, third_value, values): # GH: 37051 ds = Series([99999999999999999, 1, third_value, 2, 3, 1, 1]) result = getattr(ds.rolling(2), func)() expected = Series([np.nan] + values) > tm.assert_series_equal(result, expected) pandas/tests/window/test_rolling.py:1209: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ testing.pyx:55: in pandas._libs.testing.assert_almost_equal ??? _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > ??? E AssertionError: Series are different E E Series values are different (42.85714 %) E [index]: [0, 1, 2, 3, 4, 5, 6] E [left]: [nan, 5e+33, 0.0, -2.7212384506478592e+17, -2.7212384506478592e+17, -2.7212384506478592e+17, 0.0] E [right]: [nan, 5e+33, 0.0, 0.5, 0.5, 2.0, 0.0] E At positional index 3, first diff: -2.7212384506478592e+17 != 0.5 testing.pyx:173: AssertionError _______________________________________________________ test_rolling_var_numerical_issues[std-1-values1] _______________________________________________________ func = 'std', third_value = 1, values = [7.071068e+16, 0, 0.7071068, 0.7071068, 1.414214, 0] @pytest.mark.xfail( (is_platform_arm() and not is_platform_mac()) or is_platform_power(), reason="GH 38921", ) @pytest.mark.parametrize( ("func", "third_value", "values"), [ ("var", 1, [5e33, 0, 0.5, 0.5, 2, 0]), ("std", 1, [7.071068e16, 0, 0.7071068, 0.7071068, 1.414214, 0]), ("var", 2, [5e33, 0.5, 0, 0.5, 2, 0]), ("std", 2, [7.071068e16, 0.7071068, 0, 0.7071068, 1.414214, 0]), ], ) def test_rolling_var_numerical_issues(func, third_value, values): # GH: 37051 ds = Series([99999999999999999, 1, third_value, 2, 3, 1, 1]) result = getattr(ds.rolling(2), func)() expected = Series([np.nan] + values) > tm.assert_series_equal(result, expected) pandas/tests/window/test_rolling.py:1209: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ testing.pyx:55: in pandas._libs.testing.assert_almost_equal ??? _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > ??? E AssertionError: Series are different E E Series values are different (42.85714 %) E [index]: [0, 1, 2, 3, 4, 5, 6] E [left]: [nan, 7.071067811865475e+16, 0.0, 0.0, 0.0, 0.0, 0.0] E [right]: [nan, 7.071068e+16, 0.0, 0.7071068, 0.7071068, 1.414214, 0.0] E At positional index 3, first diff: 0.0 != 0.7071068 testing.pyx:173: AssertionError _______________________________________________________ test_rolling_var_numerical_issues[var-2-values2] _______________________________________________________ func = 'var', third_value = 2, values = [5e+33, 0.5, 0, 0.5, 2, 0] @pytest.mark.xfail( (is_platform_arm() and not is_platform_mac()) or is_platform_power(), reason="GH 38921", ) @pytest.mark.parametrize( ("func", "third_value", "values"), [ ("var", 1, [5e33, 0, 0.5, 0.5, 2, 0]), ("std", 1, [7.071068e16, 0, 0.7071068, 0.7071068, 1.414214, 0]), ("var", 2, [5e33, 0.5, 0, 0.5, 2, 0]), ("std", 2, [7.071068e16, 0.7071068, 0, 0.7071068, 1.414214, 0]), ], ) def test_rolling_var_numerical_issues(func, third_value, values): # GH: 37051 ds = Series([99999999999999999, 1, third_value, 2, 3, 1, 1]) result = getattr(ds.rolling(2), func)() expected = Series([np.nan] + values) > tm.assert_series_equal(result, expected) pandas/tests/window/test_rolling.py:1209: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ testing.pyx:55: in pandas._libs.testing.assert_almost_equal ??? _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > ??? E AssertionError: Series are different E E Series values are different (42.85714 %) E [index]: [0, 1, 2, 3, 4, 5, 6] E [left]: [nan, 5e+33, -2.7212384506478592e+17, 0.0, -2.7212384506478592e+17, -2.7212384506478592e+17, 0.0] E [right]: [nan, 5e+33, 0.5, 0.0, 0.5, 2.0, 0.0] E At positional index 2, first diff: -2.7212384506478592e+17 != 0.5 testing.pyx:173: AssertionError _______________________________________________________ test_rolling_var_numerical_issues[std-2-values3] _______________________________________________________ func = 'std', third_value = 2, values = [7.071068e+16, 0.7071068, 0, 0.7071068, 1.414214, 0] @pytest.mark.xfail( (is_platform_arm() and not is_platform_mac()) or is_platform_power(), reason="GH 38921", ) @pytest.mark.parametrize( ("func", "third_value", "values"), [ ("var", 1, [5e33, 0, 0.5, 0.5, 2, 0]), ("std", 1, [7.071068e16, 0, 0.7071068, 0.7071068, 1.414214, 0]), ("var", 2, [5e33, 0.5, 0, 0.5, 2, 0]), ("std", 2, [7.071068e16, 0.7071068, 0, 0.7071068, 1.414214, 0]), ], ) def test_rolling_var_numerical_issues(func, third_value, values): # GH: 37051 ds = Series([99999999999999999, 1, third_value, 2, 3, 1, 1]) result = getattr(ds.rolling(2), func)() expected = Series([np.nan] + values) > tm.assert_series_equal(result, expected) pandas/tests/window/test_rolling.py:1209: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ testing.pyx:55: in pandas._libs.testing.assert_almost_equal ??? _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > ??? E AssertionError: Series are different E E Series values are different (42.85714 %) E [index]: [0, 1, 2, 3, 4, 5, 6] E [left]: [nan, 7.071067811865475e+16, 0.0, 0.0, 0.0, 0.0, 0.0] E [right]: [nan, 7.071068e+16, 0.7071068, 0.0, 0.7071068, 1.414214, 0.0] E At positional index 2, first diff: 0.0 != 0.7071068 testing.pyx:173: AssertionError ```

The test has the follower mark:

@pytest.mark.xfail(
    (is_platform_arm() and not is_platform_mac()) or is_platform_power(),
    reason="GH 38921",
)
@pytest.mark.parametrize(
    ("func", "third_value", "values"),
    [
        ("var", 1, [5e33, 0, 0.5, 0.5, 2, 0]),
        ("std", 1, [7.071068e16, 0, 0.7071068, 0.7071068, 1.414214, 0]),
        ("var", 2, [5e33, 0.5, 0, 0.5, 2, 0]),
        ("std", 2, [7.071068e16, 0.7071068, 0, 0.7071068, 1.414214, 0]),
    ],
)
def test_rolling_var_numerical_issues(func, third_value, values):

The not is_platform_mac() was added in #41982. If this error can be reproduced on another M2, perhaps this should be marked xfail for Macs as well?

topper-123 commented 1 year ago

I get everything green while running pytest pandas/tests/window/test_rolling.py -n 7 on a M1 Macbook Air.

Jython1415 commented 1 year ago

Thank you for checking this on your system as well. I just ran the same command, pytest pandas/tests/window/test_rolling.py -n 7 (and with -n auto as well), and got the same error. I'm not exactly sure what these tests are for... do you think you could help me understand what the issue is, and if it might be on my end?

Test result for test_rolling_var_numerical_issues[var-1-values0]

``` _______________________________________________________ test_rolling_var_numerical_issues[var-1-values0] _______________________________________________________ [gw4] darwin -- Python 3.10.11 /Users/Joshua/mambaforge/envs/pandas-dev/bin/python3.10 func = 'var', third_value = 1, values = [5e+33, 0, 0.5, 0.5, 2, 0] @pytest.mark.xfail( (is_platform_arm() and not is_platform_mac()) or is_platform_power(), reason="GH 38921", ) @pytest.mark.parametrize( ("func", "third_value", "values"), [ ("var", 1, [5e33, 0, 0.5, 0.5, 2, 0]), ("std", 1, [7.071068e16, 0, 0.7071068, 0.7071068, 1.414214, 0]), ("var", 2, [5e33, 0.5, 0, 0.5, 2, 0]), ("std", 2, [7.071068e16, 0.7071068, 0, 0.7071068, 1.414214, 0]), ], ) def test_rolling_var_numerical_issues(func, third_value, values): # GH: 37051 ds = Series([99999999999999999, 1, third_value, 2, 3, 1, 1]) result = getattr(ds.rolling(2), func)() expected = Series([np.nan] + values) > tm.assert_series_equal(result, expected) pandas/tests/window/test_rolling.py:1209: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ testing.pyx:55: in pandas._libs.testing.assert_almost_equal ??? _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > ??? E AssertionError: Series are different E E Series values are different (42.85714 %) E [index]: [0, 1, 2, 3, 4, 5, 6] E [left]: [nan, 5e+33, 0.0, -2.7212384506478592e+17, -2.7212384506478592e+17, -2.7212384506478592e+17, 0.0] E [right]: [nan, 5e+33, 0.0, 0.5, 0.5, 2.0, 0.0] E At positional index 3, first diff: -2.7212384506478592e+17 != 0.5 testing.pyx:173: AssertionError ```

Other relevant info that I could find:

This test was introduced in #37055.
The flag including Macs was added in the PR linked in my previous comment to address this issue (see above discussion).

topper-123 commented 1 year ago

It's a bit difficult to say, from your pytest logs it's not obvious that it has anything to do with the mac chips.

You could try if you have some package that needs updating, or you need to recompile the cython modules. Sorry for not being very helpful.

Jython1415 commented 1 year ago

No worries! Thank you for taking a look at it anyway.

I rm -r'ed the build directory and ran python setup.py build_ext -j 4 and python -m pip install -e . --no-build-isolation --no-use-pep517 to build and install pandas according to the contributor guide. The --no-use-pep517 have me the following error:

ERROR: Disabling PEP 517 processing is invalid: project specifies a build backend of mesonpy in pyproject.toml

I ran it without the --no-use-pep517 option. I'm not sure if this has anything to do with the failing tests. I am now on 2.1.0.dev0+958.g5780d4f8f1 and I am still failing all 4 tests for test_rolling_var_numerical_issues. Any guidance would be appreciated.

phofl commented 1 year ago

This is related to Mac chips (I am seeing the same failures). I agree that we should skip those tests for Mac (I didn't care enough in the past to put up a pr myself)

The test itself is correct, the Mac compilers remove one optimisation step when compiling which introduces numerical instabilities

WillAyd commented 1 year ago

Is this really a bug or just bad input data? Isn't the input data going beyond any reasonable precision limits for floats?

phofl commented 1 year ago

The idea is that the previous values don't impact windows that come afterwards and don't have these values anymore. So yes this makes sense

fangchenli commented 1 year ago

I get everything green while running pytest pandas/tests/window/test_rolling.py -n 7 on a M1 Macbook Air.

I got this test fail on M1 Macbook Pro.

topper-123 commented 1 year ago

I do actually get the bugs now.

I think the difference is that in my last comment I built using python setup.py build_ext --inplace -j4 while now I've updated the build system to use meson and build using python -m pip install -ve . --no-build-isolation --config-settings editable-verbose=true.

phofl commented 1 year ago

Can you double check this? I got them with setuptools as well

topper-123 commented 1 year ago

It wasn't just the build method, but also the environment. I changed the environment when switching to meson. This is the old environment below.

INSTALLED VERSIONS ------------------ commit : 43f1bc8fb6d7d57525c1d251947ab4e83dbc2e54 python : 3.9.16.final.0 python-bits : 64 OS : Darwin OS-release : 22.3.0 Version : Darwin Kernel Version 22.3.0: Mon Jan 30 20:39:35 PST 2023; root:xnu-8792.81.3~2/RELEASE_ARM64_T8103 machine : arm64 processor : arm byteorder : little LC_ALL : None LANG : None LOCALE : None.UTF-8 pandas : 2.1.0.dev0+874.g43f1bc8fb6 numpy : 1.23.4 pytz : 2023.3 dateutil : 2.8.2 setuptools : 65.6.3 pip : 22.3.1 Cython : 0.29.33 pytest : 7.1.2 hypothesis : 6.37.0 sphinx : 5.0.2 blosc : 1.10.6 feather : None xlsxwriter : 3.0.3 lxml.etree : 4.9.2 html5lib : 1.1 pymysql : 1.0.2 psycopg2 : 2.9.3 jinja2 : 3.1.2 IPython : 8.9.0 pandas_datareader: None bs4 : 4.11.1 bottleneck : 1.3.5 brotli : fastparquet : 0.8.3 fsspec : 2022.11.0 gcsfs : 2022.11.0 matplotlib : 3.7.1 numba : 0.56.4 numexpr : 2.8.4 odfpy : None openpyxl : 3.0.10 pandas_gbq : None pyarrow : 12.0.0 pyreadstat : None pyxlsb : 1.0.10 s3fs : 2022.11.0 scipy : 1.9.1 snappy : None sqlalchemy : 1.4.43 tables : 3.7.0 tabulate : 0.9.0 xarray : 2023.4.2.dev3+gc75ac8b7 xlrd : 2.0.1 zstandard : 0.18.0 tzdata : 2023.3 qtpy : 2.2.0 pyqt5 : None I haven't checked what's making the difference.

Jython1415 commented 1 year ago

So currently we have:

M1 with setuptools: ??
M1 with meson fails
M2 with meson fails (just confirmed on my device)

I'd like to try the setuptools build on M2 to see what happens there. @topper-123, how did you reset your environment and build folder for that? I don't want to miss a step...

topper-123 commented 1 year ago

how did you reset your environment and build folder for that?

The old environment still exists, it's just that I couldn't get it to build with meson, so I created a completely new one. In the new environment I get the failures both with meson and (after doing git checkout 43f1bc8fb6d7d) with python setup.py. In the old environment I didn't get the failures after building with python setup.py and I'm not able to build/compile pandas with meson, so not able to run the tests with a meson build.

If you're interested I could provide some logs from the old environment from the meson build failure. I created the new environment because I gave up trying to figure out why it was failing, so I don't think I can explain what was happening, but you're welcome to take a look.

WillAyd commented 1 year ago

If meson and setuptools yield different results then they must not be compiling the same way. Can you check the flags sent to gcc in both methods and see what may be different? Also are there any build warnings?

topper-123 commented 1 year ago

I am not proficient in gcc or gcc flags, or what to look for in the gcc logs, sorry. If you can give some hints what to look for I could give a shot, but not sure I will make sense of it.

WillAyd commented 1 year ago

No problem. I'm pretty sure that this test hits the aggregations.pyx file in pandas/_libs/window/aggregations.pyx . So you are looking for a gcc command that generates aggregations.o from the Cython-generated aggregations.cpp file

setuptools prints the commands it uses for compilation directly to stdout. So when you do the traditional setup.py build_ext look for something like this in your terminal:

/home/willayd/mambaforge/envs/pandas-dev/bin/x86_64-conda-linux-gnu-cc -Wno-unused-result -Wsign-compare 
-DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/willayd/mambaforge/envs/pandas-dev/include -fPIC -O2 
-isystem /home/willayd/mambaforge/envs/pandas-dev/include -march=nocona -mtune=haswell -ftree-vectorize -fPIC 
-fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe 
-isystem /home/willayd/mambaforge/envs/pandas-dev/include -DNDEBUG -D_FORTIFY_SOURCE=2 
-O2 -isystem /home/willayd/mambaforge/envs/pandas-dev/include -fPIC -DNPY_NO_DEPRECATED_API=0 
-Ipandas/_libs/include -I/home/willayd/mambaforge/envs/pandas-dev/lib/python3.10/site-packages/numpy/core/include 
-I/home/willayd/mambaforge/envs/pandas-dev/include/python3.10 -c pandas/_libs/window/aggregations.cpp 
-o build/temp.linux-x86_64-cpython-310/pandas/_libs/window/aggregations.o

In the above command you can see I am using the x86_64-conda-linux-gnu-cc compiler and passing it compilation options like -O2, -Wall, -fPIC, fwrapv, etc... If you cared to see what these represented the gcc documentation is pretty good (and I think also describes clang behavior most of the time):

https://gcc.gnu.org/onlinedocs/gcc/Option-Summary.html

As far as meson is concerned, it doesn't print out the gcc command by default, but you can specify the compile-args as verbose with --config-settings=compile-args="--verbose" @lithomas1 might even know a better way

As a complete example, you can try python -m pip install -ve . --no-build-isolation --config-settings editable-verbose=true --config-settings=compile-args="--verbose" and again check stdout for the compilation command issued for aggregations.pyx. Here is what I get locally for comparison:

/home/willayd/mambaforge/envs/pandas-dev/bin/x86_64-conda-linux-gnu-c++ -shared -Wl,--allow-shlib-undefined 
-Wl,-rpath,/home/willayd/mambaforge/envs/pandas-dev/lib -Wl,-rpath-link,/home/willayd/mambaforge/envs/pandas-dev/lib 
-L/home/willayd/mambaforge/envs/pandas-dev/lib -Wl,--allow-shlib-undefined
 -Wl,-rpath,/home/willayd/mambaforge/envs/pandas-dev/lib -Wl,-rpath-link,/home/willayd/mambaforge/envs/pandas-dev/lib 
-L/home/willayd/mambaforge/envs/pandas-dev/lib -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,
-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,--allow-shlib-undefined 
-Wl,-rpath,/home/willayd/mambaforge/envs/pandas-dev/lib 
-Wl,-rpath-link,/home/willayd/mambaforge/envs/pandas-dev/lib 
-L/home/willayd/mambaforge/envs/pandas-dev/lib -march=nocona -mtune=haswell -ftree-vectorize 
-fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe 
-isystem /home/willayd/mambaforge/envs/pandas-dev/include -DNDEBUG 
-D_FORTIFY_SOURCE=2 -O2 
-isystem /home/willayd/mambaforge/envs/pandas-dev/include build/temp.linux-x86_64-cpython-310/pandas/_libs/window/aggregations.o 
-o build/lib.linux-x86_64-cpython-310/pandas/_libs/window/aggregations.cpython-310-x86_64-linux-gnu.so

Arguments that follow the pattern -Wl,<something> are arguments to the linker; I would be very surprised if those make a difference; ignoring those you can see above that the arguments -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe are shared across setuptools / meson on my computer, so I wouldn't expect any differences. Guessing this may not hold true on your platform if the two compilation methods yield different results

Re: build warnings, definitely keep an out eye for them. If you are compiling serially then the warning should appear directly below the compilation command being used, but in parallel compilation you may need to look further down the output.

When in doubt, paste the entire output of the compilations and we can look at them here

WillAyd commented 1 year ago

Actually an even easier way to see what meson did is to look at the compile_commands.json file that it places in your build folder. That is persisted on disk and has the added benefit or removing linker arguments from the output, so its easier to read than what gets written to stdout

phofl commented 1 year ago

I have this failing for months, this is not solely related to meson.

markdryan commented 4 months ago

This test is failing on riscv64 as well. I see the failures on a VisionFive2 running Ubuntu 23.10 and also on qemu-user. I've submitted a patch to add riscv64 to the list of platforms for which the test is disabled.

pandas-dev / pandas

COMPAT/BLD: rolling failed on Arm64 and ppc64le Linux #38921