pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

Windowed `std` does not work correctly (non-deterministic, incorrect values) #17102

Closed erinov1 closed 3 months ago

erinov1 commented 3 months ago

Reproducible example

import polars as pl

c1 = pl.DataFrame({"A": [1, 1], "B": [1.0, 2.0]})
c2 = pl.DataFrame({"A": [2, 2], "B": [1.0, 2.0]})

df = pl.concat([c1, c2], rechunk=False)

df.select(pl.col("B").std().over("A").alias("std"))

Log output

No response

Issue description

The query df.select(pl.col("B").std().over("A").alias("std")) above returns incorrect values that change from run to run. The first group is correct, but the second group contains garbage, for example:

shape: (4, 1)
┌────────────┐
│ std        │
│ ---        │
│ f64        │
╞════════════╡
│ 0.707107   │
│ 0.707107   │
│ 3.6788e-90 │
│ 3.6788e-90 │
└────────────┘

shape: (4, 1)
┌─────────────┐
│ std         │
│ ---         │
│ f64         │
╞═════════════╡
│ 0.707107    │
│ 0.707107    │
│ 4.2522e-154 │
│ 4.2522e-154 │
└─────────────┘

This is only an issue when multiple chunks are present. Rechunking fixes the issue. The query also works correctly if std() is replaced with var().pow(1/2), even for multiple chunks.

Expected behavior

The query should return the correct answer for every group:

shape: (4, 1)
┌──────────┐
│ std      │
│ ---      │
│ f64      │
╞══════════╡
│ 0.707107 │
│ 0.707107 │
│ 0.707107 │
│ 0.707107 │
└──────────┘
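
As a sanity check on the expected value, the per-group sample standard deviation can be computed with the Python standard library alone. This sketch assumes the default `ddof=1` (sample) definition; the `rows` list simply restates the data from the reproducible example.

```python
import statistics

# (A, B) pairs from the reproducible example.
rows = [(1, 1.0), (1, 2.0), (2, 1.0), (2, 2.0)]

# Collect the values of B for each group of A.
groups = {}
for a, b in rows:
    groups.setdefault(a, []).append(b)

# Sample standard deviation (divides by n - 1) for each group.
expected = {a: statistics.stdev(values) for a, values in groups.items()}

# Broadcast each group's result back to its rows, as a window function would.
windowed = [expected[a] for a, _ in rows]
print([round(v, 6) for v in windowed])
```

For each group the mean is 1.5, the squared deviations sum to 0.5, and dividing by n - 1 = 1 gives a variance of 0.5, whose square root is about 0.707107.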

Installed versions

```
--------Version info---------
Polars: 0.20.31
Index type: UInt32
Platform: macOS-14.4.1-arm64-arm-64bit
Python: 3.11.1 (main, Feb 8 2023, 16:32:14) [Clang 14.0.0 (clang-1400.0.29.202)]

----Optional dependencies----
adbc_driver_manager:
cloudpickle:
connectorx:
deltalake:
fastexcel:
fsspec: 2023.12.2
gevent:
hvplot:
matplotlib: 3.7.4
nest_asyncio: 1.6.0
numpy: 1.24.4
openpyxl:
pandas: 1.5.3
pyarrow: 12.0.1
pydantic: 1.10.14
pyiceberg:
pyxlsb:
sqlalchemy: 2.0.25
torch:
xlsx2csv:
xlsxwriter:
```
nameexhaustion commented 3 months ago

On a debug build, this panics with an out-of-bounds (OOB) slice access:

   7: polars_arrow::legacy::kernels::take_agg::var::take_var_no_null_primitive_iter_unchecked::{{closure}}
             at ./crates/polars-arrow/src/legacy/kernels/take_agg/var.rs:58:26
Full backtrace

```rust
thread 'polars-0' panicked at library/core/src/panicking.rs:219:5:
unsafe precondition(s) violated: slice::get_unchecked requires that the index is within the slice
stack backtrace:
   0: rust_begin_unwind
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/std/src/panicking.rs:652:5
   1: core::panicking::panic_nounwind_fmt::runtime
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/core/src/panicking.rs:110:18
   2: core::panicking::panic_nounwind_fmt
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/core/src/panicking.rs:120:5
   3: core::panicking::panic_nounwind
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/core/src/panicking.rs:219:5
   4: >::get_unchecked::precondition_check
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/core/src/ub_checks.rs:68:21
   5: >::get_unchecked
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/core/src/ub_checks.rs:75:17
   6: core::slice::::get_unchecked
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/core/src/slice/mod.rs:686:20
   7: polars_arrow::legacy::kernels::take_agg::var::take_var_no_null_primitive_iter_unchecked::{{closure}}
             at ./crates/polars-arrow/src/legacy/kernels/take_agg/var.rs:58:26
   8: core::ops::function::impls:: for &mut F>::call_once
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/core/src/ops/function.rs:305:13
   9: core::option::Option::map
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/core/src/option.rs:1075:29
  10: as core::iter::traits::iterator::Iterator>::next
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/core/src/iter/adapters/map.rs:108:26
  11: polars_arrow::legacy::kernels::take_agg::var::online_variance
             at ./crates/polars-arrow/src/legacy/kernels/take_agg/var.rs:23:18
  12: polars_arrow::legacy::kernels::take_agg::var::take_var_no_null_primitive_iter_unchecked
             at ./crates/polars-arrow/src/legacy/kernels/take_agg/var.rs:62:5
  13: polars_core::frame::group_by::aggregations::>>::agg_std::{{closure}}
             at ./crates/polars-core/src/frame/group_by/aggregations/mod.rs:849:25
  14: core::ops::function::impls:: for &F>::call_mut
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/core/src/ops/function.rs:272:13
  15: core::iter::adapters::map::map_try_fold::{{closure}}
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/core/src/iter/adapters/map.rs:96:28
  16: core::iter::traits::iterator::Iterator::try_fold
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/core/src/iter/traits/iterator.rs:2411:21
  17: as core::iter::traits::iterator::Iterator>::try_fold
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/core/src/iter/adapters/map.rs:122:9
  18: as core::iter::traits::iterator::Iterator>::try_fold
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/core/src/iter/adapters/take_while.rs:94:13
  19: as core::iter::traits::iterator::Iterator>::fold
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/core/src/iter/mod.rs:378:13
  20: as rayon::iter::plumbing::Folder>::consume_iter
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-1.10.0/src/iter/fold.rs:158:20
  21: as rayon::iter::plumbing::Folder>::consume_iter
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-1.10.0/src/iter/map.rs:248:21
  22: rayon::iter::plumbing::Producer::fold_with
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-1.10.0/src/iter/plumbing/mod.rs:109:9
  23: rayon::iter::plumbing::bridge_producer_consumer::helper
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-1.10.0/src/iter/plumbing/mod.rs:437:13
  24: rayon::iter::plumbing::bridge_producer_consumer::helper::{{closure}}
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-1.10.0/src/iter/plumbing/mod.rs:426:21
  25: rayon_core::join::join_context::call_b::{{closure}}
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/join/mod.rs:129:25
  26: rayon_core::job::JobResult::call::{{closure}}
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/job.rs:218:41
  27: as core::ops::function::FnOnce<()>>::call_once
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/core/src/panic/unwind_safe.rs:272:9
  28: std::panicking::try::do_call
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/std/src/panicking.rs:559:40
  29: ___rust_try
  30: std::panicking::try
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/std/src/panicking.rs:523:19
  31: std::panic::catch_unwind
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/std/src/panic.rs:149:14
  32: rayon_core::unwind::halt_unwinding
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/unwind.rs:17:5
  33: rayon_core::job::JobResult::call
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/job.rs:218:15
  34: as rayon_core::job::Job>::execute
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/job.rs:120:32
  35: rayon_core::job::JobRef::execute
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/job.rs:64:9
  36: rayon_core::registry::WorkerThread::execute
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/registry.rs:860:9
  37: rayon_core::registry::WorkerThread::wait_until_cold
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/registry.rs:794:21
  38: rayon_core::registry::WorkerThread::wait_until
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/registry.rs:769:13
  39: rayon_core::join::join_context::{{closure}}
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/join/mod.rs:167:17
  40: rayon_core::registry::in_worker
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/registry.rs:951:13
  41: rayon_core::join::join_context
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/join/mod.rs:132:5
  42: rayon_core::join::join
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/join/mod.rs:105:5
  43: rayon_core::thread_pool::ThreadPool::join::{{closure}}
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/thread_pool/mod.rs:280:25
  44: rayon_core::thread_pool::ThreadPool::install::{{closure}}
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/thread_pool/mod.rs:147:40
  45: rayon_core::registry::Registry::in_worker_cold::{{closure}}::{{closure}}
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/registry.rs:522:21
  46: rayon_core::job::JobResult::call::{{closure}}
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/job.rs:218:41
  47: as core::ops::function::FnOnce<()>>::call_once
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/core/src/panic/unwind_safe.rs:272:9
  48: std::panicking::try::do_call
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/std/src/panicking.rs:559:40
  49: ___rust_try
  50: std::panicking::try
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/std/src/panicking.rs:523:19
  51: std::panic::catch_unwind
             at /rustc/032af18af578f4283a2927fb43b90df2bbb72b67/library/std/src/panic.rs:149:14
  52: rayon_core::unwind::halt_unwinding
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/unwind.rs:17:5
  53: rayon_core::job::JobResult::call
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/job.rs:218:15
  54: as rayon_core::job::Job>::execute
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/job.rs:120:32
  55: rayon_core::job::JobRef::execute
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/job.rs:64:9
  56: rayon_core::registry::WorkerThread::execute
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/registry.rs:860:9
  57: rayon_core::registry::WorkerThread::wait_until_cold
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/registry.rs:794:21
  58: rayon_core::registry::WorkerThread::wait_until
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/registry.rs:769:13
  59: rayon_core::registry::WorkerThread::wait_until_out_of_work
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/registry.rs:818:9
  60: rayon_core::registry::main_loop
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/registry.rs:923:5
  61: rayon_core::registry::ThreadBuilder::run
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/registry.rs:53:18
  62: ::spawn::{{closure}}
             at /Users/nxs/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-core-1.12.1/src/registry.rs:98:20
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
thread caused non-unwinding panic. aborting.
```
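
The backtrace is consistent with group indices that count rows across the whole column being used to index a single chunk. The pure-Python sketch below models that hypothesis only; it is not the Polars implementation. The function `online_variance` borrows its name from frame 11 and is written here as a Welford-style single-pass variance, while `std_first_chunk_only` and `std_contiguous` are hypothetical helpers introduced for illustration.

```python
def online_variance(values, ddof=1):
    """Single-pass (Welford-style) variance; None if there are too few values."""
    n = 0
    mean = 0.0
    m2 = 0.0
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    if n <= ddof:
        return None
    return m2 / (n - ddof)


# Column "B" stored as two chunks, as after pl.concat(..., rechunk=False).
chunks = [[1.0, 2.0], [1.0, 2.0]]
# Row indices for each group of "A", numbered across the whole column.
groups = {1: [0, 1], 2: [2, 3]}


def std_first_chunk_only(idx):
    # Assumed faulty path: global indices applied to the first chunk alone.
    first = chunks[0]
    return online_variance(first[i] for i in idx) ** 0.5


def std_contiguous(idx):
    # Equivalent of rechunking: index into one contiguous buffer.
    flat = [x for chunk in chunks for x in chunk]
    return online_variance(flat[i] for i in idx) ** 0.5


print(std_contiguous(groups[1]), std_contiguous(groups[2]))
print(std_first_chunk_only(groups[1]))
try:
    std_first_chunk_only(groups[2])
except IndexError as exc:
    print("out of bounds:", exc)
```

Python's bounds check raises where an unchecked access in a release build would read whatever memory lies past the chunk. Under this assumption the first group is computed correctly and only the second group goes wrong, which matches the output in the report.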