When the flu frequencies workflow gets to the fit_single_frequencies.py step, recent versions of polars throw a panic exception with the following error message:
$ python scripts/fit_single_frequencies.py --metadata data/vic/combined_na.tsv --geo-categories region --frequency-category clade --min-date 2021-01-01 --days 14 --inclusive-clades flu --output-csv results/vic_na/region-frequencies.csv
thread '<unnamed>' panicked at crates/polars-core/src/series/iterator.rs:74:9:
assertion `left == right` failed: impl error
  left: 4
 right: 1
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
--- PyO3 is resuming a panic after fetching a PanicException from Python. ---
Python stack trace below:
Traceback (most recent call last):
  File "/Users/jlhudd/miniconda3/envs/flu_frequencies/lib/python3.10/site-packages/polars/expr/expr.py", line 3976, in __call__
    result = self.function(*args, **kwargs)
  File "/Users/jlhudd/miniconda3/envs/flu_frequencies/lib/python3.10/site-packages/polars/expr/expr.py", line 4299, in wrap_f
    return x.map_elements(
  File "/Users/jlhudd/miniconda3/envs/flu_frequencies/lib/python3.10/site-packages/polars/series/series.py", line 5270, in map_elements
    self._s.apply_lambda(function, pl_return_dtype, skip_nulls)
pyo3_runtime.PanicException: assertion `left == right` failed: impl error
  left: 4
 right: 1
Traceback (most recent call last):
  File "/Users/jlhudd/projects/nextflu-reports/who-2024-02/flu_frequencies/scripts/fit_single_frequencies.py", line 163, in <module>
    data, totals, counts, time_bins = load_and_aggregate(d, args.geo_categories, freq_cat,
  File "/Users/jlhudd/projects/nextflu-reports/who-2024-02/flu_frequencies/scripts/fit_single_frequencies.py", line 44, in load_and_aggregate
    d = d.with_columns([pl.col('date').map_elements(lambda x: to_day_count(x, start_date)).alias("day_count")])
  File "/Users/jlhudd/miniconda3/envs/flu_frequencies/lib/python3.10/site-packages/polars/dataframe/frame.py", line 8270, in with_columns
    return self.lazy().with_columns(*exprs, **named_exprs).collect(_eager=True)
  File "/Users/jlhudd/miniconda3/envs/flu_frequencies/lib/python3.10/site-packages/polars/lazyframe/frame.py", line 1730, in collect
    return wrap_df(ldf.collect())
pyo3_runtime.PanicException: assertion `left == right` failed: impl error
  left: 4
 right: 1
I don't see any obvious changes to our input data between when this used to work and now. Downgrading polars to 0.20.3 allows the frequencies script to run without an error, suggesting that the issue first appeared in polars 0.20.4 (released Jan 12, 2024). This is all with Python 3.10.13 on an Intel Mac (macOS 12.6).
I confirmed that the error occurs only in the map_elements call of the failing expression above; the rest of the expression runs without it.
Possible solutions
As a band-aid, we could pin polars to 0.20.3 in the Conda environment.
As a longer-term solution, we might try to replace the officially discouraged map_elements call with a different approach.
Or we could switch to pandas.
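One possible shape for the longer-term fix is to compute the day counts in plain Python, outside the polars expression engine, and attach the result as a new column. This is only a rough sketch: I'm assuming here that the date column holds YYYY-MM-DD strings, that to_day_count returns the number of days since start_date, and that unparseable (e.g. ambiguous) dates should become nulls. The real to_day_count may behave differently, and the helper names day_counts and add_day_count are made up for illustration.

```python
from datetime import date, datetime
from typing import Iterable, List, Optional


def day_counts(dates: Iterable[Optional[str]], start_date: date) -> List[Optional[int]]:
    """Days between start_date and each YYYY-MM-DD string.

    Assumption: values that cannot be parsed (missing or ambiguous dates)
    map to None, which may not match the real to_day_count.
    """
    counts: List[Optional[int]] = []
    for value in dates:
        try:
            parsed = datetime.strptime(value, "%Y-%m-%d").date()
        except (TypeError, ValueError):
            # None raises TypeError; strings like "2021-XX-XX" raise ValueError.
            counts.append(None)
            continue
        counts.append((parsed - start_date).days)
    return counts


def add_day_count(d, start_date: date):
    """Attach a day_count column to a polars DataFrame without map_elements."""
    # Imported here so that day_counts itself has no polars dependency.
    import polars as pl

    values = day_counts(d["date"].to_list(), start_date)
    return d.with_columns(pl.Series("day_count", values))
```

In load_and_aggregate, the failing line could then become something like d = add_day_count(d, start_date), assuming start_date is a datetime.date. A native polars expression (parsing the strings to dates and subtracting start_date) would likely be faster, but this version keeps the existing per-value logic and avoids the code path that panics.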