Affine columns - Githubissues

Description

I work with time series data in pandas and would love to move to Polars, but I need a way to replicate my use of pd.RangeIndex. I know that Polars doesn't have indices, and I support that decision; however, I would need a column with equivalent features.

How I currently use pd.RangeIndex:

- My code supports both regularly and irregularly sampled time series.
- The index (in Polars it would just be a column) contains the values of the time axis in milliseconds.
- Irregularly sampled time series have an index of type pd.Index (in Polars this would be, e.g., a column of integers).
- Regularly sampled time series have an index of type pd.RangeIndex, where the step member depends on the sampling rate (e.g. step == 5 corresponds to 200 Hz, step == 2 corresponds to 500 Hz).
- Taking slices of such data frames just works, because RangeIndex has a start parameter: y = x[5:] simply gives y another RangeIndex, with y.index.start == x.index.start + 5 * x.index.step.
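
For illustration, here is a minimal pandas sketch of the slicing behaviour described above. The column name `value`, the index name `time_ms`, and the sample data are made up for the example:

```python
import pandas as pd

# Hypothetical example data: 10 samples at 200 Hz, so the time axis
# (in milliseconds) advances by 5 per sample.
x = pd.DataFrame(
    {"value": range(10)},
    index=pd.RangeIndex(start=0, stop=50, step=5, name="time_ms"),
)

# Positional slice: drop the first 5 samples.
y = x[5:]

# The index is still a RangeIndex; only its start has moved.
assert isinstance(y.index, pd.RangeIndex)
assert y.index.start == x.index.start + 5 * x.index.step
assert y.index.step == x.index.step
```

No time values are materialized at any point; the slice only adjusts start and stop.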

Proposed behaviour:

- Add a new column type, e.g. AffineColumn<dtype>, with members start: dtype and step: dtype.
- The n'th element of such a column is start + n * step.
- The column is always sorted (ascending for step >= 0, descending otherwise).
- Many functions can be optimized:
  - Some return another AffineColumn: e.g. head, tail, gather, gather_every, take, take_every, slice, ...
  - Some are a no-op: e.g. sort, unique, ...
  - Adding / subtracting AffineColumns: their starts and steps are added / subtracted.
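
To make the proposal a bit more concrete, here is a rough pure-Python sketch of the intended semantics. Everything in it is an assumption for illustration: the class name AffineColumn comes from the proposal, but the `length` field, the method signatures, and the restriction to non-negative offsets are mine, and this is not an existing Polars type.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AffineColumn:
    """Hypothetical sketch of the proposed column; not a Polars API."""

    start: int
    step: int
    length: int  # assumed: the column needs to know how many rows it has

    def __getitem__(self, n: int) -> int:
        # The n'th element is start + n * step.
        if not 0 <= n < self.length:
            raise IndexError(n)
        return self.start + n * self.step

    def slice(self, offset: int, length: Optional[int] = None) -> "AffineColumn":
        # Slicing only shifts start and shrinks length (non-negative offsets only).
        offset = min(offset, self.length)
        remaining = self.length - offset
        new_length = remaining if length is None else min(length, remaining)
        return AffineColumn(self.start + offset * self.step, self.step, new_length)

    def gather_every(self, n: int) -> "AffineColumn":
        # Taking every n'th row (n >= 1) multiplies the step by n.
        return AffineColumn(self.start, self.step * n, (self.length + n - 1) // n)

    def __add__(self, other: "AffineColumn") -> "AffineColumn":
        # Element-wise addition adds the starts and the steps.
        if self.length != other.length:
            raise ValueError("length mismatch")
        return AffineColumn(
            self.start + other.start, self.step + other.step, self.length
        )

    def to_list(self) -> List[int]:
        # Materialize the values; only needed when leaving the affine world.
        return [self[i] for i in range(self.length)]


# 200 Hz time axis in milliseconds, 10 samples.
t = AffineColumn(start=0, step=5, length=10)
assert t.slice(5) == AffineColumn(start=25, step=5, length=5)
assert t.gather_every(2).to_list() == [0, 10, 20, 30, 40]
assert (t + t).step == 10
```

Each operation runs in constant time and memory regardless of the number of rows, which is the main point of the proposal.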

I think such columns are fairly common across many different applications. Storing them as just a start and a step saves a lot of memory and clearly documents intent.

pola-rs / polars

Affine columns #14200
