antoinecarme / pyaf

PyAF is an Open Source Python library for Automatic Time Series Forecasting built on top of popular pydata modules.
BSD 3-Clause "New" or "Revised" License
456 stars 73 forks

Pyaf 5.0 Final Touch 8 : Use an Optimal Choice Rule for the Quantization Signal transform #239

Closed antoinecarme closed 1 year ago

antoinecarme commented 1 year ago

When performing signal quantization, PyAF currently tries a fixed set of bin counts (Q = 5, 10, 20). This is too slow. Use an optimal bin-count selection rule instead (square root, Freedman-Diaconis, etc.).

https://github.com/antoinecarme/pyaf/blob/4ab4a659801ba2d48ca519f605d6d97754879922/pyaf/TS/Options.py#L202
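For readers unfamiliar with the two rules named above, here is a minimal standard-library sketch of how each picks a bin count from the data. The function names `sqrt_rule` and `freedman_diaconis_rule` are hypothetical and are not part of PyAF.

```python
import math
import random
import statistics


def sqrt_rule(values):
    """Square-root rule: number of bins = ceil(sqrt(N))."""
    return math.ceil(math.sqrt(len(values)))


def freedman_diaconis_rule(values):
    """Freedman-Diaconis rule: bin width = 2 * IQR / N^(1/3)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    width = 2.0 * (q3 - q1) / len(values) ** (1.0 / 3.0)
    if width == 0:
        # Degenerate case (zero IQR): fall back to a single bin.
        return 1
    return max(1, math.ceil((max(values) - min(values)) / width))


random.seed(0)
signal = [random.gauss(0.0, 1.0) for _ in range(1000)]
print("square root:", sqrt_rule(signal))
print("Freedman-Diaconis:", freedman_diaconis_rule(signal))
```

Both rules produce a single bin count per signal, so only one quantization has to be evaluated instead of one per candidate value of Q.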

An easy rule for variable-width, equal-frequency bins is this one:

https://en.wikipedia.org/wiki/Histogram#Variable_bin_widths
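As an illustration of what equal-frequency (variable-width) binning means, the sketch below cuts a skewed signal at its empirical quantiles so that each bin receives roughly the same number of points. The helpers `equal_frequency_edges` and `quantize` are hypothetical names for this example, and this is not necessarily how PyAF implements its quantization transform.

```python
import bisect
import random
import statistics


def equal_frequency_edges(values, n_bins):
    """Return the n_bins - 1 interior cut points at the empirical quantiles."""
    return statistics.quantiles(values, n=n_bins, method="inclusive")


def quantize(values, edges):
    """Map each value to the index of the bin it falls into (0 .. len(edges))."""
    return [bisect.bisect_right(edges, v) for v in values]


random.seed(0)
# A skewed signal: equal-width bins would be very unbalanced here.
signal = [random.expovariate(1.0) for _ in range(1000)]

edges = equal_frequency_edges(signal, 5)
codes = quantize(signal, edges)
counts = [codes.count(k) for k in range(5)]
print("cut points:", [round(e, 3) for e in edges])
print("points per bin:", counts)
```

The bins are narrow where the data is dense and wide in the tail, which is why the bin count for this scheme is chosen with a different rule than for equal-width histograms.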

The user choice, of course, will override this setting.

This issue only impacts slow modeling processes. It will have no impact by default.

antoinecarme commented 1 year ago

Optimal choice detail: $Q = 2 N^{\frac{2}{5}}$, where $N$ is the number of data points.
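To get a feel for the values this rule produces, here is a small worked sketch. The function name `optimal_bin_count` is hypothetical, and rounding to the nearest integer is an assumption, since the formula itself does not say how to round.

```python
def optimal_bin_count(n_points):
    """Equiprobable-bin rule Q = 2 * N^(2/5), rounded to the nearest integer.

    Rounding to nearest (rather than floor or ceil) is an assumption.
    """
    return max(1, round(2.0 * n_points ** 0.4))


for n in (32, 100, 1000, 10000):
    print(n, optimal_bin_count(n))
```

The bin count grows slowly with the signal length: short signals get far fewer bins than the previous maximum of 20, while long signals exceed it.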

https://en.wikipedia.org/wiki/Histogram#Variable_bin_widths


antoinecarme commented 1 year ago

The new options parameter will serve as a maximum for the number of bins (user-controllable).

self.mQuantiles = [ 20 ]; # vigintiles (20 quantiles) + use optimal rule

The optimal rule can give smaller values. The max will be 20 by default.
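A minimal sketch of how such a cap could interact with the rule $Q = 2 N^{2/5}$ follows. The function `effective_bin_count` and its `max_bins` argument are hypothetical stand-ins for the option value, not PyAF's actual code, and the rounding is an assumption.

```python
def effective_bin_count(n_points, max_bins=20):
    """Bin count from Q = 2 * N^(2/5), capped by a user-controllable maximum.

    max_bins stands in for the value stored in the options (20 by default).
    """
    rule_value = max(1, round(2.0 * n_points ** 0.4))
    return min(rule_value, max_bins)


print(effective_bin_count(50))                  # rule gives fewer bins than the cap
print(effective_bin_count(1000))                # rule exceeds the cap, so the cap applies
print(effective_bin_count(1000, max_bins=50))   # a user-raised cap lets the rule through
```

Under these assumptions the cap of 20 starts to bind once the signal has a little over 300 points, since $2 N^{2/5} = 20$ at $N = 10^{2.5} \approx 316$.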

antoinecarme commented 1 year ago

Significant speedup : The number of threads used for quantization will go from 3 to 1.