functime-org / functime

Time-series machine learning at scale. Built with Polars for embarrassingly parallel feature extraction and forecasts on panel data.
https://docs.functime.ai
Apache License 2.0

perf: More tse optimisations #103

Closed abstractqqq closed 8 months ago

abstractqqq commented 8 months ago
  1. Benford Correlation is slightly faster again because I replaced value_counts with unique_counts. By putting pl.int_range(1, 10) in front, we no longer need to sort, because unique_counts returns counts in order of first appearance. In addition, unique_counts is slightly faster than value_counts because it only counts and does not return the corresponding values.
  2. Number Crossing. Simplified the computation and got a significant speed boost.
  3. percent_reoccurring_points. Simplified the mathematical formula.
  4. sum_reoccurring_values. The original one-line implementation was elegant, but is_unique, unique, and filter are more expensive. We can directly do a group_by (value_counts), which cuts down the size of the series, and then filter. This is 40-50% faster both when there are lots of unique values and when there are almost no unique values.
vercel[bot] commented 8 months ago

The latest updates on your projects.

| Name | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| functime-docs | ✅ Ready | Visit Preview | Oct 26, 2023 4:51am |