NREL / flasc

A rich floris-driven suite for SCADA analysis
https://nrel.github.io/flasc/
BSD 3-Clause "New" or "Revised" License

Feature/port to polars 3 #102

Closed: paulf81 closed this pull request 1 year ago

paulf81 commented 1 year ago

Not ready to be merged

Feature or improvement description

This is a final reboot of the port_to_polars effort, initially begun in Pull Request #81 and updated in Pull Request #97. It follows the decision made in discussion #98: the pandas dataframe assumed by the current methods (with columns pow_000, pow_001, and time) stays the same, and conversion to polars can happen within certain functions.

Connected to this effort is a small repo, https://github.com/paulf81/flasc_metrics (based on https://github.com/rafmudaf/floris_metrics), which can be used to confirm that the functions are still faster even with a pandas->polars->pandas round trip. The new timing_tests folder provides functions for that repo.

A first result is promising: converting the energy ratio with bootstrapping to the polars-based method reduces the average time to compute an N=20 bootstrap energy ratio over the artificial data set from 11.3344 seconds to 1.0762 seconds (roughly a 10.5x speedup).

Related issue, if one exists

Issue #80

Impacted areas of the software

energy_ratio (this list may grow)

paulf81 commented 1 year ago

This pull request is now superseded by #107.