Leave the tests in; possibly more perf tests should be done.
Perf
Perf testing results for thousands of rows & columns:
One year's worth of data for this query:
from datetime import datetime
from vortexasdk import CargoTimeSeries

CargoTimeSeries().search(
    timeseries_frequency='day',
    timeseries_property='origin_terminal',
    timeseries_activity='loading_end',
    filter_activity='oil_on_water_state',
    filter_time_min=datetime(2021, 1, 1),
    filter_time_max=datetime(2021, 12, 31),
)
127.51475930213928 seconds -> mp pool
175 seconds -> no mp pool
With mp pool, with fix:
(44543, 2571)
Time taken to convert to DataFrame: 38.43424606323242 seconds
Without mp pool, with fix:
(44543, 2571)
Time taken to convert to DataFrame: 86.82875204086304 seconds
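For anyone repeating these measurements, a minimal timing wrapper along the following lines would produce output in the same shape as the lines above. This is only a sketch: `timed` is a hypothetical helper, not a function in this repo, and the figures above were not necessarily collected this way.

```python
import time


def timed(label, func, *args, **kwargs):
    """Run func, print the elapsed wall-clock time, and return (result, seconds).

    Hypothetical helper for illustration only.
    """
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"Time taken to {label}: {elapsed} seconds")
    return result, elapsed


# Example: time any callable, e.g. the step that builds the DataFrame.
total, seconds = timed("sum a list", sum, [1, 2, 3])
```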
This still does not solve the threadpool being slow for low-dimensionality data, i.e. only 2 columns and fewer than 100 or so records.
However, it improves one part of the code that was very slow, and disables the threadpool if only 1 core is available.
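The single-core fallback can be sketched roughly as follows. This is an illustration, not the code in this PR: `convert_record`, `convert_all`, and the `cpu_count` override are hypothetical names, and using `multiprocessing.pool.ThreadPool` is an assumption about the pool type.

```python
import os
from multiprocessing.pool import ThreadPool


def convert_record(record):
    """Hypothetical per-record conversion step (here: a shallow copy)."""
    return dict(record)


def convert_all(records, cpu_count=None):
    """Convert records, using a thread pool only when more than one core is available.

    cpu_count is an optional override, mainly so the fallback can be tested.
    """
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    if cores <= 1:
        # A pool adds overhead without any parallelism on a single core.
        return [convert_record(r) for r in records]
    with ThreadPool(cores) as pool:
        # pool.map preserves input order.
        return pool.map(convert_record, records)
```

A similar guard on `len(records)` could skip the pool for very small inputs, which is the case this PR leaves unsolved; the threshold would need to be measured rather than guessed.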
RELATED TICKETS
https://vortexa.atlassian.net/browse/RND-7233
CHANGELOG
TESTS