Open joriws opened 3 years ago
pip show adtk
Name: adtk
Version: 0.6.2
Summary: A package for unsupervised time series anomaly detection
Home-page: https://github.com/arundo/adtk
Author: Arundo Analytics, Inc.
Author-email: None
License: Mozilla Public License 2.0 (MPL 2.0)
Location: c:\users\guest\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages
Requires: numpy, scikit-learn, packaging, pandas, tabulate, matplotlib, statsmodels
Required-by:
Also tested with the following; the outcome did not change:
for startano,endano in to_events(anomalies, freq_as_period=True, merge_consecutive=True):
Hi @joriws
From one user to another: this is probably not an adtk issue. The underlying pandas library is itself not thread safe. https://pandas.pydata.org/pandas-docs/stable/user_guide/gotchas.html#thread-safety
I hope this helps in the future.
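If threads have to stay, one possible workaround is to keep only the data fetching parallel and serialize every step that touches pandas objects behind a single lock. The sketch below is only an illustration under that assumption: `process_dataset` and `_pandas_lock` are hypothetical names, the hard-coded lists stand in for the fetched time series, and the `sum()` stands in for the detector and `to_events` calls. Whether this removes the TypeError in this particular case is untested.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

import pandas as pd

# Hypothetical module-level lock guarding all pandas work.
_pandas_lock = threading.Lock()


def process_dataset(item):
    """Hypothetical per-dataset worker: fetch in parallel, compute under the lock."""
    name, values = item
    # (Slow I/O such as fetching the raw data could happen here, outside the lock.)
    with _pandas_lock:
        # Only one thread at a time builds and uses pandas objects.
        index = pd.date_range("2021-01-01", periods=len(values), freq="D")
        series = pd.Series(values, index=index)
        # Stand-in for the detector and to_events calls.
        return name, float(series.sum())


datasets = {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]}

with ThreadPoolExecutor(max_workers=3) as pool:
    results = dict(pool.map(process_dataset, datasets.items()))

print(results)
```

The lock removes the parallelism for the pandas part, so this only pays off when most of the time is spent fetching data. Running each dataset in a separate process is another option, since processes do not share pandas objects.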
I fetch multiple time series into pandas DataFrames, validate them, and feed them to PcaAD. Serial, single-threaded execution worked fine, but after converting to parallel execution on 3 threads I get random results: the returned anomalies, when passed to to_events, raise a TypeError. The plotted data shows a normal graph pattern and anomaly=anomalies plots normally, but to_events does not complete. Which to_events call fails varies between runs: in one run dataset 3 fails, in the next run maybe dataset 2 fails and 3 is fine. In a third run it could be that dataset 2 works but 1 and 3 do not.
I have also tried threading.local(), but it does not change anything. Without threading I did not observe this behaviour.
The type is the same for all: <class 'pandas.core.series.Series'>
When checking the logging.error output, the failing anomaly data for some reason has no "freq" parameter; anomaly data that has freq works well. The non-working case also returns timestamps rather than time ranges.
Non-working
Working structure.
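The missing freq noted above can be checked, and possibly restored, before the series is handed on. The following is a diagnostic sketch using pandas only; the sample data and variable names are made up, and it is an assumption, not something confirmed in this thread, that restoring freq is enough to make to_events succeed.

```python
import pandas as pd

# A regular daily index carries a freq attribute.
idx = pd.date_range("2021-01-01", periods=6, freq="D")
with_freq = pd.Series([False, True, True, False, False, True], index=idx)

# Rebuilding the index from raw timestamps drops the freq attribute,
# which mimics the "non-working" anomaly series.
without_freq = pd.Series(with_freq.values, index=pd.DatetimeIndex(idx.values))

print(with_freq.index.freq)     # a daily offset
print(without_freq.index.freq)  # None

# If the timestamps are regular, the frequency can be inferred and restored.
inferred = pd.infer_freq(without_freq.index)
restored = without_freq.asfreq(inferred)

print(restored.index.freq)
```

Logging `series.index.freq` for each dataset just before the to_events call would show whether the failing runs are exactly the ones where freq is None.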