Closed severous closed 3 years ago
Hi, thanks for your question.
Recall that SMAP and MSL actually consist of multiple individual time-series (A-1, C-2, etc.), each of which has 24/54 (SMAP/MSL) one-hot encoded features and 1 continuous feature. Each of these time-series is very short (typically 1-3k timesteps), so most implementations (including ours) concatenate all these time-series in the time direction, creating one large time-series.
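To make the concatenation concrete, here is a minimal sketch with NumPy. The channel arrays, their lengths, and the helper name `concat_channels` are made up for illustration and are not taken from the repository; the only thing borrowed from the description above is that every channel has the same number of features (25 for SMAP) and that channels are stacked along the time axis.

```python
import numpy as np

# Hypothetical per-channel arrays of shape (timesteps, features).
# The lengths and random values are illustrative only.
rng = np.random.default_rng(0)
channels = {
    "C-2": rng.normal(size=(2500, 25)),
    "A-1": rng.normal(size=(1200, 25)),
}


def concat_channels(channels):
    """Stack all channels along the time axis, in sorted channel order.

    Returns the concatenated series together with the channel names and
    their lengths, so the channel boundaries can be recovered later.
    """
    names = sorted(channels)
    series = np.concatenate([channels[name] for name in names], axis=0)
    lengths = [channels[name].shape[0] for name in names]
    return series, names, lengths


series, names, lengths = concat_channels(channels)
print(series.shape, names, lengths)
```

Keeping the per-channel lengths next to the concatenated array is what makes it possible to locate the transition points afterwards.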
So, the dataset will "jump" up and down whenever it transitions to a new channel. This has some effects on the forecasts and reconstructions of the model:
These two steps are performed in adjust_anomaly_scores and are only applied when the dataset is MSL or SMAP.
labeled_anomalies.csv is used to get the length of each channel of the test set, while smap_train_md.csv / msl_train_md.csv is used for the same purpose for the train set.
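As a rough illustration of why the channel lengths matter, the sketch below turns a list of lengths into the positions where the concatenated series switches channel. The lengths and the helper names `channel_boundaries` and `channel_slices` are hypothetical, and this is not the code of adjust_anomaly_scores itself. It also ignores any index offset introduced by a sliding window, which a real pipeline would likely have to account for.

```python
import numpy as np


def channel_boundaries(lengths):
    """Indices in the concatenated series where a new channel starts.

    The first channel starts at 0, so only the starts of the second
    channel onwards are returned.
    """
    return np.cumsum(lengths)[:-1]


def channel_slices(lengths):
    """One slice per channel, covering that channel's span of timesteps."""
    ends = np.cumsum(lengths)
    starts = ends - np.asarray(lengths)
    return [slice(int(s), int(e)) for s, e in zip(starts, ends)]


# Hypothetical channel lengths, e.g. as read from a metadata CSV.
lengths = [1200, 2500, 800]
boundaries = channel_boundaries(lengths)
slices = channel_slices(lengths)
print(boundaries, slices)
```

With the slices in hand, any per-channel treatment of the anomaly scores can be written as a loop over `slices`, and any treatment of the transition points can index the scores at `boundaries`.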
Thanks for your brilliant work. I would like to know the purpose of adjust_anomaly_scores. Thanks for your time.