geoHeil closed this issue 3 years ago.
The short answer is yes.
```python
from pytorch_forecasting import TimeSeriesDataSet

dataset = TimeSeriesDataSet(
    # .dt.total_seconds() (rather than .dt.seconds) keeps the index correct
    # if the data spans more than one day
    df.assign(
        is_anomaly=0,
        time_idx=lambda x: ((x.hour - x.hour.min()).dt.total_seconds() / 3600).astype(int),
    ),
    max_encoder_length=0,  # do not encode
    max_prediction_length=1,  # guess you want to set this to a higher number and also specify min_prediction_length
    group_ids=["cohort_id", "device_id"],
    time_idx="time_idx",
    target="is_anomaly",
    time_varying_known_reals=["metrik_0", "metrik_1", "metrik_2"],
)
```
```python
# test the dataloader
x, y = next(iter(dataset.to_dataloader()))
x.keys()
x["decoder_cont"]
```
Now you can use `x["decoder_cont"]` to train your autoencoder; for validation, use `is_anomaly` to check whether the anomaly detection works.
One potential downside you might want to be aware of is that metrik_0, metrik_1 and metrik_2 are z-score normalized across all values. You can normalize each time series by using a GroupNormalizer but, currently, there is no way to normalize each subsequence on its own. However, of course, you could do that in a PyTorch Module yourself.
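Independently of the library, per-series normalization can also be applied in pandas before building the dataset. A minimal sketch, assuming the `device_id` grouping and a `metrik_0` column as above (the toy values are made up):

```python
import pandas as pd

# toy frame mimicking the structure above: two devices, one metric column
df = pd.DataFrame({
    "device_id": ["a"] * 4 + ["b"] * 4,
    "metrik_0": [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0],
})

# z-score each metric within its own time series (per device)
grp = df.groupby("device_id")["metrik_0"]
df["metrik_0_norm"] = (df["metrik_0"] - grp.transform("mean")) / grp.transform("std")
```

The same transform can be applied to metrik_1 and metrik_2 before the columns are handed to TimeSeriesDataSet.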
Many thanks!
Does `max_prediction_length` correspond to the window length of data fed to the autoencoder?
For generating labels you used `df.assign(is_anomaly=0, time_idx=lambda x: ((x.hour - x.hour.min()).dt.seconds / 3600).astype(int))`; I could create a dataset from true (but noisy) labels instead.
Why did you choose to set `max_encoder_length=0` (do not encode)?
Thanks for pointing this out:

> One potential downside you might want to be aware of is that metrik_0, metrik_1 and metrik_2 are z-score normalized across all values. You can normalize each time series by using a GroupNormalizer but, currently, there is no way to normalize each subsequence on its own. However, of course, you could do that in a PyTorch Module yourself.
could you link a line where I would need to start digging from / changing the code / integrating a custom module?
I have not used PyTorch Forecasting for autoencoders myself, but you should definitely be able to do so.
Resources to look at are the current model implementations and the BaseModel.
If you want to use PyTorch Forecasting all the way, I believe these steps should do it:
- Set `max_encoder_length` to the window length of what you feed to your autoencoder.
- Pass `None` in the init method.
- Override the `step()` method of the BaseModel and pass `x["encoder_cont"]` as `y`.
- Use `y` out of the dataloader only for validating that you can detect anomalies, but not for training.

Many thanks. This is really interesting. However, why is manual feature engineering required, i.e. why do I need to create the sliding windows manually? I know that in other disciplines, such as NLP, whole documents can be fed in and the network automatically derives distance-based features using attention (Transformer, BERT). Are you aware of something similar for time series?
When training, you need to work with sliding windows (if only for memory reasons). Your test set, of course, can use an infinite encoder or prediction length so that the whole time series is processed in one go.
However, I can think of reasons why this might be more problematic in time-series forecasting than in NLP.
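For what it's worth, the sliding windows themselves are cheap to build outside the library; NumPy ships a stride-trick view for exactly this (a sketch with a made-up series and window length):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

series = np.arange(10.0)  # one toy univariate series
window = 4                # hypothetical window length

# all overlapping windows as a zero-copy view: shape (len(series) - window + 1, window)
windows = sliding_window_view(series, window)
```

For multivariate data, the same function can be applied along the time axis of a 2-D array.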
I have a data set which consists of many multivariate time series (i.e. time series with > 1 value per timestamp originating from many IoT devices).
How can I load such a dataset into PyTorch using your data loader (https://pytorch-forecasting.readthedocs.io/en/latest/data.html), or do I need to implement my own? I need to ensure the data is interpreted correctly, so that the LSTM can learn patterns from an individual time series / window while a batch includes information from multiple devices / time windows.
I would want to use it for an LSTM-autoencoder to perform anomaly detection.
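To make the target concrete: an LSTM autoencoder of the kind described here might look roughly like the following. This is a sketch in plain PyTorch; the layer sizes, the repeat-hidden-state decoding scheme, and the reconstruction-error score are assumptions, not something from this thread.

```python
import torch
from torch import nn

class LSTMAutoencoder(nn.Module):
    """Encode a window of multivariate readings into a fixed vector, decode it back."""

    def __init__(self, n_features: int, hidden_size: int = 16):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.decoder = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, n_features)

    def forward(self, x):                       # x: (batch, time, features)
        _, (h, _) = self.encoder(x)             # h: (1, batch, hidden)
        # repeat the final hidden state across the window and decode it
        z = h[-1].unsqueeze(1).repeat(1, x.size(1), 1)
        out, _ = self.decoder(z)
        return self.head(out)                   # reconstruction, same shape as x

model = LSTMAutoencoder(n_features=3)
batch = torch.randn(8, 12, 3)                   # e.g. 8 windows of length 12, 3 metrics
recon = model(batch)
# anomaly score = mean squared reconstruction error per window
score = ((recon - batch) ** 2).mean(dim=(1, 2))
```

Windows with a high score would then be flagged as anomalies and validated against `is_anomaly`.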