Enhance Avalanche Library to Support Time Series Data and Models

ContinualAI / avalanche

Avalanche: an End-to-End Library for Continual Learning based on PyTorch.

http://avalanche.continualai.org

MIT License

1.78k stars, 290 forks

Enhance Avalanche Library to Support Time Series Data and Models #1490

Open irosyadi opened 1 year ago

irosyadi commented 1 year ago

Background and motivation: The Avalanche library currently lacks comprehensive support for time series data, which is increasingly common in industrial settings and in language modeling.

To address this gap, I propose adding time series models, such as a simple 1D CNN or an LSTM, and time series datasets, such as ECG or the PhysioNet Challenge data, to Avalanche. This enhancement would benefit users who need continual learning capabilities for time series tasks and would make the library more versatile.

Thank you for considering this feature request.

AntonioCarta commented 1 year ago

Thanks for the suggestion. I see there is a growing interest in the area. Do you have some benchmarks in mind that are already used in the literature? Would you be available to prepare a pull request to add them?

irosyadi commented 1 year ago

The most frequently used benchmark in time series classification/regression is the UCR archive, which consists of 128 time series datasets. Both sktime (a scikit-learn-compatible Python package) and tsai (a PyTorch-based Python package) provide comprehensive tooling for data handling and model development, including state-of-the-art models.

Research on continual learning with time series data using Avalanche has already been conducted. The paper "Continual Learning for Human State Monitoring" uses the WESAD dataset, which is designed for wearable stress and affect detection. The code is available in the GitHub repository fexed/CLforHSM.

The article "Continual Deep Learning for Time Series Modeling" reviews ongoing work on continual learning (CL) for time series data.

AntonioCarta commented 1 year ago

@AndreaCossu is an author of the CLforHSM paper. @AndreaCossu what do you think? Bringing time series data into Avalanche may require additional dependencies. Maybe a middle-ground solution would be a page in the documentation listing papers that use Avalanche, categorized by application? This way, people can see the different use cases, but we don't have to centralize everything in Avalanche, which may be expensive to maintain.

AndreaCossu commented 1 year ago

The inclusion of custom models to deal with temporal data is not necessary, since any PyTorch model works in Avalanche as well. We already provide a very simple sequence classifier, but other models can be easily created.
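As an illustration of the point above, here is a minimal sketch of a custom sequence classifier written as a plain `torch.nn.Module`. The class name `LSTMSequenceClassifier` and all sizes are hypothetical choices for this example; this is not the sequence classifier shipped with Avalanche, and no Avalanche API is used.

```python
import torch
from torch import nn


class LSTMSequenceClassifier(nn.Module):
    """Hypothetical example model: an LSTM encoder followed by a linear head."""

    def __init__(self, n_features: int, hidden_size: int, n_classes: int):
        super().__init__()
        # batch_first=True means inputs are shaped (batch, time, features).
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # h_n has shape (num_layers, batch, hidden_size); use the last layer's
        # final hidden state as a summary of the whole sequence.
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1])


# Illustrative sizes only: 4 sequences, 50 time steps, 3 channels, 5 classes.
model = LSTMSequenceClassifier(n_features=3, hidden_size=16, n_classes=5)
logits = model(torch.randn(4, 50, 3))
```

Because the model is an ordinary PyTorch module that maps a batch of sequences to class logits, it could in principle be used wherever a PyTorch model is expected.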
We have an example of how to implement a continual learning pipeline for sequence classification here.
I agree that including some benchmarks for time series would be helpful. However, I don't think adding a popular time series dataset on its own would help much. Instead, it would be very nice to include time series benchmarks that are already used in continual learning. I would be happy to review a PR about this. As for the dependencies, if they are widely used and well maintained, I think we could consider adding one or two as optional dependencies.
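For context on what turning a dataset into a continual learning benchmark can involve, here is a rough sketch of one common construction, a class-incremental split, where each experience introduces a disjoint group of classes. The function `class_incremental_split` and the toy data are hypothetical and use only the Python standard library; this is not Avalanche's benchmark API, and a real benchmark would follow the splits defined in the relevant paper.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (time series values, class label)


def class_incremental_split(
    samples: List[Sample], classes_per_experience: int
) -> List[List[Sample]]:
    """Group samples into experiences, each holding a disjoint set of classes."""
    by_class: Dict[int, List[Sample]] = defaultdict(list)
    for series, label in samples:
        by_class[label].append((series, label))

    labels = sorted(by_class)
    experiences: List[List[Sample]] = []
    # Walk over the sorted labels in fixed-size groups, one group per experience.
    for start in range(0, len(labels), classes_per_experience):
        group = labels[start:start + classes_per_experience]
        experiences.append([s for lab in group for s in by_class[lab]])
    return experiences


# Toy data (made up): 4 classes, 2 short series per class.
toy = [([float(c), float(c + i)], c) for c in range(4) for i in range(2)]
experiences = class_incremental_split(toy, classes_per_experience=2)
```

With the toy data above, the first experience contains only classes 0 and 1 and the second only classes 2 and 3, so a model trained on them sequentially sees new classes over time.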