Explore time-series datasets/models to identify possible benchmark

cpranav93 commented 2 years ago

The dataset has to be simple but with scientific appeal. Look into existing datasets and see if we can simplify an existing dataset.

Datasets available at: http://timeseriesclassification.com/dataset.php
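As a rough sketch of how a file from such an archive could be read: the snippet below assumes a tab-separated layout with the class label in the first column and the series values after it. That layout is an assumption and the files on the site may be structured differently; `parse_labeled_series` and the toy rows are made up for illustration.

```python
import csv
import io


def parse_labeled_series(text):
    """Split tab-separated rows into class labels and lists of float values.

    Assumes (hypothetically) that the first column holds the class label.
    """
    labels, series = [], []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row:
            continue  # skip empty lines
        labels.append(row[0])
        series.append([float(value) for value in row[1:]])
    return labels, series


# Toy rows with invented values, not taken from any real dataset.
toy = "0\t0.1\t0.2\t0.3\n1\t0.9\t0.8\t0.7\n"
labels, series = parse_labeled_series(toy)
print(labels, len(series), len(series[0]))
```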

cpranav93 commented 2 years ago

Look for datasets and if possible associated trained models for benchmarking.

If a model can be found in ONNX format, that would be ideal.

geek-yang commented 2 years ago

Almost forgot! There is a timeseries dataset created in our center (mainly by Florian and Dafne) for deep learning training and teaching purposes. The dataset contains intuitively accessible weather observations from 18 locations in Europe. This dataset is designed to be used for the following tasks:

- classification
- regression
- forecasting

And it is complex enough to demonstrate realistic issues such as overfitting and unbalanced data, while still remaining intuitively accessible.

Here is the Zenodo link to this dataset. The GitHub repo is also available here.

For a quick overview of this dataset, check this video, which is the presentation given by Florian during the workshop.

More information can be found in the paper. It is not yet available online (but has already been accepted by an ECML workshop). As one of the co-authors, I have the final camera-ready version of this paper. I put it in our team's SharePoint and here is the link (please don't share with others).

elboyran commented 2 years ago

Hi @geek-yang, that's so cool! I love it. For me there was a problem with the sync between audio and images in the video; I had the impression that the second half of the audio went missing at the end... Will have a look at the paper next.

For me the question is: can we use only this dataset, or (better?) can it be our time-series counterpart to the Leafsnap30 image dataset, i.e. the simple scientific one, in addition to something that plays the role of binary MNIST / triangles and circles, i.e. the simple geometric ones for images?

cpranav93 commented 2 years ago

We can use the timeseries dataset created by the intro to deep learning team as one example. We should continue to look for another dataset that could complement this.

geek-yang commented 2 years ago

Following our discussion, I think we can have two datasets: a relatively complex multivariate timeseries and a simple univariate timeseries. We can take this weather dataset as the complex one. For the simpler one, we can use the coffee dataset, which was proposed by @cpranav93.

geek-yang commented 2 years ago

To summarize, here I would like to propose two datasets for the study and demo of XAI approaches for timeseries:

Weather prediction dataset
- Multi-variate timeseries dataset
- Designed for tasks: classification and regression
- Link to the dataset: zenodo entry
- Examples of model training recipe with this dataset: classification and regression

Coffee dataset
- Univariate timeseries dataset
- Designed for tasks: classification
- Link to the dataset: timeseries classification datasets
- Examples of model training recipe with this dataset: classification

Similar to the choices we made earlier for images, these two timeseries datasets are relatively easy to understand and there are existing examples of training machine learning models based on them. We can further explore these two datasets in the next step.
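To make the univariate/multivariate distinction concrete, here is a minimal sketch using plain nested lists. The (instances, time steps, channels) layout, the `describe` helper and all values are assumptions for illustration only; they are not taken from either dataset and do not reflect what any particular tool expects.

```python
def describe(batch):
    """Return (n_instances, n_steps, n_channels) for a nested-list batch."""
    n_instances = len(batch)
    n_steps = len(batch[0])
    first_step = batch[0][0]
    # A scalar per time step means one channel (univariate).
    n_channels = len(first_step) if isinstance(first_step, (list, tuple)) else 1
    return (n_instances, n_steps, n_channels)


# Toy, made-up values.
univariate = [
    [0.1, 0.4, 0.3, 0.2],
    [0.5, 0.6, 0.2, 0.1],
]
multivariate = [
    [[12.0, 0.8], [13.5, 0.7], [11.0, 0.9]],
    [[9.5, 0.6], [10.0, 0.5], [10.5, 0.4]],
]

print(describe(univariate))    # (2, 4, 1)
print(describe(multivariate))  # (2, 3, 2)
```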

cwmeijer commented 2 years ago

I like the weather prediction dataset a lot for a whole bunch of reasons. Let's use it!

I'm not sure about the coffee dataset. It has some obvious good points like the fact that it is a binary problem, it is univariate, instances are short and therefore visualizable, different domain than the weather dataset, etc. However, as an example of a time series data set it is not ideal that it is in fact not time series data. We could argue that this showcases the flexibility of the tool, but as an example I think it would be nicer to have actual time series data.

Alternatively we could go with an earthquake prediction dataset.

Earthquake prediction dataset
- Univariate timeseries dataset
- Designed for tasks: classification
- Link to the dataset: timeseries classification datasets
- Examples of model training recipe with this dataset: I couldn't find any but the description says 75% accuracy is possible.

It has a couple hundred instances of 512 time steps each and is less than 1 MB. A downside of this dataset is that the example instance looks a bit homogeneous to me (see image below), so it would be less interesting to highlight certain parts in terms of saliency. The coffee dataset was maybe nicer in that regard. Still, I think I'd prefer the earthquake dataset.

On reviewing and closing this issue: I'm fine with both the weather and coffee datasets (or weather and earthquake).

elboyran commented 2 years ago

@cwmeijer, ha, I was raised by a seismologist dad and he always said earthquakes cannot be predicted (reasonably precisely ahead of time), and I always felt challenged ;)

cwmeijer commented 2 years ago

I would like to say that all has changed after 2012 haha! Maybe he is still kind of right though. The dataset description says they had some model (which is not shared) that is 75% accurate. I'm not even sure how the classes are balanced, but even if they are exactly 50-50, 75% accuracy is probably not enough to be significant in practice. It's hard to take any meaningful actions based on that.
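A small sketch of why the class balance matters here: compare the reported accuracy with the accuracy of always predicting the most frequent class. The class counts below are hypothetical, since the actual balance of the earthquake dataset is not known in this thread; `majority_baseline` is just an illustrative helper.

```python
def majority_baseline(class_counts):
    """Accuracy obtained by always predicting the most frequent class."""
    total = sum(class_counts.values())
    return max(class_counts.values()) / total


reported = 0.75  # accuracy mentioned in the dataset description

# Hypothetical class balances, not the real ones.
balanced = {"quake": 50, "no_quake": 50}
skewed = {"quake": 20, "no_quake": 80}

for name, counts in [("balanced", balanced), ("skewed", skewed)]:
    base = majority_baseline(counts)
    print(f"{name}: baseline={base:.2f}, gain={reported - base:+.2f}")
# balanced: baseline=0.50, gain=+0.25
# skewed: baseline=0.80, gain=-0.05
```

With a skewed split like the second one, a 75% accurate model would do worse than a trivial majority-class predictor.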

More on topic: 75% accuracy is maybe not enough to yield understandable saliency maps. Maybe the coffee dataset still is our best bet.

Alternatively we could also go with PAMAP2 (see #365). It's a much larger dataset, but it seems we trained something with 96% accuracy over 7 classes on it. See the bottom of https://github.com/NLeSC/mcfly-tutorial/blob/master/notebooks/tutorial/tutorial.ipynb. It's multivariate accelerometer data.

cwmeijer commented 2 years ago

After a short discussion during the standup, we decided to just go for the Coffee and Weather datasets as suggested by Yang.

dianna-ai / dianna

Explore time-series datasets/models to identify possible benchmark #364