tslearn-team / tslearn

The machine learning toolkit for time series analysis in Python
https://tslearn.readthedocs.io
BSD 2-Clause "Simplified" License

Questions about LearningShapelets implementation #258

Open tcrasset opened 4 years ago

tcrasset commented 4 years ago

Hello!

Using: tslearn==0.3.1, keras==2.2.4 and tensorflow==1.10.

I'm doing my Master's thesis on time series classification and I successfully used your implementation of the Learning Shapelets Classifier to classify multi-dimensional time series.

In my report, I'm trying to explain the different layers of the architecture; however, the documentation is not very helpful, and the original paper by Grabocka et al. does not go into detail.

Looking through the code and using the keras.utils.plot_model() method (see graph below), I was able to gather the following information:

Are my conclusions correct so far?

The questions I am having are the following:

I apologize in advance if my questions do not make sense; I don't quite grasp all the details of deep learning. Thank you very much for your time and your library.

Cheers, Tom

I have 4 shapelets of length 5 each, and a time series of length 59. (I have 19 dimensions, but I edited the graph to make it clearer.)

[image: model graph produced by keras.utils.plot_model()]

rtavenar commented 4 years ago

Hi @tcrasset

There are several (good) subquestions in your question, I think.

Regarding the model itself, it is a simple model that computes local distances between subseries and shapelets and then aggregates these local distances to retain, for each shapelet, the minimum distance. This representation then feeds a fully connected layer.
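As a rough illustration of the model described above, here is a minimal NumPy sketch for a univariate series. It is not tslearn's code: the function name `min_shapelet_distances`, the toy data, and the placeholder weights `W` and `b` are assumptions made for this example, and `sliding_window_view` requires NumPy 1.20 or later.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def min_shapelet_distances(series, shapelets):
    """For each shapelet, return the smallest squared distance to any subseries."""
    features = []
    for shapelet in shapelets:
        # All subseries with the same length as the shapelet: (n_windows, length)
        windows = sliding_window_view(series, len(shapelet))
        # Local squared distance between each subseries and the shapelet
        local = ((windows - shapelet) ** 2).sum(axis=1)
        # Keep only the best match over all positions
        features.append(local.min())
    return np.array(features)


series = np.array([0.0, 1.0, 2.0, 1.0, 0.0, 0.0])
shapelets = [np.array([1.0, 2.0, 1.0]), np.array([5.0, 5.0])]
features = min_shapelet_distances(series, shapelets)

# A fully connected layer is an affine map of these features.
# W and b are arbitrary placeholders here, not learned values.
W = np.ones((2, 3))
b = np.zeros(3)
logits = features @ W + b
```

The first shapelet occurs exactly in the series, so its feature is zero; the second one never matches well, so its feature stays large. In the real model the shapelets and the dense weights are learned jointly.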

Concerning our implementation, it is far from optimal: I handled the different channels through separate (parallel) layers just because I was lazy at the time. A better approach would be to deal with all channels at once, and it should not be too difficult to implement. Another thing is that we have a fake convolutional layer (with fixed weights) that just extracts subseries from the input so that the subsequent layer can compute distances between these subseries and the shapelets.
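To see why a convolution with fixed weights can extract subseries, here is a small NumPy sketch of the idea (an illustration only, not the Keras layer used in tslearn; the function name is made up for this example). Each fixed kernel is one-hot, so filter k simply copies the value found k steps after the window start.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def extract_subseries_with_fixed_kernels(series, length):
    """Extract all subseries of a given length using one-hot (identity) kernels."""
    kernels = np.eye(length)  # row k is 1 at position k and 0 elsewhere
    # Correlating with kernel k yields series[k], series[k + 1], ...
    columns = [np.correlate(series, kernel, mode="valid") for kernel in kernels]
    # Column k holds the k-th element of every subseries
    return np.stack(columns, axis=1)


series = np.arange(7, dtype=float)
subseries = extract_subseries_with_fixed_kernels(series, 3)

# Same result as a plain sliding window over the series
reference = sliding_window_view(series, 3)
```

Nothing is learned in this step: the kernels stay fixed, and the output is just the input rearranged into one row per subseries.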

Anyway, if you want to see where shapelet coefficients are, you should look at those lines:

https://github.com/tslearn-team/tslearn/blob/75cd661faaeef62d899d26a26d027defc1ffae04/tslearn/shapelets.py#L365-L380

Finally, the reason why we aggregate through Add layers is that the squared distance between multidimensional subseries is the sum of the squared distances along each channel. But once again, this implementation is far from great.
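The identity behind the Add layers can be checked numerically. The sketch below uses random data with the sizes mentioned earlier in the thread (length 5, 19 channels); the variable names are chosen for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)
length, n_channels = 5, 19
subseries = rng.normal(size=(length, n_channels))
shapelet = rng.normal(size=(length, n_channels))

# Squared Euclidean distance computed on all channels jointly
joint = np.sum((subseries - shapelet) ** 2)

# The same quantity, computed one channel at a time and then added up
per_channel = [
    np.sum((subseries[:, c] - shapelet[:, c]) ** 2) for c in range(n_channels)
]
summed = np.sum(per_channel)
```

Both values agree up to floating-point error, which is why per-channel distance blocks can be merged by a plain sum.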

Hope this helps, Romain

PS: by the way, if anyone wants to refactor the shapelet code, that would be great (would probably make the models faster) and I'd be glad to help by reviewing the code.

tcrasset commented 4 years ago

(Sorry for closing, misclicked)

Thank you very much for your fast reply!

This line made everything click for me:

Finally, the reason why we aggregate through Add layers is that the squared distance between multidimensional subseries is the sum of the squared distances along each channel. But once again, this implementation is far from great.

Anyway, if you want to see where shapelet coefficients are, you should look at those lines:

What you are saying is that the weights of the shapelets_%d_%d layer represent the shapelets?

With regard to refactoring the shapelet code, I am not up for the challenge. However, I implemented SCRIMP++ (matrix profile), time series snippets and MPdist in Java, so if that is something you are interested in, I'll be glad to help port my implementation to this library.

Have a good day, Tom

rtavenar commented 4 years ago

What you are saying is that the weights of the shapelets_%d_%d layer represent the shapelets?

Yep, and the first index is the shapelet id; the second index is the channel id.
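Given that naming scheme, per-channel weights could be regrouped into one multichannel array per shapelet along these lines. This is only a sketch: `weights_by_layer` is a hypothetical dictionary standing in for weights read from the trained model, and it assumes each layer's weights flatten to a vector with one value per time step, which is not confirmed here.

```python
import re
import numpy as np

# Hypothetical stand-in for weights read from the model, keyed by layer name.
n_shapelets, n_channels, length = 4, 3, 5
weights_by_layer = {
    f"shapelets_{i}_{c}": np.full(length, 10.0 * i + c)
    for i in range(n_shapelets)
    for c in range(n_channels)
}


def assemble_shapelets(weights_by_layer):
    """Group shapelets_<shapelet id>_<channel id> weights into (length, n_channels) arrays."""
    pattern = re.compile(r"^shapelets_(\d+)_(\d+)$")
    parsed = {}
    for name, weights in weights_by_layer.items():
        match = pattern.match(name)
        if match:
            shapelet_id, channel_id = int(match.group(1)), int(match.group(2))
            parsed[(shapelet_id, channel_id)] = np.asarray(weights).ravel()
    n_shp = 1 + max(i for i, _ in parsed)
    n_ch = 1 + max(c for _, c in parsed)
    # One array per shapelet, with channels stacked along the last axis
    return [
        np.stack([parsed[(i, c)] for c in range(n_ch)], axis=1)
        for i in range(n_shp)
    ]


shapelets = assemble_shapelets(weights_by_layer)
```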

With regard to refactoring the shapelet code, I am not up for the challenge. However, I implemented SCRIMP++ (matrix profile), time series snippets and MPdist in Java, so if that is something you are interested in, I'll be glad to help port my implementation to this library.

That would be great! Could you maybe open a new issue to detail what you could offer, and what these algorithms bring to the matrix profile ecosystem (sorry, I'm not an expert on MPs)?

rtavenar commented 4 years ago

@tcrasset Feel free to join the discussion in #260 if you wish.

GillesVandewiele commented 4 years ago

Closing this issue as questions seem to be answered. Please re-open if you have any more questions!

rtavenar commented 4 years ago

Let me reopen this one as a reminder that the LS implementation should be made simpler by treating all modalities at once instead of using different parallel blocks.
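A sketch of what "all modalities at once" could look like, written in plain NumPy rather than Keras, so it only illustrates the computation and is not a proposed patch. The function names and random data are assumptions for this example; the sizes (series of length 59 with 19 channels, 4 shapelets of length 5) come from the original post.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def min_distances_all_channels(series, shapelets):
    """series: (T, d); shapelets: (K, L, d). Returns (K,) minimum squared distances."""
    length = shapelets.shape[1]
    # (n_windows, d, L) reordered to (n_windows, L, d)
    windows = np.moveaxis(sliding_window_view(series, length, axis=0), -1, 1)
    # Broadcast every window against every shapelet: (n_windows, K, L, d)
    diff = windows[:, None, :, :] - shapelets[None, :, :, :]
    local = (diff ** 2).sum(axis=(2, 3))  # (n_windows, K)
    return local.min(axis=0)


def min_distances_parallel_blocks(series, shapelets):
    """Same result, computed with one block per channel and then summed."""
    n_shapelets, length, n_channels = shapelets.shape
    total = 0.0
    for c in range(n_channels):
        windows = sliding_window_view(series[:, c], length)  # (n_windows, L)
        diff = windows[:, None, :] - shapelets[None, :, :, c]  # (n_windows, K, L)
        total = total + (diff ** 2).sum(axis=2)
    # The minimum over positions is taken after summing the channels
    return total.min(axis=0)


rng = np.random.default_rng(0)
series = rng.normal(size=(59, 19))
shapelets = rng.normal(size=(4, 5, 19))
shapelets[0] = series[10:15]  # plant an exact match for the first shapelet

at_once = min_distances_all_channels(series, shapelets)
in_blocks = min_distances_parallel_blocks(series, shapelets)
```

Both functions return the same four features; the first one simply avoids the per-channel loop.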