Implement metric learning for time series

rtavenar commented 7 years ago

It would make sense to have metric learning algos dedicated to time series in tslearn.

A good start could be Garreau et al, 2014, but maybe other methods could make more sense.

smarie commented 4 years ago

If you need help on this topic I happen to have a bit of experience: https://tel.archives-ouvertes.fr/tel-01678889v1/document

In particular several topics from this manuscript could be included in tslearn:

section 2.3 "unimodal metrics for time series". In particular, I already have a Python implementation of 2.3.4 (corT); maybe it would be worth including in tslearn?
section 3.3.3 "multi-scale description for time series"
section 3 as a whole (metric learning per se)

It is much easier to implement 2.3 (unimodal metrics) and 3.3.3 (multi-scale generation of metrics) first, as the metric learning per se (section 3) is heavy in terms of computation/data representation (pairwise space) and optimization methods.
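For illustration, here is a minimal sketch of what a corT-style measure could look like. It assumes that corT refers to the temporal correlation coefficient of order 1 in the sense of Douzal-Chouakria and Nagabhushan (correlation between first-order differences); the manuscript's definition (section 2.3.4) may differ, for example by using a higher order. The function names and the handling of constant series are hypothetical choices, not existing tslearn API.

```python
import numpy as np


def cort(x, y):
    """Temporal correlation of order 1 between two equal-length 1-D series.

    Assumption: this is the cosine between the first-order differences of
    x and y, which lies in [-1, 1].
    """
    dx = np.diff(np.asarray(x, dtype=float))
    dy = np.diff(np.asarray(y, dtype=float))
    denom = np.sqrt(np.sum(dx ** 2)) * np.sqrt(np.sum(dy ** 2))
    if denom == 0.0:
        # At least one series is constant, so the coefficient is undefined.
        # Returning 0 is an arbitrary convention chosen for this sketch.
        return 0.0
    return float(np.sum(dx * dy) / denom)


def cort_dissimilarity(x, y):
    """Hypothetical helper mapping cort from [-1, 1] to a dissimilarity in [0, 1]."""
    return 0.5 * (1.0 - cort(x, y))


t = np.linspace(0.0, 1.0, 50)
x = np.sin(2 * np.pi * t)
print(cort(x, x), cort(x, -x), cort_dissimilarity(x, x))
```

Two series with the same behavior (both increasing or both decreasing at each step) get a value close to 1, and series with opposite behavior get a value close to -1.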

rtavenar commented 4 years ago

Hi @smarie

Definitely! Your expertise would be a great help to the tslearn team!

Concerning metric learning, it would be nice (I guess, I'm not an expert on the topic, though I'd like to learn :) to have both your method and Garreau's one included in tslearn, don't you think? Do you know of other standard competitors that should be considered? Do you think you could give a hand on integration of one or both methods into tslearn?

Best, Romain

smarie commented 4 years ago

> Do you know of other standard competitors that should be considered?

There seem to be three families of approaches to metric learning for time series:

1. metric learning methods dedicated to timeseries
2. generic metric learning methods that can be customized to handle timeseries specificities
3. generic metric learning methods that can't be customized to handle timeseries specificities (and are therefore expected to perform less well)

There is also the question of the task: metric learning for what? Alignment? Classification? Some of these tasks are standard sklearn ones, but others are not, so we should define them.

It is already somewhat old (2017), but you can check the bibliography of our subsequent IS journal paper for a list of methods in both categories. There were already many, so I would not be surprised if there are more now.

I'm not familiar with Garreau's method but after looking through it briefly, it seems to belong to the first category.

Our method belongs to the second category: it is generic as it learns an optimal metric that is a linear or non-linear combination of basic metrics. So you can use any set of basic metrics of your choice, not necessarily the ones we propose, and not necessarily metrics for timeseries. In the paper we propose basic metrics that form a multi-modal (amplitude, shape, spectrum), multi-scale set to compare timeseries. But you could use any number of alternate basic metrics instead.
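To make the "combination of basic metrics" idea concrete, here is a minimal sketch of the pairwise representation only. Each pair of series is mapped to a vector holding one value per basic metric, and a combined metric is a non-negative weighted sum of those values. The two basic metrics, the function names, and the hand-picked weights are all placeholders chosen for illustration; learning the weights (the actual metric learning step) is not shown, and the non-linear variant is not covered.

```python
import numpy as np


def amplitude_distance(x, y):
    """Euclidean distance between the raw values (placeholder 'amplitude' metric)."""
    return float(np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(y, dtype=float)))


def shape_distance(x, y):
    """Euclidean distance between first-order differences (placeholder 'shape' metric)."""
    dx = np.diff(np.asarray(x, dtype=float))
    dy = np.diff(np.asarray(y, dtype=float))
    return float(np.linalg.norm(dx - dy))


def pairwise_vector(x, y, basic_metrics):
    """Represent the pair (x, y) as a vector with one entry per basic metric."""
    return np.array([metric(x, y) for metric in basic_metrics])


def combined_metric(x, y, basic_metrics, weights):
    """Linear combination of basic metrics with non-negative weights."""
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0):
        raise ValueError("weights must be non-negative")
    return float(w @ pairwise_vector(x, y, basic_metrics))


basic_metrics = [amplitude_distance, shape_distance]
x = [0.0, 1.0, 2.0]
y = [1.0, 2.0, 3.0]  # same shape as x, shifted in amplitude
print(pairwise_vector(x, y, basic_metrics))
print(combined_metric(x, y, basic_metrics, weights=[1.0, 1.0]))
```

In this toy pair the shape component is zero and only the amplitude component contributes, which is the kind of distinction a learned weighting could exploit.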

> it would be nice (I guess, I'm not an expert on the topic, though I'd like to learn :) to have both your method and Garreau's one included in tslearn. [...] Do you think you could give a hand on integration of one or both methods into tslearn?

Well, I would be glad to see our method available in tslearn, but I'm afraid it requires quite some bandwidth, which to be honest I do not have for now. This discussion also raised some questions for me about the best place to put each piece.

I am not yet familiar with the scope and maturity / roadmap of tslearn as compared to sktime, skits, statsmodels.tsa, scipy or any of the references that you documented on your great page; I do not know any of them.
I am also not yet familiar with INRIA's metric-learn package (which I discovered thanks to this discussion!), which at least seems to implement Weinberger's method (a non-ts-specific method that inspired our generalized formulation).

At this point I would suggest

to implement the timeseries-specific metrics in tslearn, starting with corT for example (as I already have code), and adding FFT spectral distances and multi-scale metric generation later; and to build them so that they can be plugged into the metric-learn package if possible (to be checked: whether there is a standard metric API there, or whether their API is 'just' the transformer/classifier/regressor API from sklearn)
to implement our general metric learning approach in metric-learn... when there is implementation bandwidth
to implement Garreau's approach in tslearn, matching the framework/structures of metric-learn as much as possible where they are relevant
to see whether some metric learning tasks are missing (for example the 'alignment' task) and where they would best fit (tslearn, metric-learn, sklearn)
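As a rough sketch of the "FFT spectral distances" and "multi-scale metric generation" items above: one possible spectral distance compares FFT amplitude spectra, and one possible multi-scale scheme evaluates a basic metric on dyadic sub-windows of the series. Both choices are assumptions made for illustration (the manuscript may define them differently), and the function names are hypothetical rather than existing tslearn or metric-learn API.

```python
import numpy as np


def spectral_distance(x, y):
    """Euclidean distance between the FFT amplitude spectra of two equal-length series."""
    ax = np.abs(np.fft.rfft(np.asarray(x, dtype=float)))
    ay = np.abs(np.fft.rfft(np.asarray(y, dtype=float)))
    return float(np.linalg.norm(ax - ay))


def multiscale_windows(length, n_levels):
    """Dyadic (start, stop) windows: 1 window at level 0, 2 at level 1, 4 at level 2..."""
    windows = []
    for level in range(n_levels):
        edges = np.linspace(0, length, 2 ** level + 1).astype(int)
        windows.extend(zip(edges[:-1].tolist(), edges[1:].tolist()))
    return windows


def multiscale_metrics(x, y, metric, n_levels=3):
    """Evaluate one basic metric on every window, giving one value per (scale, location)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.array([metric(x[a:b], y[a:b])
                     for a, b in multiscale_windows(len(x), n_levels)])


t = np.arange(64)
x = np.sin(2 * np.pi * t / 16)
y = np.sin(2 * np.pi * t / 8)
print(multiscale_windows(8, 2))
print(multiscale_metrics(x, y, spectral_distance, n_levels=3))
```

Each entry of the resulting vector could then serve as one basic metric in a combined or learned metric, with each basic metric being a plain callable on a pair of arrays.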

What do you think? Sorry for opening this in so many different directions, but your direct question triggered quite a bit of implementation-related thinking :)

rtavenar commented 4 years ago

OK, thank you for the very detailed answer. I agree that tslearn should focus on time series specific methods, and metric-learn is probably a better place for generic metric learning methods.

So, what I suggest is that we focus on Garreau's method as a start on metric learning in tslearn, with the goal of following the metric-learn API, of course.

Then, if your method is implemented in metric-learn, a second step would be to add the similarity measure corT in tslearn and see how the global metric learning method could be run using both metric-learn and tslearn, but that seems to be for a later stage.

Anyway, if anyone is willing to work on implementation of Garreau's method, that would be great!

smarie commented 4 years ago

Thanks for the quick answer @rtavenar! I'll investigate and keep you posted when I have interesting news on this topic.

tslearn-team / tslearn

Implement metric learning for time series #8