tslearn-team / tslearn

The machine learning toolkit for time series analysis in Python
https://tslearn.readthedocs.io
BSD 2-Clause "Simplified" License

[Help request] Interpretation of KShape-clustering output #362

Open toshaklg opened 3 years ago

toshaklg commented 3 years ago

Hello there!

I am using tslearn to cluster some NDVI data I've got, and I have a problem interpreting the results after KShape clustering.

In your docs there is quite a nice example that I followed. The NDVI data I have is far from ideal, but it is in time-series format, each entry has the same length, and the values lie between 0 and 1. Under these conditions, KMeans with DTW works reasonably well and the results resemble my data:

[Image: vcxsrv_nRWJNoK0YK]

However, with KShape it feels like the data gets messed up after applying

ndvi_input_scaled = TimeSeriesScalerMeanVariance().fit_transform(ndvi_input)

and instead of the nice lines that I get with KMeans, and that you have in the example, I am getting some bizarre values like -2.5 that jump like crazy over the period and do not look even remotely similar to my input.

[Image: vcxsrv_l8qHCm0Cdx]

So, am I missing something, or is the data bad? Any suggestions?

Thanks!

kushalkolar commented 3 years ago

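TimeSeriesScalerMeanVariance().fit_transform() rescales each time series to zero mean and unit variance, so the values end up in standard deviation units instead of the original 0 to 1 NDVI range. That's what you're seeing: a value of -2.5 is a point 2.5 standard deviations below the mean of its own series.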
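
To make that concrete, here is a rough plain-Python sketch of per-series z-normalization, which is the kind of rescaling the scaler applies with its default settings. It is not tslearn's actual code. The helper names `z_normalize` and `denormalize` and the NDVI values are made up for illustration, and the population standard deviation is an assumption of this sketch.

```python
from statistics import fmean, pstdev


def z_normalize(series):
    """Rescale one series to zero mean and unit standard deviation.

    Returns the scaled values plus the mean and std that were removed,
    so the series can be mapped back to its original units later.
    """
    mean = fmean(series)
    std = pstdev(series)  # population std (an assumption of this sketch)
    if std == 0:
        # A constant series has no shape to rescale; returning zeros is
        # simply a choice made for this sketch.
        return [0.0 for _ in series], mean, std
    return [(x - mean) / std for x in series], mean, std


def denormalize(z_values, mean, std):
    """Map z-scores back to the original units of one series."""
    return [z * std + mean for z in z_values]


# Hypothetical NDVI-like series with values between 0 and 1.
ndvi = [0.21, 0.25, 0.40, 0.62, 0.71, 0.55, 0.30, 0.22]

scaled, mean, std = z_normalize(ndvi)
restored = denormalize(scaled, mean, std)

# The scaled series is no longer confined to [0, 1]: it is centred on 0
# and contains negative values, even though the input was all positive.
print("scaled range:", min(scaled), max(scaled))
print("restored:", [round(v, 2) for v in restored])
```

So negative or large-looking values after scaling do not mean the data is broken. They describe the shape of each series relative to its own mean and spread, which is the representation KShape compares. If you want plots in the original NDVI units, one option is to plot the unscaled series grouped by the predicted cluster labels.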