asardaes / dtwclust

R Package for Time Series Clustering Along with Optimizations for DTW
https://cran.r-project.org/package=dtwclust
GNU General Public License v3.0

Looking for advice: how to cluster 150k time series? #52

hummuscience closed this issue 1 month ago

hummuscience commented 3 years ago

Hey!

I have used your package successfully with a small number of time series (a few hundred), and I would now like to scale it to my whole dataset (150k time series).

I have tried to run it with 15k time series, but no setting I tried seems to work (it gets stuck). Or am I missing something?

What would be the best way to approach this?

asardaes commented 3 years ago

Hello, you don't specify exactly which settings you're talking about. Processing so many series can indeed be a challenge, especially in R. Partitional clustering without PAM could maybe work, not sure.
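
To make the suggestion above concrete, here is a minimal sketch and not a tested recipe for 150k series. The first part is plain arithmetic: a full pairwise distance matrix, which is what PAM centroids use when `pam.precompute = TRUE` (the default in `partitional_control()`), grows quadratically with the number of series. The second part shows one possible partitional setup that avoids PAM. The choice of `"sbd"` with `"shape"` centroids, the toy random-walk data, `k = 5L`, and the helper name `full_distmat_gb` are all assumptions made for illustration; `centroid = "dba"` with `distance = "dtw_basic"` would be another non-PAM option, at a higher cost per iteration.

```r
# Memory needed for a full n x n matrix of doubles (8 bytes each), in GB.
# Hypothetical helper, only here to show the quadratic growth.
full_distmat_gb <- function(n) n^2 * 8 / 1e9

full_distmat_gb(15000)   # 1.8
full_distmat_gb(150000)  # 180

# Guarded so the script still runs where dtwclust is not installed.
if (requireNamespace("dtwclust", quietly = TRUE)) {
  library(dtwclust)

  # Toy stand-in for the real data: 200 random walks of length 50.
  set.seed(1)
  series <- lapply(1:200, function(i) cumsum(rnorm(50)))

  # Partitional clustering with shape-extraction centroids instead of PAM,
  # so no full cross-distance matrix has to be precomputed.
  fit <- tsclust(
    series,
    type = "partitional",
    k = 5L,               # assumed number of clusters for the toy data
    preproc = zscore,     # z-normalize each series first
    distance = "sbd",     # shape-based distance
    centroid = "shape",   # shape extraction, i.e. not PAM
    seed = 1L,
    trace = FALSE
  )

  # Cluster sizes
  print(table(fit@cluster))
}
```

With a non-PAM centroid, each iteration only needs distances between the series and the `k` current centroids, so the cost per iteration scales with the number of series times `k`, not with the number of series squared. Whether that is fast enough for 150k series still depends on the series length and the chosen distance.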

shuaiwang88 commented 3 years ago

Did you find a solution in the meantime? How many clusters do you expect for the 15k time series?

shuaiwang88 commented 3 years ago

What are the correct arguments to begin with? I have 75k products with 3 years of weekly sales data, and I want about 200 clusters.

lsemployeeoftheyear commented 1 year ago

Forgive me if I'm hijacking this thread a bit, but is anyone aware of any research or studies that suggest how much data you actually need to train a clean clustering model? There are usually diminishing returns after a certain point, if I'm not mistaken...