Closed by rmitsch 6 years ago
tslearn offers a few clustering algorithms focusing on time series data. Might be worth a look.
I'd suggest soft-DTW k-means, which is included in tslearn's clustering module (see here for an example illustrating clustering with tslearn; the API is pretty much identical to sklearn's idiom).
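To make the suggestion a bit more concrete, here is a minimal pure-Python sketch of the soft-DTW discrepancy that soft-DTW k-means minimizes. It is not tslearn's implementation: the function names `soft_min` and `soft_dtw`, the squared-difference local cost, and the univariate list input are assumptions chosen for illustration. In practice, tslearn's `TimeSeriesKMeans` with `metric="softdtw"` would be the thing to try.

```python
import math


def soft_min(values, gamma):
    """Smoothed minimum: -gamma * log(sum(exp(-v / gamma)))."""
    # Shift by the hard minimum so the exponentials cannot overflow.
    m = min(values)
    return m - gamma * math.log(sum(math.exp(-(v - m) / gamma) for v in values))


def soft_dtw(x, y, gamma=1.0):
    """Soft-DTW discrepancy between two univariate series (sequences of floats).

    Illustrative sketch only; assumes a squared-difference local cost.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    # r[i][j] holds the soft-DTW value for the prefixes x[:i] and y[:j].
    r = [[inf] * (m + 1) for _ in range(n + 1)]
    r[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            r[i][j] = cost + soft_min(
                (r[i - 1][j], r[i][j - 1], r[i - 1][j - 1]), gamma
            )
    return r[n][m]


# Made-up series for illustration: b is a time-shifted copy of a, c is unrelated.
a = [0.0, 1.0, 2.0, 1.0, 0.0]
b = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
c = [5.0, 5.0, 5.0, 5.0, 5.0]
print(soft_dtw(a, b, gamma=0.01), soft_dtw(a, c, gamma=0.01))
```

With a small `gamma` the value approaches classic DTW, so the shifted copy scores close to zero while the unrelated series scores high. Larger `gamma` values smooth the minimum, which is what makes the measure differentiable and usable for computing cluster centroids.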
After reading up a bit on common approaches to clustering time series data, there seem to be two main directions:
For reference, see e.g. here, here, and here.
I'd suggest we try both approaches and compare the results against the baseline clusters defined in #31 to see which one yields better results.
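One possible way to do that comparison is the adjusted Rand index (ARI), which scores agreement between two labelings independently of how the cluster IDs are named. This is only a sketch: the choice of ARI is an assumption (the thread does not name a metric), as is the assumption that the baseline from #31 can be expressed as one label per sample. The function name `adjusted_rand_index` and the labels below are made up for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two flat labelings of the same samples."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same samples")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: trivially identical

    # Pairs of samples that share a cluster in both labelings, in a only, in b only.
    both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    in_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    in_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = in_a * in_b / total_pairs  # agreement expected by chance
    maximum = (in_a + in_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings are a single cluster
    return (both - expected) / (maximum - expected)


# Hypothetical labels: a baseline and two candidate clusterings.
baseline = [0, 0, 1, 1, 2, 2]
candidate_1 = [1, 1, 0, 0, 2, 2]  # same partition, different cluster IDs
candidate_2 = [0, 1, 0, 1, 0, 1]  # splits every baseline cluster
print(adjusted_rand_index(baseline, candidate_1))
print(adjusted_rand_index(baseline, candidate_2))
```

A score near 1 means the candidate reproduces the baseline partition, and a score near 0 or below means agreement is no better than chance. If scikit-learn is already a dependency, its `adjusted_rand_score` computes the same quantity.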
HDBSCAN and PreDeCon are now working in the pipe. The interface looks like this:
from models.cluster import ElkiPipe
import pandas as pd

# data_dir: path to the semicolon-separated input file
data = pd.read_csv(data_dir, sep=";")
elki = ElkiPipe()

# Option 1: parameters for PreDeCon
params = elki.get_parameters_for_predecon(
    param_eps=10.0, param_minpts=2, param_delta=0.1,
    param_lambda=1, param_kappa=20.0,
)

# Option 2: parameters for HDBSCAN (use instead of option 1; this overwrites params)
params = elki.get_parameters_for_hdbscan(param_minpts=100)

results = elki.run_elki(data, params, plot_path=FLAGS.vis_path)
merged
...or evaluate several. Depends on #27.