Hello @zandarina1,
I think your problem comes from a misuse of numpy.ravel,
which flattens NumPy arrays:
https://numpy.org/doc/stable/reference/generated/numpy.ravel.html#numpy.ravel
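For instance, on a multivariate dataset of shape (n_ts, sz, d), numpy.ravel collapses everything into a single 1-D vector, so the per-dimension structure is lost before clustering. A minimal illustration (the shapes here are only illustrative):

import numpy as np

X = np.random.randn(50, 40, 2)   # 50 series, 40 timestamps, 2 dimensions
print(X.shape)                   # (50, 40, 2)
print(np.ravel(X).shape)         # (4000,) -- time points and dimensions mixed together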
Taking inspiration from https://tslearn.readthedocs.io/en/stable/auto_examples/clustering/plot_kmeans.html#sphx-glr-auto-examples-clustering-plot-kmeans-py, I have written the following code:
import numpy as np
import matplotlib.pyplot as plt
from tslearn.clustering import TimeSeriesKMeans
from tslearn.datasets import CachedDatasets
from tslearn.preprocessing import TimeSeriesScalerMeanVariance, \
    TimeSeriesResampler

seed = 0
np.random.seed(seed)
X_train, y_train, X_test, y_test = CachedDatasets().load_dataset("Trace")
print(X_train.shape)  # (100, 275, 1)
X_train = np.concatenate([X_train, -X_train], axis=2)
print(X_train.shape)  # (100, 275, 2)
X_train = X_train[y_train < 4]  # Keep first 3 classes
np.random.shuffle(X_train)
# Keep only 50 time series
X_train = TimeSeriesScalerMeanVariance().fit_transform(X_train[:50])
# Make time series shorter
X_train = TimeSeriesResampler(sz=40).fit_transform(X_train)
sz = X_train.shape[1]
print(sz)

# Soft-DTW k-means
print("Soft-DTW k-means")
sdtw_km = TimeSeriesKMeans(n_clusters=3,
                           metric="softdtw",
                           metric_params={"gamma": .01},
                           verbose=True,
                           random_state=seed)
y_pred = sdtw_km.fit_predict(X_train)

for yi in range(3):
    for di in range(2):
        plt.subplot(2, 3, 1 + yi + 3 * di)
        for xx in X_train[y_pred == yi]:
            plt.plot(xx[:, di], "k-", alpha=.2)
        plt.plot(sdtw_km.cluster_centers_[yi, :, di], "r-")
        plt.xlim(0, sz)
        plt.ylim(-4, 4)
        plt.text(0.05, 0.85, f"Cluster {yi + 1}, dim {di + 1}",
                 transform=plt.gca().transAxes)
        if yi == 1 and di == 0:
            plt.title("Soft-DTW $k$-means")

plt.tight_layout()
plt.show()
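As a side note, sdtw_km.cluster_centers_ should have shape (n_clusters, sz, d), i.e. (3, 40, 2) here, so each subplot above draws one dimension of one centroid rather than a single shared center. A quick sanity check, assuming the snippet above was run unchanged:

print(X_train.shape)                   # (50, 40, 2): 50 series, 40 timestamps, 2 dimensions
print(sdtw_km.cluster_centers_.shape)  # (3, 40, 2): one 2-dimensional centroid per cluster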
Does it correspond to what you would like to do?
Hello all,
I want to use two dimensions, i.e. two time series for each participant. I transform the data into the shape expected by the library:
(6431, 5, 2)
However, when I plot the results, both signals are drawn together in a single plot, and I am not sure the features are being considered separately, which is what I want: for example, participant 1 with series A increasing and series B decreasing should be cluster 1. What I get does not make sense; it behaves the same as if the data were one-dimensional, and plotting the dimensions separately with X_train[y_pred == yi, :, 0] or X_train[y_pred == yi, :, 1] does not make sense either, since the cluster centers are the same for both series/dimensions. How can I plot the results when I have two dimensions and make the clusters differentiate by dimension? It would be great to have an example with multiple dimensions in addition to the nice examples in the tutorial. Thanks
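For context, here is a minimal, hypothetical sketch of how such a (6431, 5, 2) dataset could be assembled from two per-participant signals and plotted per dimension, following the same pattern as the answer above (series_a and series_b are placeholder arrays, not real data):

import numpy as np
import matplotlib.pyplot as plt
from tslearn.clustering import TimeSeriesKMeans
from tslearn.preprocessing import TimeSeriesScalerMeanVariance

# Placeholder inputs: one array per signal, one row per participant.
series_a = np.random.randn(6431, 5)
series_b = np.random.randn(6431, 5)

# Stack along a new last axis to get the (n_ts, sz, d) layout tslearn expects.
X = np.stack([series_a, series_b], axis=-1)          # (6431, 5, 2)
X = TimeSeriesScalerMeanVariance().fit_transform(X)  # rescale each series

km = TimeSeriesKMeans(n_clusters=3, metric="softdtw",
                      metric_params={"gamma": .01}, random_state=0)
y_pred = km.fit_predict(X)

# One column per cluster, one row per dimension, as in the example above.
for yi in range(3):
    for di in range(2):
        plt.subplot(2, 3, 1 + yi + 3 * di)
        for xx in X[y_pred == yi]:
            plt.plot(xx[:, di], "k-", alpha=.2)
        plt.plot(km.cluster_centers_[yi, :, di], "r-")
plt.show()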