FlorentF9 / DeepTemporalClustering

:chart_with_upwards_trend: Keras implementation of the Deep Temporal Clustering (DTC) model
MIT License

Agglomerative Clustering without n_clusters #3

Closed by han-so1omon 4 years ago

han-so1omon commented 4 years ago

I am testing this out on a music-similarity dataset, which does not have a predefined number of clusters. Would your DTC library work the same when used with Agglomerative Clustering configured as {n_clusters=None, distance_threshold=d, compute_full_tree=True}?

It would seem that TSClusteringLayer and heatmap generation require n_clusters.
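For reference, here is a minimal sketch of the scikit-learn configuration described above, run on synthetic data (the data, the threshold value, and the linkage choice are illustrative, not part of DTC):

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Synthetic stand-in for latent features: two well-separated blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (20, 8)),
               rng.normal(3.0, 0.1, (20, 8))])

# With n_clusters=None and a distance_threshold, the dendrogram is cut at
# the threshold and the number of clusters is discovered, not specified.
agg = AgglomerativeClustering(n_clusters=None,
                              distance_threshold=5.0,
                              compute_full_tree=True)
labels = agg.fit_predict(X)
print(agg.n_clusters_)  # number of clusters found by the threshold cut
```

The key point is that the cluster count comes out of `n_clusters_` after fitting, which is exactly what DTC's `TSClusteringLayer` cannot do, since it allocates one centroid per cluster up front.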

FlorentF9 commented 4 years ago

DTC is a deep clustering method, meaning that it jointly optimizes the representation of the data (via the autoencoder) and the clustering. Optimization is done with gradient descent (SGD), as usual for neural networks. If you look at the loss function, it is a combination of the autoencoder's reconstruction MSE and a KL-divergence clustering loss. For this reason, we need a clustering algorithm whose parameters can be optimized either by gradient descent (here we use a soft center-based clustering, similar to k-means but with a differentiable KL-divergence loss function) or by alternating optimization of the AE and the clustering (i.e. updating only one set of parameters at a time).
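The joint loss described above can be sketched in numpy. This follows the DEC-style scheme that DTC's clustering loss builds on (Student-t soft assignments, a sharpened target distribution, and a KL term added to the reconstruction MSE); the function and variable names are mine, not the repository's:

```python
import numpy as np

def soft_assign(z, centers, alpha=1.0):
    # Student-t kernel: soft assignment q_ij of embedding i to center j.
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    # Sharpened target P that emphasizes high-confidence assignments.
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def kl_loss(p, q):
    # KL(P || Q), averaged over samples; differentiable w.r.t. centers.
    return float((p * np.log(p / q)).sum(axis=1).mean())

rng = np.random.default_rng(1)
z = rng.normal(size=(50, 4))        # stand-in for encoder outputs
centers = rng.normal(size=(3, 4))   # stand-in for cluster centers

q = soft_assign(z, centers)
p = target_distribution(q)
reconstruction_mse = 0.1            # placeholder for the AE's MSE term
total_loss = reconstruction_mse + kl_loss(p, q)
```

Because every step here is differentiable in `z` and `centers`, SGD can update the encoder and the centroids jointly, which is precisely what an agglomerative merge procedure does not offer.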

Either way, I don't see how it could be used with agglomerative clustering, because agglomerative clustering has no straightforward loss function and cannot be optimized with SGD.

BUT you can of course use only the ConvLSTM autoencoder to first encode your data (using only reconstruction loss), and then apply agglomerative clustering on the latent representations, using any distance metric you like.
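That two-step alternative can be sketched as follows. PCA stands in for the ConvLSTM autoencoder's encoder here, and the synthetic data and threshold value are illustrative; in practice you would call your trained encoder to produce the latent representations:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.decomposition import PCA

# Synthetic "time series features"; two groups in a 16-d space.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-2.0, 0.3, (30, 16)),
               rng.normal(2.0, 0.3, (30, 16))])

# Step 1: encode to a latent space (PCA is a stand-in for the trained
# autoencoder's encoder, which would be fit with reconstruction loss only).
latent = PCA(n_components=4).fit_transform(X)

# Step 2: agglomerative clustering on the latents, with the cluster count
# decided by a distance threshold rather than fixed in advance.
agg = AgglomerativeClustering(n_clusters=None,
                              distance_threshold=10.0,
                              compute_full_tree=True)
labels = agg.fit_predict(latent)
```

Since the clustering runs entirely outside the network, any linkage or distance metric supported by scikit-learn (or a precomputed distance matrix, e.g. DTW distances) can be used in step 2.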

Concerning the heatmap: it is based on a supervised classification network, so it needs to know the number of classes.