XifengGuo / DEC-keras

Keras implementation for Deep Embedding Clustering (DEC)
MIT License
477 stars 162 forks

Why are there so many dataset-specific hyperparameters? #4

Closed. ZhaofengWu closed this issue 6 years ago

ZhaofengWu commented 6 years ago

Here: https://github.com/XifengGuo/DEC-keras/blob/fb28f34dc5ad9b88f80e4beeeb1e877561d3f8d8/DEC.py#L296

How should we set those when clustering a new dataset?

XifengGuo commented 6 years ago

@ZhaofengWu It's one of the limitations of the DEC algorithm. Empirically, you can set update_interval = 140 for large datasets and 20-30 for small ones. Even so, you will most likely need to tune these values by analyzing the results on your new dataset.
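
As a rough illustration of the rule of thumb above, here is a minimal sketch that picks candidate update_interval values from the dataset size. The helper name `candidate_update_intervals` and the `large_threshold` cutoff are assumptions made for this example; the thread does not say where "small" ends and "large" begins, and none of this is part of DEC-keras.

```python
def candidate_update_intervals(n_samples, large_threshold=50_000):
    """Return update_interval values worth trying for a dataset.

    Follows the rule of thumb from the thread: 140 for large datasets,
    20-30 for small ones. `large_threshold` is an assumed cutoff, not a
    value given in the thread; adjust it for your own data.
    """
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    if n_samples >= large_threshold:
        return [140]
    # For small datasets, try a few values across the suggested 20-30 range.
    return [20, 25, 30]


# Example: list the candidates for a small and a large dataset.
for size in (2_000, 70_000):
    print(size, candidate_update_intervals(size))
```

The returned values are only starting points. Each candidate would still need a full clustering run, followed by a comparison of the resulting clusters on the new dataset.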

ZhaofengWu commented 6 years ago

Understood, but doesn’t that kind of defeat the purpose of being unsupervised?

XifengGuo commented 6 years ago

@ZhaofengWu I suppose so. I believe that developing hyperparameter-free or hyperparameter-insensitive algorithms is one of the directions deep clustering is heading.

ZhaofengWu commented 6 years ago

Thanks!