shahsohil / DCC

This repository contains the source code and data for reproducing the results of the Deep Continuous Clustering paper
MIT License

Too many clusters after DCC #17

Closed daihengming closed 5 years ago

daihengming commented 5 years ago

Hey @shahsohil, I'm trying to apply DCC to my own image dataset, and the result has too many clusters (1000 clusters for 3000 images). Is there any way to force the algorithm to produce fewer clusters (say 5-10)? Could you please give me some suggestions? Thanks.

shahsohil commented 5 years ago

Hi @daihengming,

Some checks:

  1. The input data must be normalized; refer to make_data.py for how this is done. The same normalization must also be applied when creating the underlying graph; refer to the edge_construction.py code from the RCC database.
  2. If it is still not resolved, try increasing the 'k' for mkNN during the graph construction stage.
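
To make the two checks above concrete, here is a minimal NumPy sketch. It is not the code from make_data.py or edge_construction.py: the min-max scaling, the function names `normalize_features` and `mutual_knn_edges`, and the random toy data are all assumptions for illustration, and the real scripts may normalize differently or add further steps. It only shows that the same normalized features feed the graph step, and that a larger 'k' can only add mutual-kNN edges, which should tend to link more points together.

```python
import numpy as np


def normalize_features(X):
    """Min-max scale each feature to [0, 1].

    Assumption: the actual normalization in make_data.py may differ;
    the point is to apply the *same* one before building the graph.
    """
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant features
    return (X - lo) / span


def mutual_knn_edges(X, k):
    """Return undirected edges (i, j), i < j, where i and j are each
    among the other's k nearest neighbours (Euclidean distance)."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour

    nn = np.argsort(dist, axis=1)[:, :k]  # k nearest neighbours per row
    is_nn = np.zeros((n, n), dtype=bool)
    is_nn[np.arange(n)[:, None], nn] = True

    mutual = is_nn & is_nn.T  # keep only reciprocal neighbour pairs
    i, j = np.nonzero(np.triu(mutual, k=1))
    return np.stack([i, j], axis=1)


# Toy data standing in for image features (hypothetical).
rng = np.random.default_rng(0)
X = normalize_features(rng.normal(size=(60, 5)))

edges_small_k = mutual_knn_edges(X, k=3)
edges_large_k = mutual_knn_edges(X, k=10)
print(len(edges_small_k), len(edges_large_k))
```

Because each point's 3 nearest neighbours are a subset of its 10 nearest, every edge found with the smaller 'k' is also found with the larger one, so the graph only gets denser as 'k' grows.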

Based on the outcome of the above steps, I will suggest further ones.