bdy9527 / SDCN

Structural Deep Clustering Network
Apache License 2.0

Lack of description about dataset for training #3

Closed rose-jinyang closed 4 years ago

rose-jinyang commented 4 years ago

Hello, thanks for contributing this paper and code. However, I feel the description of the dataset structure is lacking. I am going to apply this to face clustering. Would it also work for clustering 5M faces? Could you explain it in detail?

bdy9527 commented 4 years ago

Thanks for your attention.

In general, if you want to apply our model to other datasets, two preparation steps are required. First, construct the KNN graph based on the similarities between samples; details can be found in calcu_graph.py. The KNN graph is stored as a list of edges. Second, pretrain the autoencoder and save the pre-trained model; details can be found in data/pretrain.py. The pre-trained model is stored in pkl format. Finally, replace the args in sdcn.py and run the code. Hope this helps.
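The first step above can be sketched roughly as follows. This is only a minimal illustration, not the code in calcu_graph.py: the function names `knn_edges` and `save_edges`, the choice of cosine similarity, the default `k`, and the one-edge-per-line text format are all assumptions made for the example.

```python
import numpy as np


def knn_edges(features, k=3):
    """Link every sample to its k most similar samples (cosine similarity).

    Hypothetical helper for illustration; not the repository's implementation.
    """
    x = np.asarray(features, dtype=float)
    # L2-normalise rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    x = x / np.clip(norms, 1e-12, None)
    sim = x @ x.T
    # A sample must not be selected as its own neighbour.
    np.fill_diagonal(sim, -np.inf)
    # Indices of the k largest similarities in each row.
    neighbors = np.argsort(-sim, axis=1)[:, :k]
    return [(i, int(j)) for i in range(len(x)) for j in neighbors[i]]


def save_edges(edges, path):
    """Write one 'source target' pair per line (assumed file format)."""
    with open(path, "w") as f:
        for i, j in edges:
            f.write(f"{i} {j}\n")


if __name__ == "__main__":
    # Two obvious groups: samples 0 and 1 are close, samples 2 and 3 are close.
    feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(knn_edges(feats, k=1))
```

Note that this sketch builds the full n × n similarity matrix, so its memory use grows quadratically with the number of samples. For a dataset on the order of 5M faces, an approximate nearest-neighbor index would likely be needed instead.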

rose-jinyang commented 4 years ago

Thanks for your quick reply. I hope the above description will be added to the README. I have one more question. Did you review CDP or LTC (GCN-D & GCN-S), which are supervised-learning methods for large-scale face clustering? How does SDCN compare with these methods? Please let me know asap. Thanks.

bdy9527 commented 4 years ago

Thanks for the advice.

Actually, I'm not familiar with large-scale face clustering, and SDCN does not use supervised information. Therefore, the performance may not meet your expectations.

Besides, a suitable encoder is important for deep clustering. For the sake of generality, we use a basic autoencoder to learn representations for different types of data (e.g., image, text, attributes). If you focus on face clustering, pre-trained CNNs may be more helpful.

rose-jinyang commented 4 years ago

Thanks for your reply.

lucy3589 commented 3 years ago

Does this code require cluster_number to be set?
