amazon-science / sccl

PyTorch implementation of "Supporting Clustering with Contrastive Learning", NAACL 2021
MIT No Attribution

Is the model effective? #4

Open yanhan19940405 opened 3 years ago

yanhan19940405 commented 3 years ago

I used the SCCL model in an opinion-clustering scenario. After spending 9 days of work reproducing the paper with the default parameters in the code, I found that the loss curve dropped nicely, but the overall change was not large. (There are 8,000 original samples and 16,000 augmented samples.) [loss-curve screenshots attached]

But when I ran the whole pipeline with the parameters given in the paper, the clustering quality was very poor. I also generated the sentence-embedding matrix for all samples, and the visualization showed that the samples' spatial distribution is uneven. Moreover, a large number of samples overlap one another and cannot be separated under the clustering metric.

The visualization is as follows (each point in the figure represents a sentence sample): [embedding-visualization screenshot attached]

So, please: has anyone achieved better results with the SCCL clustering method? Or is there something wrong on my side?

Dejiao2018 commented 3 years ago

The clustering loss decreases to very small values from the very beginning. There are two possible reasons: 1) the cluster assignment is entirely random and your model does not learn anything; or 2) your code assigns every data sample to a unique cluster. So before you question the effectiveness of our model, I encourage you to check your code, or use our code instead, and see whether 1) or 2) happened in your case. A simple check would be to plot the entropy and the conditional entropy of q (the cluster-assignment probability).
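
For reference, a minimal sketch of that entropy check in PyTorch (the random `q` at the bottom is only a stand-in; substitute the N x K soft assignment matrix produced by your model):

```python
import math
import torch

def assignment_entropies(q, eps=1e-12):
    """Diagnostics for an (N, K) soft cluster-assignment matrix q whose rows sum to 1.

    Returns:
        h_marginal:    entropy of the average assignment distribution; a value
                       near 0 suggests most samples collapse into one cluster.
        h_conditional: mean per-sample entropy; a value near log(K) suggests
                       the assignments are essentially uniform for every sample.
    """
    q = q.clamp_min(eps)
    q_bar = q.mean(dim=0)                                 # marginal cluster distribution
    h_marginal = -(q_bar * q_bar.log()).sum().item()
    h_conditional = -(q * q.log()).sum(dim=1).mean().item()
    return h_marginal, h_conditional

# Stand-in q so the snippet runs on its own; replace with your model's output.
q = torch.softmax(torch.randn(8000, 20), dim=1)
h_m, h_c = assignment_entropies(q)
print(f"H(q_bar) = {h_m:.3f}, H(Q|X) = {h_c:.3f}, log K = {math.log(20):.3f}")
```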

As for your plot, can you provide more context? Is it a t-SNE plot? If so, shouldn't you have multiple colors, each associated with a ground-truth cluster?
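
If it helps, a minimal sketch of such a colored t-SNE plot; the random `embeddings` and `labels` below are placeholders for your (N, d) sentence-embedding matrix and the cluster id of each sample:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Stand-in data; replace with your sentence embeddings and cluster labels.
embeddings = np.random.randn(2000, 128)
labels = np.random.randint(0, 20, size=2000)

# Project the embeddings to 2D and color each point by its cluster label.
coords = TSNE(n_components=2, init="pca", random_state=0).fit_transform(embeddings)
plt.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="tab20", s=3)
plt.title("t-SNE of sentence embeddings, colored by cluster")
plt.show()
```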

We use the same set of hyperparameters for the eight datasets reported in our paper to demonstrate the effectiveness of SCCL. However, if your data statistics are too different from ours, you need to find the proper hyperparameters.

Thanks

yanhan19940405 commented 3 years ago

Well, thank you. The learning rate in your code differs from the one described in the paper. Also, did you train with the batch size set to 400 and the number of steps set to 3000?


Dejiao2018 commented 2 years ago


3000 is the maximum number of steps. Training should stop when the loss stops decreasing or the predictions stay roughly the same.
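
A minimal sketch of that stopping rule, with `train_step` standing in for one SCCL optimization step (hypothetical helper, not part of this repo):

```python
import random

def train_step():
    # Hypothetical stand-in; replace with one real training step that
    # returns the current total loss (contrastive + clustering).
    return random.random()

best_loss, bad_steps, patience = float("inf"), 0, 20
for step in range(3000):                   # 3000 = maximum number of steps
    loss = train_step()
    if loss < best_loss - 1e-4:            # meaningful improvement
        best_loss, bad_steps = loss, 0
    else:
        bad_steps += 1
    if bad_steps >= patience:              # loss has plateaued
        print(f"stopping early at step {step}")
        break
```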