Closed by pmorerio 4 years ago
Hi, thank you for your question.
Unfortunately, I did not conduct an ablation study on this component with the MMT framework.
But when conducting experiments on baseline models, i.e. training with only hard pseudo labels, I found that cf = (cf_1+cf_2)/2 performed better than simply taking cf_1 or cf_2. I did not try concatenation, since it increases the feature dimension and therefore the computational cost of clustering.
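To make the dimension argument concrete, here is a minimal pure-Python sketch, not the repository's code. The helper names `average_features` and `concatenate_features` and the toy values are assumptions for illustration; `cf_1` and `cf_2` stand in for the per-sample features produced by the two networks.

```python
# Illustrative sketch only: plain lists stand in for the feature tensors.

def average_features(cf_1, cf_2):
    """Element-wise mean of two feature matrices with identical shapes."""
    if len(cf_1) != len(cf_2):
        raise ValueError("both models must provide features for the same samples")
    averaged = []
    for row_1, row_2 in zip(cf_1, cf_2):
        if len(row_1) != len(row_2):
            raise ValueError("feature dimensions must match to be averaged")
        averaged.append([(a + b) / 2 for a, b in zip(row_1, row_2)])
    return averaged


def concatenate_features(cf_1, cf_2):
    """Per-sample concatenation; the feature dimension becomes the sum of both."""
    if len(cf_1) != len(cf_2):
        raise ValueError("both models must provide features for the same samples")
    return [list(row_1) + list(row_2) for row_1, row_2 in zip(cf_1, cf_2)]


# Toy features: 2 samples, 2 dimensions per model (hypothetical values).
cf_1 = [[1.0, 2.0], [3.0, 4.0]]
cf_2 = [[3.0, 0.0], [1.0, 2.0]]

cf_avg = average_features(cf_1, cf_2)      # keeps the dimension at 2
cf_cat = concatenate_features(cf_1, cf_2)  # doubles the dimension to 4

print(len(cf_avg[0]), len(cf_cat[0]))
```

Averaging keeps the clustering input at the original feature dimension, while concatenation doubles it, so every distance computed during clustering touches twice as many values.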
Thank you very much for the quick answer!
Did you also try averaging features when testing the models?
No, I only use one model at inference, i.e. the features for testing are output by a single model directly.
https://github.com/yxgeee/MMT/blob/aeb547079c65d9aa2b8ce08d587970412d362b07/examples/mmt_train_kmeans.py#L156
Hi, is there a reason why you perform clustering on averaged features? Did you find empirically that this works better than other options like concatenation or simply taking either cf_1 or cf_2? Thanks for your time and again thanks for sharing the code.