zycleo closed this issue 2 years ago
Dear author, I have a question about how the prototypes are obtained. Did you first use a pretrained model to extract visual and textual features, then concatenate them and run k-means with 20 clusters? Were the features normalized before clustering? Is each cluster center then used as a prototype? Could you please explain the procedure in detail? Thank you very much.
Hi, thanks for your interest. Yes, the details are given in our paper, and they match what you described.
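For readers following along, the steps confirmed above can be sketched roughly as follows. This is only an illustration, not the authors' code: the random arrays stand in for features from a pretrained model, the feature dimensions (64 and 32) are made up, normalizing each modality separately before concatenation is an assumption (the thread only says features are normalized before clustering), and the helper names `l2_normalize`, `kmeans`, and `build_prototypes` are hypothetical. A plain NumPy Lloyd's k-means is used here for self-containment; the paper may use a different k-means implementation.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)

def kmeans(x, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct samples.
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    labels = np.zeros(len(x), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers, labels

def build_prototypes(visual, textual, k=20, seed=0):
    """Normalize, concatenate, cluster; the cluster centers are the prototypes."""
    # Assumption: each modality is normalized on its own before concatenation.
    joint = np.concatenate([l2_normalize(visual), l2_normalize(textual)], axis=1)
    return kmeans(joint, k, seed=seed)

# Placeholder features; in practice these would come from a pretrained model.
rng = np.random.default_rng(0)
visual_feats = rng.normal(size=(500, 64))
textual_feats = rng.normal(size=(500, 32))

prototypes, assignments = build_prototypes(visual_feats, textual_feats, k=20)
print(prototypes.shape)  # one prototype per cluster, in the joint feature space
```

Each row of `prototypes` is one cluster center in the concatenated visual and textual feature space, and `assignments` gives the cluster index of each sample.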