adambielski / siamese-triplet

Siamese and triplet networks with online pair/triplet mining in PyTorch
BSD 3-Clause "New" or "Revised" License
3.09k stars 634 forks

Performance of Triplet on huge number of classes ? #48

Closed BerenLuthien closed 4 years ago

BerenLuthien commented 4 years ago

Hey, is there any plan to include a more complicated dataset, one with 10K–100K classes? Intuitively, MNIST is easy because it has only 10 classes to cluster.

Thanks for sharing the great work BTW :)

adambielski commented 4 years ago

It can be trained on other datasets; it has been done successfully, e.g. for faces (see the FaceNet reference in the readme). The main change is a bigger architecture.

BerenLuthien commented 4 years ago

Thanks for the response. I know FaceNet is a famous example.

I meant: do you have any plan to include a larger dataset with more classes in this project? FaceNet was trained on a huge dataset, though. I am looking for a dataset that provides ~10K classes with ~10 samples per class, i.e. about 100K training examples -- basically something trainable on one GPU.
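(For a dataset at that scale, the key piece for online mining is a sampler that puts several examples of each class in every batch, in the spirit of the repo's balanced batch sampler. A minimal self-contained sketch; the class name and arguments here mirror that idea but are written from scratch, not copied from the repo:)

```python
import numpy as np

class BalancedBatchSampler:
    """Yield batches of n_classes * n_samples indices: n_samples per class.

    Minimal sketch of a balanced sampler for online pair/triplet mining;
    pass it to DataLoader via the batch_sampler argument.
    """
    def __init__(self, labels, n_classes, n_samples):
        self.labels = np.asarray(labels)
        self.classes = np.unique(self.labels)
        self.class_indices = {c: np.where(self.labels == c)[0] for c in self.classes}
        self.n_classes = n_classes
        self.n_samples = n_samples
        self.n_batches = len(self.labels) // (n_classes * n_samples)

    def __iter__(self):
        for _ in range(self.n_batches):
            chosen = np.random.choice(self.classes, self.n_classes, replace=False)
            batch = []
            for c in chosen:
                idx = self.class_indices[c]
                # sample with replacement only if the class is too small
                batch.extend(np.random.choice(idx, self.n_samples,
                                              replace=len(idx) < self.n_samples))
            yield batch

    def __len__(self):
        return self.n_batches
```

With ~10 samples per class, something like `n_classes=25, n_samples=4` gives batches of 100 from which triplets can be mined online.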

And I enjoy your PyTorch code more than Google's TF 1.x FaceNet code :) Besides, Google did not (and cannot) open-source that dataset. Thanks

adambielski commented 4 years ago

I have no plans to include more datasets; the examples were only meant to show how the approach can be used and the intuition behind it.

BerenLuthien commented 4 years ago

QQ: are these embeddings normalized, i.e. is ||embed|| = 1? Thanks
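(Whether the repo normalizes its embeddings isn't answered in this thread, but the property is easy to check, and to enforce after the fact, with `torch.nn.functional.normalize`. A minimal sketch:)

```python
import torch
import torch.nn.functional as F

# A batch of raw (unnormalized) embeddings standing in for network output.
embeddings = torch.randn(4, 128)

# L2-normalize each row so that ||embed|| = 1; useful if cosine distance
# is the intended metric at retrieval time.
normalized = F.normalize(embeddings, p=2, dim=1)

norms = torch.linalg.norm(normalized, dim=1)  # each entry is ~1.0
```

The same one-liner can be appended to a network's `forward` if unit-norm embeddings are wanted during training as well.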