Closed by x-tabdeveloping 10 months ago
Sounds like there might be a reason to use a different loss, but MultipleNegativesRankingLoss is probably a good baseline to go with, and we can experiment on top of that afterwards.
I consider this done
@KennethEnevoldsen pointed me to ContrastiveTensionLoss as an example of how one could sample in-batch negatives, but as that example shows, Contrastive Tension loss with in-batch negatives is paired with an unsupervised training objective, so it's probably not what we're looking for.
I think MultipleNegativesRankingLoss is what we're looking for. As per the docs, it essentially does the same thing as the ContrastiveParallel task that I wrote, but with in-batch negative examples and a fixed number of negative samples.
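To make the objective concrete, here is a minimal pure-Python sketch of the in-batch-negatives cross-entropy that MultipleNegativesRankingLoss is based on. The function name `mnr_loss` and the toy similarity matrix are mine, not from the library; a real implementation works on embedding tensors with a scaled cosine similarity.

```python
import math

def mnr_loss(sim):
    """Sketch of in-batch-negatives cross-entropy (hypothetical helper).

    sim[i][j] is the similarity between anchor i and candidate j;
    the diagonal sim[i][i] holds the positive pairs, every off-diagonal
    entry in row i acts as a negative for anchor i.
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        # -log softmax at the positive: log-sum-exp over the row
        # minus the positive's score
        log_z = math.log(sum(math.exp(s) for s in sim[i]))
        total += log_z - sim[i][i]
    return total / n

# Positives score far above in-batch negatives -> loss near zero
sim = [[10.0, 0.0, 0.0],
       [0.0, 10.0, 0.0],
       [0.0, 0.0, 10.0]]
print(mnr_loss(sim))
```

With a completely uninformative similarity matrix (all entries equal), the loss is `log(batch_size)`, which is the usual sanity check for this objective.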
MultipleNegativesSymmetricRankingLoss could also work quite well, judging by the documentation.
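The symmetric variant adds the reverse direction: besides picking the right positive for each anchor, it also picks the right anchor for each positive. A minimal sketch of that idea (helper names `in_batch_ce` and `symmetric_loss` are mine, and this ignores the library's similarity scaling):

```python
import math

def in_batch_ce(sim):
    # In-batch cross-entropy with positives on the diagonal
    n = len(sim)
    total = 0.0
    for i in range(n):
        log_z = math.log(sum(math.exp(s) for s in sim[i]))
        total += log_z - sim[i][i]
    return total / n

def symmetric_loss(sim):
    """Average the anchor->positive loss with the positive->anchor loss,
    i.e. the same cross-entropy on the transposed similarity matrix."""
    n = len(sim)
    sim_t = [[sim[j][i] for j in range(n)] for i in range(n)]
    return 0.5 * (in_batch_ce(sim) + in_batch_ce(sim_t))
```

For a symmetric similarity matrix the two directions coincide, so the symmetric loss equals the plain one; they differ exactly when anchor and positive encoders disagree.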
I also quite like MegaBatchMarginLoss.
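The core idea there is hard-negative mining: from a large batch, each anchor takes its highest-scoring non-positive as the negative in a margin loss. A toy sketch of just that idea (function name, margin default, and the plain max-over-row mining are my simplifications; the actual library loss processes a mega-batch in chunks with cached embeddings):

```python
def mega_batch_margin_sketch(sim, margin=0.15):
    """Hypothetical sketch of hard-negative margin loss.

    For each anchor i, the hardest negative is the highest-scoring
    candidate j != i in the (mega-)batch; the loss then pushes the
    positive's score above that negative's by at least `margin`.
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        hardest_neg = max(s for j, s in enumerate(sim[i]) if j != i)
        # Hinge: zero once the positive clears the hardest negative
        # by the margin
        total += max(0.0, margin - (sim[i][i] - hardest_neg))
    return total / n
```

The larger the batch, the harder the mined negative tends to be, which is exactly why this loss wants mega-batches.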