Hello, I am a Korean student who loves the work of the sentence-transformers team.
I have used many of the losses you made when fine-tuning an embedding model.
However, I think it would be nice if there were a loss that gives the hard negative a higher weight in the loss calculation.
For example, the input would be (anchor, positive, hard_negative) triplets, and the loss would be computed roughly as

loss = lambda * (term based on the anchor-hard_negative similarity) + (1 - lambda) * (term based on the anchor-in_batch_negative similarities)

Because each input consists of (anchor, positive, hard_negative), the in-batch negatives would include both the positives and the hard negatives from the same batch.
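To make the idea concrete, here is a minimal sketch of the kind of loss I have in mind, loosely following the structure of MultipleNegativesRankingLoss. The class name WeightedHardNegativeLoss, the lam weight, and the way I split the loss into two cross-entropy terms are just my own illustration for this issue, not an existing API in the library:

```python
import torch
from torch import nn, Tensor
from typing import Iterable, Dict
from sentence_transformers import SentenceTransformer, util


class WeightedHardNegativeLoss(nn.Module):
    """Sketch only: weights the explicit hard negative against the in-batch
    negatives. The name, `lam`, and `scale` are my own choices, not an
    existing sentence-transformers loss."""

    def __init__(self, model: SentenceTransformer, lam: float = 0.7, scale: float = 20.0):
        super().__init__()
        self.model = model
        self.lam = lam      # weight on the anchor-hard_negative term
        self.scale = scale  # same role as the scale in MultipleNegativesRankingLoss
        self.cross_entropy = nn.CrossEntropyLoss()

    def forward(self, sentence_features: Iterable[Dict[str, Tensor]], labels: Tensor) -> Tensor:
        # Expects (anchor, positive, hard_negative) columns; `labels` is unused,
        # as in the existing in-batch-negatives losses.
        anchors, positives, hard_negatives = [
            self.model(features)["sentence_embedding"] for features in sentence_features
        ]
        batch_size = anchors.size(0)
        device = anchors.device

        # Term 1: each anchor against only its own positive and its own hard negative.
        pos_scores = util.cos_sim(anchors, positives).diagonal().unsqueeze(1)        # (B, 1)
        hard_scores = util.cos_sim(anchors, hard_negatives).diagonal().unsqueeze(1)  # (B, 1)
        hard_logits = torch.cat([pos_scores, hard_scores], dim=1) * self.scale       # (B, 2)
        hard_loss = self.cross_entropy(
            hard_logits, torch.zeros(batch_size, dtype=torch.long, device=device)
        )

        # Term 2: standard in-batch negatives, i.e. all positives and hard negatives in the batch.
        candidates = torch.cat([positives, hard_negatives], dim=0)        # (2B, d)
        in_batch_logits = util.cos_sim(anchors, candidates) * self.scale  # (B, 2B)
        in_batch_loss = self.cross_entropy(
            in_batch_logits, torch.arange(batch_size, device=device)
        )

        return self.lam * hard_loss + (1.0 - self.lam) * in_batch_loss
```

In this sketch, lam = 0 would reduce to the usual in-batch-negatives objective over (anchor, positive, hard_negative) triplets, while larger lam puts more of the gradient signal on each anchor's explicit hard negative.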
Could I hear your thoughts on this idea?
I would also be grateful if you could point me to any existing losses similar to this that I may have missed.
Thank you for reading.