According to the paper, GritLM uses in-batch negatives as negative samples for contrastive learning. But in the toy embedding dataset, the JSON contains the key "neg", which cannot be removed. So I don't understand why we need additional negative samples when we already have in-batch negatives, or how they are constructed. (From the toy dataset, they don't look like hard negatives of the query sentence.)
These are hard negatives; it uses both in-batch negatives & hard negatives
You can make it not use the hard negatives by setting the train group size to 1, I think.
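For intuition, here is a minimal sketch (not the repo's exact code) of how the "neg" entries can be folded into an InfoNCE-style loss alongside in-batch negatives: each query scores its own positive against all other positives in the batch (the in-batch negatives) plus the mined hard negatives. Names and the temperature value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(q, pos, neg, temperature=0.02):
    """q, pos, neg: (batch, dim) L2-normalized embeddings.
    For query i, the positive is pos[i]; every other pos[j] (j != i) acts as
    an in-batch negative, and neg[i] is its mined hard negative."""
    batch = q.size(0)
    # Candidate pool: all positives (in-batch negatives for the other queries)
    # followed by all hard negatives -> (2 * batch, dim)
    candidates = torch.cat([pos, neg], dim=0)
    # Similarity of every query to every candidate: (batch, 2 * batch)
    logits = q @ candidates.T / temperature
    # The correct candidate for query i is pos[i], i.e. column i
    labels = torch.arange(batch, device=q.device)
    return F.cross_entropy(logits, labels)

# Toy usage with random, normalized embeddings
q = F.normalize(torch.randn(4, 8), dim=-1)
pos = F.normalize(torch.randn(4, 8), dim=-1)
neg = F.normalize(torch.randn(4, 8), dim=-1)
print(contrastive_loss(q, pos, neg))
```

With a train group size of 1, only the positive passage would be kept per query, so the candidate pool reduces to the in-batch positives alone.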