Q1: What does the description below from README mean specifically?
train_group_size: the number of positives and negatives used per query in training. There is always exactly one positive, so this argument controls the number of negatives (#negatives = train_group_size - 1). Note that train_group_size - 1 should not be larger than the number of negatives available in the data field "neg":List[str]. Besides the negatives in this group, the in-batch negatives will also be used in fine-tuning.
For example:
How will the negatives be chosen from "neg":List[str]?
How will the negatives be chosen from the in-batch negatives?
What are the specific strategies here?
Q2: How does train_group_size affect the finetuning performance and how to choose the optimal value?
Q1:
We randomly sample train_group_size - 1 negatives from "neg":List[str].
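The sampling step above can be sketched as follows. This is a minimal illustration, not the library's actual code; the example record and the field names "query", "pos", and "neg" follow the data format described in the README, while the passage strings themselves are made up.

```python
import random

# Hypothetical training record in the README's data format:
# one query, one positive, and a pool of mined negatives ("neg").
example = {
    "query": "what is a prime number",
    "pos": ["A prime number is a natural number greater than 1 ..."],
    "neg": [
        "Prime Video is a streaming service ...",
        "A composite number has more than two divisors ...",
        "The number 1 is neither prime nor composite ...",
        "Optimus Prime is a fictional character ...",
    ],
}

train_group_size = 3  # 1 positive + (train_group_size - 1) negatives

# Randomly sample train_group_size - 1 negatives from the "neg" pool;
# this is why len(example["neg"]) must be >= train_group_size - 1.
negatives = random.sample(example["neg"], train_group_size - 1)
group = [example["pos"][0]] + negatives

print(len(group))  # equals train_group_size
```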
All other passages in the same batch (except the query's own positive) will be used as negatives.
For example, given a batch [[q, p, n1, n2], [q', p', n1', n2']], the negatives used to compute the loss for q are: n1, n2, p', n1', and n2'.
Q2:
A larger train_group_size usually improves performance, since each query is contrasted against more negatives; the tradeoff is higher GPU memory usage per batch, so the optimal value depends on your hardware budget.