A few things:
- Technically, we don't officially support DP with PyTorch, only DDP. You may run into undefined behavior.
- With DDP, you should have `--negatives_x_device` set.
- What you currently have is effectively a batch size of 2 (GPUs) x 32 (queries) x 8 (passages) = 512 (see the sketch below).

Moreover, the performance is roughly the same as what we report in the Condenser paper with BM25 negatives. You'd need a round of hard-negative mining to get better performance, and the best results would also require replacing the Condenser initializer with a coCondenser initializer.
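For intuition, here is a minimal sketch, assuming a standard PyTorch DDP setup, of what sharing negatives across devices amounts to; `gather_with_grad` and `contrastive_loss` are hypothetical names, not the toolkit's actual API:

```python
# Conceptual sketch only -- NOT the toolkit's actual code -- of what a flag
# like --negatives_x_device does under DDP. Assumes dist.init_process_group
# has already been called (e.g. by the torch.distributed launcher).
import torch
import torch.distributed as dist
import torch.nn.functional as F

def gather_with_grad(t: torch.Tensor) -> torch.Tensor:
    """All-gather a tensor from every rank, keeping gradients for the local shard."""
    gathered = [torch.empty_like(t) for _ in range(dist.get_world_size())]
    dist.all_gather(gathered, t.contiguous())
    gathered[dist.get_rank()] = t  # re-insert local tensor so autograd tracks it
    return torch.cat(gathered, dim=0)

def contrastive_loss(q_reps: torch.Tensor, p_reps: torch.Tensor,
                     n_passages: int = 8) -> torch.Tensor:
    # Per GPU: q_reps is (32, dim), p_reps is (32 * 8, dim). Gathering across
    # 2 GPUs makes the passage pool (2 * 32 * 8, dim) = (512, dim), matching
    # the "2 (GPUs) x 32 (queries) x 8 (passages) = 512" arithmetic above.
    q_reps = gather_with_grad(q_reps)
    p_reps = gather_with_grad(p_reps)
    scores = q_reps @ p_reps.T  # (num_queries, num_passages) similarities
    # The positive passage for query i is the first in its block of n_passages.
    target = torch.arange(q_reps.size(0), device=scores.device) * n_passages
    return F.cross_entropy(scores, target)
```

Without the flag, each GPU only scores its queries against its own 256 local passages, so the contrastive task is easier than the 512-passage pool the paper's batch size implies.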
Thanks for your reply! I used DDP to run the command but forgot to set `--negatives_x_device`.
Hi, wonderful work on this toolkit! I really like it!
Following the README here, I used the following command to train the retriever with Condenser on 2 GPUs, which results in a total batch size of 64, the same setting as in the paper:
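(A plausible shape of that command, reconstructed from the batch arithmetic discussed in the reply above; the trainer module and data path are hypothetical placeholders, and `--train_n_passages` and the Condenser checkpoint name are assumptions, not the exact invocation:)

```bash
# Hypothetical reconstruction, not the exact command: "toolkit.driver.train"
# and the data path are placeholders; --train_n_passages and the checkpoint
# name are assumed. Batch shape: 2 GPUs x 32 queries/GPU x 8 passages/query.
# Note that --negatives_x_device is absent, the issue diagnosed in the reply above.
python -m torch.distributed.launch --nproc_per_node=2 \
  -m toolkit.driver.train \
  --output_dir ./retriever_condenser \
  --model_name_or_path Luyu/condenser \
  --train_dir ./marco_train_data \
  --per_device_train_batch_size 32 \
  --train_n_passages 8
```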
The result I got is 0.331:
#####################
MRR @ 10: 0.3308558466366481
QueriesRanked: 6980
#####################
Is there any parameter I missed? Thanks!