luyug / GC-DPR

Train Dense Passage Retriever (DPR) with a single GPU

Multiply by distributed_factor/8. #7

Open · wavy-jung opened this issue 2 years ago

wavy-jung commented 2 years ago

Thanks for posting a really nice repo! While studying the code, I found the following at lines 669 and 691 of `train_dense_encoder.py`: `surrogate = surrogate * (trainer.distributed_factor / 8.)`. I don't fully understand the reason for this multiplication. Could you explain it? Thank you 👍

luyug commented 2 years ago

Take a look at https://github.com/luyug/GC-DPR/issues/4 and see if it helps answer the question.

wavy-jung commented 2 years ago

It helps! Thanks a lot :)