facebookresearch / DPR

Dense Passage Retriever - is a set of tools and models for open domain Q&A task.
Other
1.73k stars 304 forks source link

Doubt regarding all_gather_list in case of DDP #230

Open bhattg opened 2 years ago

bhattg commented 2 years ago

Hi,

Thanks for the amazing framework. I have a doubt regarding the utility of the all_gather_list function, that gathers the tensors across the GPUs. When we are training in DDP, the gradients are synchronized before the parameter updates, therefore, why is this step needed? Is it just to collate the loss or number of correct predictions or the rank (in evaluation)? If yes, then couldn't one gather all of them after computing the loss, instead of exchanging the question and context representations first and then going forward with it?

Thanks!