facebookresearch / multihop_dense_retrieval

Multi-hop dense retrieval for question answering
Other
214 stars 22 forks source link

fix retriever training script #14

Closed ChiaraMC closed 2 years ago

ChiaraMC commented 3 years ago

In train_momentum.py model is an instance of RobertaMomentumRetriever. In a few places the script calls model.module.encoder_q.state_dict() , as well as model.module.queue.clone().detach().t() from mdr.retrieval.criterions.mhop_loss . However, RobertaMomentumRetriever doesn't actually have a module attribute, so the script throws an error. I believe these should be replaced with model.encoder_q.state_dict() and model.queue.clone().detach().t()

ChiaraMC commented 3 years ago

Just realised I had missed one other line that had to be changed in mhop_loss(), I changed it now

xwhan commented 3 years ago

Hi @ChiaraMC, the module attribute once you used dataparrallel. Can you try run the code using multiple gpus.

ChiaraMC commented 3 years ago

Hey @xwhan , yes sorry I only realised once I used it with multiple gpus. This should now work with both one and multiple gpus. However, I think it might look cleaner to use DataParallel even for the case with just 1 gpu? It would save always having to specify if n_gpu > 1 ... I haven't used DataParallel before though so not sure if this would bring other issues