facebookresearch / dpr-scale

Scalable training for dense retrieval models.

Error when training the model #10

Closed: jwxsp1 closed this issue 1 year ago

jwxsp1 commented 1 year ago

Hi, when I try to train the model, I get an error: [image] Do you know how to fix it?

ccsasuke commented 1 year ago

Hi @jwxsp1, as the error message says, you're using the slurm trainer, which raises an error if SLURM is not found. If your cluster does not use SLURM, you should switch to a different trainer such as gpu_1_host.
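
For readers without SLURM, here is a minimal sketch of picking the trainer before launching. It rests on several assumptions that the thread does not confirm: that the presence of `sbatch` on PATH is a reasonable stand-in for "SLURM is available", that the trainer is chosen through a Hydra config-group override of the form `trainer=<name>`, and that the SLURM option is named `slurm`. Only `gpu_1_host` is named above. The helper functions are hypothetical and are not part of dpr-scale.

```python
import shutil


def slurm_available() -> bool:
    """Heuristic (assumption): treat SLURM as available if `sbatch` is on PATH."""
    return shutil.which("sbatch") is not None


def trainer_override(has_slurm: bool) -> str:
    """Build a Hydra-style override string selecting the trainer.

    The `trainer=` key and the `slurm` option name are assumptions;
    `gpu_1_host` is the alternative suggested in this thread.
    """
    name = "slurm" if has_slurm else "gpu_1_host"
    return f"trainer={name}"


if __name__ == "__main__":
    # Print the override to append to the training command line.
    print(trainer_override(slurm_available()))
```

On a machine without SLURM this prints an override selecting `gpu_1_host`, which would then be appended to the usual training command. Check the project's config directory for the trainer names that actually exist.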

jwxsp1 commented 1 year ago

Hello, thank you for your reply. I tried replacing the trainer, but now there is a new problem. How can I solve it? I would appreciate any help. [image]

ccsasuke commented 1 year ago

@jwxsp1 This line in the default config is supposed to address this error. Are you loading the configs correctly? Our project uses Hydra to manage configs and command-line arguments, in case you'd like to learn more about how to change settings such as the trainer.