deepset-ai / haystack-tutorials

Here you can find all the Tutorials for Haystack 📓
https://haystack.deepset.ai/tutorials
Apache License 2.0

Error when training DPR model on my own dataset (Tutorial 9) #309

Closed ranieristyaa closed 3 months ago

ranieristyaa commented 3 months ago

Describe the issue: I am trying to fine-tune a DPR model on my own dataset. The training process used to run without errors; the last time I ran it was about a week ago. Now, when I run the training again with the same data, the same code, and the same environment, it keeps failing with an error like this: (screenshot)

To Reproduce: here is my Colab code: https://colab.research.google.com/drive/1bKR4cNkxQwJhmm_gXfhdgHNmKIsgvu-R?usp=sharing and the data I am using: answersDPR.json

Expected behavior: the code is supposed to run correctly, like this: (screenshot)

The model should then be fine-tuned successfully.

What environment did you try to run the tutorial on?:

anakin87 commented 3 months ago

Probably something has changed in the latest versions of PyTorch.

I managed to fix the error with the following commands:

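import torch.distributed as dist
import os

# Rendezvous address and port; localhost is enough for a single process.
os.environ['MASTER_ADDR'] = '127.0.0.1'
os.environ['MASTER_PORT'] = '29500'

# Create a one-process group (rank 0, world size 1) using the gloo backend.
dist.init_process_group("gloo", rank=0, world_size=1)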

More information is in the PyTorch docs for torch.distributed (see init_process_group and environment variable initialization).
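In a notebook, cells are often re-run, and calling init_process_group a second time in the same process raises an error. The following is a minimal sketch of a guarded version of the workaround above, not part of the original fix. The helper names `set_rendezvous_defaults` and `ensure_single_process_group` are hypothetical; the sketch only assumes that `torch` is installed and relies on `dist.is_available()` and `dist.is_initialized()`.

```python
import os


def set_rendezvous_defaults(env=os.environ, addr="127.0.0.1", port="29500"):
    """Fill in MASTER_ADDR / MASTER_PORT only if they are not already set.

    Hypothetical helper: existing values are left untouched.
    """
    env.setdefault("MASTER_ADDR", addr)
    env.setdefault("MASTER_PORT", port)
    return env["MASTER_ADDR"], env["MASTER_PORT"]


def ensure_single_process_group(backend="gloo"):
    """Create a one-process group unless one already exists.

    Returns True if this call initialized the group, False otherwise.
    """
    # Imported lazily so set_rendezvous_defaults can be used without torch.
    import torch.distributed as dist

    # Skip if distributed support is missing or a group is already set up.
    if not dist.is_available() or dist.is_initialized():
        return False

    set_rendezvous_defaults()
    dist.init_process_group(backend, rank=0, world_size=1)
    return True
```

Calling `ensure_single_process_group()` once before the training cell should have the same effect as the commands above on a fresh runtime, and do nothing on a re-run. Whether this is needed at all depends on the installed PyTorch and Haystack versions.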

ranieristyaa commented 3 months ago

It is fixed, thank you.