Closed ShivamSharma1997 closed 3 years ago
Noting for reference that using PyTorch 1.6 resolves this: https://github.com/salesforce/GeDi/issues/6#issuecomment-738605759. Are you using the PyTorch Docker image mentioned in the README?
@akhileshgotmare Installing PyTorch 1.6 with
pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
resolved this, so the README should be updated.
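Before rerunning training, it may help to confirm which versions are actually installed in the active environment. The snippet below is a minimal sketch using only the standard library; the helper names `installed_version` and `matches_pin` are hypothetical, and the pinned releases are simply the ones quoted in the pip command above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(version, pinned_release):
    """Compare a version such as '1.6.0+cu101' against a release such as '1.6.0'.

    The local build tag after '+' (for example the CUDA variant) is ignored.
    """
    if version is None:
        return False
    return version.split("+")[0] == pinned_release


if __name__ == "__main__":
    # Pins taken from the pip command quoted in this thread.
    for package, pin in (("torch", "1.6.0"), ("torchvision", "0.7.0")):
        found = installed_version(package)
        print(package, found, "OK" if matches_pin(found, pin) else "does not match " + pin)
```

If either line reports a mismatch, the environment is probably not the one the pip command was run in.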
I am trying to train GeDi on my own model and get the following error on both my dataset and yours.
Traceback (most recent call last):
  File "../train_GeDi.py", line 1103, in <module>
    main()
  File "../train_GeDi.py", line 1052, in main
    global_step, tr_loss = train(args, train_dataset, model, tokenizer)
  File "../train_GeDi.py", line 356, in train
    loss_b*=loss_mask
RuntimeError: diff_view_meta->outputnr == 0 INTERNAL ASSERT FAILED at "/opt/conda/conda-bld/pytorch_1603729062494/work/torch/csrc/autograd/variable.cpp":363, please report a bug to PyTorch.
I narrowed it down to training the model on multiple GPUs, but after setting the default GPU to 0 I still get the same error. Please help!
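The traceback points at the in-place multiplication `loss_b*=loss_mask`. The fix confirmed in this thread is installing PyTorch 1.6. For anyone who cannot downgrade, one thing that might be worth trying is the out-of-place form of the same masking step. The sketch below only illustrates that form on toy tensors: the shapes and values are made up, `logits` and `masked` are hypothetical names, and it has not been verified against `train_GeDi.py` or confirmed to remove the assertion.

```python
import torch

# Toy stand-ins for the per-token loss and its mask (hypothetical values).
logits = torch.randn(2, 3, requires_grad=True)
loss_b = logits * 2.0
loss_mask = torch.tensor([[1.0, 1.0, 0.0],
                          [1.0, 0.0, 0.0]])

# Out-of-place masking: builds a new tensor instead of modifying loss_b in place.
masked = loss_b * loss_mask

# Backpropagate through the masked loss; masked-out positions get zero gradient.
masked.sum().backward()
print(logits.grad)
```

The numerical result matches the in-place version; the only difference is that no existing tensor is overwritten, which is the operation the failing line performs.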