kushalj001 / pytorch-question-answering

Important paper implementations for Question Answering using PyTorch
MIT License

RuntimeError: mat1 dim 1 must match mat2 dim 0 #5

Open sathsaraRasantha opened 3 years ago

sathsaraRasantha commented 3 years ago

First of all, you have done fantastic work here, and I would like to thank you for that. I am trying to use your implementation on a different QA dataset (a Sinhala translation of SQuAD 1.0). The only changes I made were using a different dataset (in the same format), running on Google Colab, and using FastText word embeddings instead of GloVe. I am getting an error when I call the train function. I didn't make any changes to "class BiDAF" or the train function. This is the error I am getting.

Starting training ........ Starting batch: 0

RuntimeError Traceback (most recent call last)

in ()
----> 1 train(model, train_dataset)

9 frames

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in linear(input, weight, bias)
   1690         ret = torch.addmm(bias, input, weight.t())
   1691     else:
-> 1692         output = input.matmul(weight.t())
   1693         if bias is not None:
   1694             output += bias

RuntimeError: mat1 dim 1 must match mat2 dim 0

Can you take a look at this whenever you have some free time? It would be great if you could help me with this.

kushalj001 commented 3 years ago

Hi @sathsaraRasantha. I have not been able to look at your issue due to other commitments. I'll try to take a look this weekend. It would help if you could share your notebook via Colab (or any other way) so that I can debug it easily. I would also suggest checking the shapes of your tensors at each step, since a shape mismatch somewhere is what is going wrong for you.
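One way to check shapes at each step without editing the model is to attach forward pre-hooks that record the input shape of every leaf layer. The snippet below is only a sketch: `log_input_shapes` is a hypothetical helper, and the small `nn.Sequential` model is a stand-in for the BiDAF class, which is not reproduced here. Because a pre-hook fires before the layer runs, the last recorded entry is the layer that raised the error, together with the input shape it received. One plausible cause after swapping embeddings (an assumption, not confirmed in this thread) is that the new vectors have a different dimension than the one the layers were sized for.

```python
import torch
import torch.nn as nn


def log_input_shapes(model):
    """Record (layer name, input shape) just before each leaf layer runs.

    Hypothetical helper for illustration; not part of the repository.
    """
    records, handles = [], []
    for name, module in model.named_modules():
        if len(list(module.children())) > 0:
            continue  # skip containers, hook only leaf layers

        def pre_hook(mod, args, name=name):
            first = args[0] if args else None
            shape = tuple(first.shape) if torch.is_tensor(first) else None
            records.append((name, shape))

        handles.append(module.register_forward_pre_hook(pre_hook))
    return records, handles


# Toy stand-in model (assumption): embedding followed by a linear layer.
toy = nn.Sequential(nn.Embedding(50, 8), nn.Linear(8, 4))
records, handles = log_input_shapes(toy)

toy(torch.randint(0, 50, (2, 5)))  # batch of 2 sequences, 5 tokens each

for handle in handles:
    handle.remove()  # detach the hooks once debugging is done

for name, shape in records:
    print(name, shape)
```

If the forward pass raises, `records` is still populated up to the failing layer, so it can be printed from an `except` block.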

sathsaraRasantha commented 3 years ago

Hi @kushalj001. Thank you so much for replying. I solved that issue by looking at the shapes of the tensors, as you suggested. But now there is a different error. I'll share the link to the Colab notebook; I am sure you will be able to debug it easily. I also made some changes to the data preprocessing steps, and I have a feeling that the error is related to those.

This is the error I am getting:

Starting training ........ Starting batch: 0

IndexError Traceback (most recent call last)

in ()
----> 1 train(model, train_dataset)

2 frames

in make_char_vector(self, max_sent_len, max_word_len, sentence)
     23         for i, word in enumerate(nlp(sentence, disable=['parser','tagger','ner'])):
     24             for j, ch in enumerate(word.text):
---> 25                 char_vec[i][j] = char2idx.get(ch, 0)
     26
     27         return char_vec

IndexError: index 191 is out of bounds for dimension 0 with size 191

This is the link to the Colab notebook: https://colab.research.google.com/drive/1zBn-jU_y-NbBOXR_eAi6j1lPxO-EhMSa#scrollTo=pLwqIfRtqu8k

I really appreciate your reply to my issue, and I hope you can help me with this too. Thanks in advance!

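The traceback above shows the row index `i` reaching 191 in a buffer with 191 rows, which means the tokenizer produced more tokens for this sentence than `max_sent_len` allows. The thread does not say how this was eventually fixed, so the following is only a sketch of one defensive approach: clip both the token list and each word to the buffer size. It uses plain Python lists and pre-tokenized input instead of the tensor and spaCy pipeline from the notebook; `make_char_matrix` and `max_lengths` are hypothetical names.

```python
def max_lengths(tokenized_sentences):
    """Compute buffer sizes from the same tokenization that fills the buffer."""
    max_sent_len = max(len(tokens) for tokens in tokenized_sentences)
    max_word_len = max(len(word) for tokens in tokenized_sentences for word in tokens)
    return max_sent_len, max_word_len


def make_char_matrix(tokens, max_sent_len, max_word_len, char2idx):
    """Build a max_sent_len x max_word_len matrix of character ids.

    Tokens beyond max_sent_len and characters beyond max_word_len are
    dropped, so the indices can never run past the buffer. Unknown
    characters map to 0, as in the original loop.
    """
    char_mat = [[0] * max_word_len for _ in range(max_sent_len)]
    for i, word in enumerate(tokens[:max_sent_len]):
        for j, ch in enumerate(word[:max_word_len]):
            char_mat[i][j] = char2idx.get(ch, 0)
    return char_mat


# Toy data (assumption): whitespace tokenization and a three-letter vocabulary.
char2idx = {"a": 1, "b": 2, "c": 3}
tokens = "a bb ccc".split()

clipped = make_char_matrix(tokens, 2, 2, char2idx)   # deliberately too small
sizes = max_lengths([tokens])
full = make_char_matrix(tokens, sizes[0], sizes[1], char2idx)
```

Clipping hides the mismatch instead of explaining it. If the lengths were computed with one tokenizer and the buffer is filled with another, computing both from the same tokenization, as `max_lengths` does, is the more direct fix.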
sathsaraRasantha commented 3 years ago

Hey, I found and fixed that error as well. Now I am getting another one. Could you please email me whenever you are free? My email: rasantha.sathsara@gmail.com

This is the error I am getting. And it seems this is the last one.

Starting training ........ Starting batch: 0

RuntimeError Traceback (most recent call last)

in ()
----> 1 train(model, train_dataset)

in train(model, train_dataset)
     16         context, question, char_ctx, char_ques, label, ctx_text, ans, ids = batch
     17
---> 18         context, question, char_ctx, char_ques, label = context.to(device), question.to(device), char_ctx.to(device), char_ques.to(device), label.to(device)
     19
     20

RuntimeError: CUDA error: device-side assert triggered

Marwa-1995 commented 3 years ago

Hi @sathsaraRasantha, did you find a solution to your last question? I have run into the same error.

kushalj001 commented 3 years ago

Hi @sathsaraRasantha @Marwa-1995. The error is most likely caused by a wrong dimension or axis being accessed in a tensor. To get a more readable error message, turn off your GPU and run the code on the CPU. That will show the exact line where the code is breaking.
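A minimal sketch of that CPU debugging step follows. A device-side assert is often triggered by an index that falls outside an embedding table or outside the label range, though that is a common cause and not something confirmed for this notebook. The helper `check_indices` and the toy embedding are hypothetical; the idea is to keep everything on the CPU and validate index tensors before they reach the model. Setting the environment variable `CUDA_LAUNCH_BLOCKING=1` before starting Python is another option if the GPU has to stay on, since it makes CUDA calls synchronous and the traceback then points closer to the failing operation.

```python
import torch


def check_indices(name, ids, size):
    """Raise a readable error if any index falls outside [0, size - 1].

    Hypothetical helper for illustration; not part of the repository.
    """
    lo, hi = int(ids.min()), int(ids.max())
    if lo < 0 or hi >= size:
        raise ValueError(
            f"{name}: indices span [{lo}, {hi}] but valid range is [0, {size - 1}]"
        )


device = torch.device("cpu")  # force CPU while debugging

# Toy embedding table (assumption): 10 entries, 4 dimensions.
embedding = torch.nn.Embedding(10, 4).to(device)

good = torch.tensor([[1, 2, 9]], device=device)
bad = torch.tensor([[1, 2, 10]], device=device)  # 10 is one past the last row

check_indices("context", good, embedding.num_embeddings)
vectors = embedding(good)

try:
    check_indices("context", bad, embedding.num_embeddings)
except ValueError as err:
    message = str(err)
    print(message)
```

The same check can be applied to the word indices, the character indices, and the answer labels of a batch before calling the model.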