Hi, I get exactly the same error as Sandipan99. It works fine for a single sentence, but not for more than one. Thanks in advance.
Thanks @Sandipan99 and @pietroDB for highlighting this error. I have used the parse.sh script to parse multiple sentences in the past -- I will investigate this error and get back to you.
Sorry for this inconvenience, and thank you both for your interest in our work!
Hi, sorry for the belated reply. I am working on resolving the problem -- for now, a quick workaround is to set --eval-batch-size 1 when parsing. This lets you parse multiple sentences.
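For reference, at the Python level the workaround amounts to something like the sketch below. This is only a rough illustration, not code from the repository: `parser` stands for the loaded model, `tagged_sentences` for the tagged input built in src_joint/main.py, and parse_batch is the method that appears in your traceback.

```python
def parse_one_by_one(parser, tagged_sentences):
    """Rough sketch of what --eval-batch-size 1 does: parse each sentence in
    its own batch, so no two sentences ever need to be padded together."""
    trees = []
    for sentence in tagged_sentences:
        # A singleton batch sidesteps the cross-sentence padding entirely;
        # parse_batch returns the trees plus extra output, as in the traceback.
        batch_trees, _ = parser.parse_batch([sentence])
        trees.extend(batch_trees)
    return trees
```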
I tried with two sentences of different lengths, using the parse_quick.sh script as is (with --eval-batch-size 50). It works, and I am unable to reproduce the issue. I am on Python 3.8.
Also, I notice the line numbers in your output do not match what we currently have in the repository; could you try with the latest version of the code? Thank you!
FYI, here are the sentences I used:
This is the first sentence.
This is a much longer sentence, where there is a comma.
Thanks for getting back to me. I had not tried parse_quick.sh; the issue was with parse.sh. The line numbers do not match because I had tried to debug it myself and inserted a few intermediate lines. Anyway, thanks for your help -- I will definitely try parse_quick.sh as you suggested.
Thanks @Sandipan99 for highlighting this problem. Please reopen this issue if you encounter any other problems!
Hi,
I used parse.sh with your pre-trained model to run inference. I noticed that if you pass a single sentence through example_sentences.txt, it works fine. However, if you send multiple sentences in one batch, it throws an error unless the sentences are of equal length.
```
Traceback (most recent call last):
  File "src_joint/main.py", line 794, in <module>
    main()
  File "src_joint/main.py", line 790, in main
    args.callback(args)
  File "src_joint/main.py", line 717, in run_parse
    syntree, _ = parser.parse_batch(tagged_sentences)
  File "/Users/cssh/LAL-Parser/src_joint/KM_parser.py", line 1803, in parse_batch
    annotations, self.current_attns = self.encoder(emb_idxs, pre_words_idxs, batch_idxs, extra_content_annotations=extra_content_annotations)
  File "/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/cssh/LAL-Parser/src_joint/KM_parser.py", line 1191, in forward
    res, current_attns = attn(res, batch_idxs)
  File "/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/cssh/LAL-Parser/src_joint/KM_parser.py", line 425, in forward
    return self.layer_norm(outputs + residual), attns_padded
RuntimeError: The size of tensor a (24) must match the size of tensor b (19) at non-singleton dimension 0
```
In this example I used two sentences, one of length 12 and the other of length 7. My guess is that the shorter input is being padded to match the longer sentence, while some of the other inputs are not, which causes the dimension mismatch. I think I am missing some trick to make it work. I would appreciate your help.
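To show concretely what I mean (this is just a standalone sketch with random tensors, not the parser's code): with my two sentences, padding to the longer one gives 2 * 12 = 24 rows, while the packed, unpadded representation has 12 + 7 = 19 rows, and adding the two reproduces the exact error above.

```python
import torch

# Standalone illustration: adding a tensor padded to the longest sentence in
# the batch to one left at the packed length raises the same RuntimeError.
d_model = 512
outputs = torch.randn(24, d_model)   # padded: 2 sentences * max length 12
residual = torch.randn(19, d_model)  # packed: 12 + 7 tokens, no padding

try:
    outputs + residual
except RuntimeError as e:
    print(e)  # The size of tensor a (24) must match the size of tensor b (19) ...
```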