Closed: dipanjanS closed this issue 3 years ago.
Also, if I look back at my code,
!pip install transformers==2.11.0
still works for me with a larger context (same code as above). Any idea which default model is being used there, and whether that would still work with transformers 3.x?
@LysandreJik , @sshleifer would be great if you could look into this, assign this to the right folks.
Assigned @mfuntowicz, the master of pipelines. He's on holiday right now, so I'll try to look into it in the coming days.
It isn't just long contexts. I was running some QA on SQuAD2.0 and came across an instance where I received that error for a given context and question, even though the context is not that long.
from transformers import pipeline
model_path = "twmkn9/distilbert-base-uncased-squad2"
hfreader = pipeline('question-answering', model=model_path, tokenizer=model_path, device=0)
context = """
The Norman dynasty had a major political, cultural and military impact on
medieval Europe and even the Near East. The Normans were famed for their
martial spirit and eventually for their Christian piety, becoming exponents of
the Catholic orthodoxy into which they assimilated. They adopted the
Gallo-Romance language of the Frankish land they settled, their dialect
becoming known as Norman, Normaund or Norman French, an important literary
language. The Duchy of Normandy, which they formed by treaty with the French
crown, was a great fief of medieval France, and under Richard I of Normandy was
forged into a cohesive and formidable principality in feudal tenure. The
Normans are noted both for their culture, such as their unique Romanesque
architecture and musical traditions, and for their significant military
accomplishments and innovations. Norman adventurers founded the Kingdom of
Sicily under Roger II after conquering southern Italy on the Saracens and
Byzantines, and an expedition on behalf of their duke, William the Conqueror,
led to the Norman conquest of England at the Battle of Hastings in 1066. Norman
cultural and military influence spread from these new European centres to the
Crusader states of the Near East, where their prince Bohemond I founded the
Principality of Antioch in the Levant, to Scotland and Wales in Great Britain,
to Ireland, and to the coasts of north Africa and the Canary Islands.
"""
question2 = "Who assimilted the Roman language?"
hfreader(question=question2, context=context)
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-144-45135f680e80> in <module>()
----> 1 hfreader(question=question2, context=context)
1 frames
/usr/local/lib/python3.6/dist-packages/transformers/pipelines.py in <listcomp>(.0)
1314 ),
1315 }
-> 1316 for s, e, score in zip(starts, ends, scores)
1317 ]
1318
KeyError: 0
But if I change the question and keep the same context, the pipeline completes successfully.
question1 = "Who was famed for their Christian spirit?"
hfreader(question=question1, context=context)
{'answer': 'Normans', 'end': 127, 'score': 0.5337043597899815, 'start': 120}
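Until the underlying bug is fixed, one way to keep a batch evaluation run from crashing on an affected question/context pair is to wrap the pipeline call. This is a minimal workaround sketch, not an official fix; `safe_answer` is a hypothetical helper name, and the fallback dict simply mirrors the shape of the pipeline's normal output:

```python
def safe_answer(qa, question, context):
    """Call a question-answering pipeline (e.g. the hfreader built above),
    degrading gracefully on the KeyError that transformers v3.0.x can raise
    for some question/context pairs."""
    try:
        return qa(question=question, context=context)
    except KeyError as exc:
        # Return an empty-answer dict in the same shape as a normal result,
        # plus the error, so an evaluation loop over SQuAD can keep going.
        return {"answer": "", "score": 0.0, "start": -1, "end": -1,
                "error": "KeyError: {}".format(exc)}
```

Usage would then be `safe_answer(hfreader, question2, context)` in place of the direct call above.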
Thanks @melaniebeck for this; I encountered it earlier today as well. Would definitely be great if the team could figure out how to resolve this in transformers v3.x.
I also encountered this issue (KeyError: 0).
It's not even long text (about 8-12 words).
Sometimes it occurred when I replaced a word in the question with an OOV word.
rv = self.dispatch_request()
File "/home/samsul/.local/lib/python3.6/site-packages/flask/app.py", line 1935, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/samsul/question-answering/app.py", line 23, in search
answer = nlp({'question': question,'context': context})
File "/home/samsul/.local/lib/python3.6/site-packages/transformers/pipelines.py", line 1316, in __call__
for s, e, score in zip(starts, ends, scores)
File "/home/samsul/.local/lib/python3.6/site-packages/transformers/pipelines.py", line 1316, in <listcomp>
for s, e, score in zip(starts, ends, scores)
KeyError: 0
Hello! There have been a few fixes to the pipelines since version v3.0.2 came out. I can reproduce this issue on v3.0.1 and v3.0.2, but not on the master branch, so it has probably been fixed already.
Could you try installing from source (pip install git+https://github.com/huggingface/transformers) and let me know if that fixes your issue?
Hi @LysandreJik,
It seems the problem still occurs, but now it's KeyError: 17.
Input:
!pip install git+https://github.com/huggingface/transformers
from transformers import pipeline
nlp = pipeline('question-answering',model='a-ware/xlmroberta-squadv2',device=0)
nlp({'question': "siapa istri samsul?",'context': "nama saya samsul, saya adalah suami raisa"})
# (Indonesian: question "who is samsul's wife?", context "my name is samsul, I am raisa's husband")
Error:
/usr/local/lib/python3.6/dist-packages/transformers/pipelines.py in __call__(self, *args, **kwargs)
1676 ),
1677 }
-> 1678 for s, e, score in zip(starts, ends, scores)
1679 ]
1680
/usr/local/lib/python3.6/dist-packages/transformers/pipelines.py in <listcomp>(.0)
1676 ),
1677 }
-> 1678 for s, e, score in zip(starts, ends, scores)
1679 ]
1680
KeyError: 17
I also tried the case from @dipanjanS (the first post) and still got an error:
/usr/local/lib/python3.6/dist-packages/transformers/pipelines.py in <dictcomp>(.0)
1636 with torch.no_grad():
1637 # Retrieve the score for the context tokens only (removing question tokens)
-> 1638 fw_args = {k: torch.tensor(v, device=self.device) for (k, v) in fw_args.items()}
1639 start, end = self.model(**fw_args)[:2]
1640 start, end = start.cpu().numpy(), end.cpu().numpy()
ValueError: expected sequence of length 384 at dim 1 (got 317)
https://github.com/huggingface/transformers/blob/f6cb0f806efecb64df40c946dacaad0adad33d53/src/transformers/pipelines.py#L1618 is causing this issue. Padding to max_length solves this problem. Currently, if the text is long, the final span is not padded to the max_seq_len of the model.
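The ValueError above comes from handing torch.tensor a ragged batch: every span of a long context has length 384 except the final one, which is left at its natural length (317 here). A pure-Python sketch of the pad-to-max_length idea, where `pad_span` and `pad_token_id=0` are illustrative choices, not the actual patch:

```python
def pad_span(input_ids, max_seq_len, pad_token_id=0):
    """Right-pad one span's token ids so every span in the batch has the
    same length; torch.tensor() rejects ragged (unequal-length) batches."""
    return input_ids + [pad_token_id] * (max_seq_len - len(input_ids))

# A long context split into spans: all full-length except the final one,
# mirroring the "expected sequence of length 384 ... (got 317)" error above.
spans = [[1] * 384, [1] * 317]
padded = [pad_span(span, 384) for span in spans]
assert [len(span) for span in padded] == [384, 384]
```

With every span padded to the model's max_seq_len, the batch stacks cleanly into a single tensor.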
Yes, agreed. I think that is related to the recent code push from the PR linked earlier. Would be great if the HF team could look into this!
Awesome thanks folks!
Transformers version: 3.0.2
The question-answering models don't seem to work anymore with long text. Any reason why this is happening? I have tried with the default model in
pipeline
as well as with specific models, e.g.:
Sample Code:
Error Message:
I remember this used to work before version 3; I would really appreciate some help on this.