fastforwardlabs / ff14_blog

Apache License 2.0

Evaluating QA: Metrics, Predictions, and the Null Response | NLP for Question Answering #8

Open utterances-bot opened 3 years ago

utterances-bot commented 3 years ago

Evaluating QA: Metrics, Predictions, and the Null Response | NLP for Question Answering

A deep dive into computing QA predictions and when to tell BERT to zip it!

https://qa.fastforwardlabs.com/no%20answer/null%20threshold/bert/distilbert/exact%20match/f1/robust%20predictions/2020/06/09/Evaluating_BERT_on_SQuAD.html

learnercan commented 3 years ago

Hi, first of all, awesome explanation!! Keep it up. I have one issue while running your code: `ValueError: 102 is not in list` at

# question tokens are between the CLS token (101, at position 0) and the first SEP (102) token
question_indexes = [i+1 for i, token in enumerate(tokens[1:tokens.index(102)])]

Please note that I am trying it on your example only.

My query is about how you locate the [SEP] token in the indexes:

1) Can't we directly use 3, since that is the token id of [SEP], instead of 102?
2) What is the meaning of 102 here?

I am new to all this, so if the questions sound crazy to you, please bear with me.

giffarialfarizy commented 3 years ago

https://raw.githubusercontent.com/huggingface/transformers/master/examples/question-answering/run_squad.py

404: Not Found

brgsk commented 2 years ago

@giffarialfarizy HuggingFace's Transformers repo has changed a little since 2020. The run_squad.py script now lives in the examples/legacy/question-answering directory. Here's a working link.

brgsk commented 2 years ago

@learnercan 102 is the sep_token_id of the particular tokenizer used in the article. Different tokenizers assign different ids to their special tokens :)
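To make this concrete, here's a minimal sketch of a more robust version of the article's line: instead of hard-coding 102, pass in the tokenizer's own separator id (in `transformers`, every tokenizer exposes it as `tokenizer.sep_token_id`). The token ids below are made-up BERT-style ids purely for illustration.

```python
# Locate question tokens without hard-coding the [SEP] id.
# Hard-coding 102 raises "ValueError: 102 is not in list" with
# tokenizers whose [SEP] id differs (hence the question about 3).

def question_token_indexes(token_ids, sep_token_id):
    """Indexes of question tokens: everything between [CLS] (position 0)
    and the first [SEP]."""
    first_sep = token_ids.index(sep_token_id)
    return list(range(1, first_sep))

# BERT-style layout: [CLS] q1 q2 [SEP] c1 c2 [SEP]  (illustrative ids)
bert_ids = [101, 2054, 2003, 102, 7592, 2088, 102]
print(question_token_indexes(bert_ids, sep_token_id=102))  # → [1, 2]
```

With a real tokenizer you would call it as `question_token_indexes(tokens, tokenizer.sep_token_id)`, and the same code works whether [SEP] happens to be 102, 3, or anything else.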