google-research / bert

TensorFlow code and pre-trained models for BERT
https://arxiv.org/abs/1810.04805
Apache License 2.0

Start_logit/End_logit in run_squad.py #177

Open Kivin1 opened 5 years ago

Kivin1 commented 5 years ago

Hi,

I am going through the code and am confused by the variables start_logit and end_logit. What do they mean, and are they related to the index of the answer in the paragraph?

Also, if I want to get the index of the answer, what should I modify? Thanks a lot.

Best Regards, kivi

pru007 commented 5 years ago

Hi kivin, any update on this? Were you able to figure it out? I also want to get the index of the answer. :)

-Thanks in advance Prudhvi

pru007 commented 5 years ago

Hi,

orig_doc_start = feature.token_to_orig_map[pred.start_index]
orig_doc_end = feature.token_to_orig_map[pred.end_index]

Lines 837 and 838 of run_squad.py (shown above) give you the indexes of the answer within the paragraph/context.
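To illustrate what those two lookups return, here is a minimal sketch with made-up data. It assumes that token_to_orig_map maps a position in the tokenized input (question plus paragraph word pieces) to an index in the whitespace-split paragraph words, which run_squad.py calls doc_tokens. The toy values and the helper name answer_word_span are hypothetical, not taken from the script.

```python
# Toy stand-ins for the objects used in run_squad.py; all values are made up.
# Paragraph split on whitespace (what the script calls doc_tokens).
doc_tokens = ["BERT", "uses", "wordpieces"]

# Hypothetical feature tokens, by position:
# 0:[CLS] 1:what 2:? 3:[SEP] 4:bert 5:uses 6:word 7:##pie 8:##ces 9:[SEP]
# Several word pieces can map back to the same original word.
token_to_orig_map = {4: 0, 5: 1, 6: 2, 7: 2, 8: 2}


def answer_word_span(token_to_orig_map, start_index, end_index):
    """Map predicted feature-token positions to word indexes in doc_tokens."""
    orig_doc_start = token_to_orig_map[start_index]
    orig_doc_end = token_to_orig_map[end_index]
    return orig_doc_start, orig_doc_end


# Suppose the model predicted the span from feature token 5 to feature token 8.
start, end = answer_word_span(token_to_orig_map, 5, 8)
answer = " ".join(doc_tokens[start:end + 1])
print(start, end, answer)
```

Under this assumption the resulting indexes count whitespace-separated words, not characters, so a character offset would need an extra step that sums the lengths of the preceding words.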

best Regards, Prudhvi

bcbcbcbcbcl commented 5 years ago

In statistics, the logit function, or log-odds, is the logarithm of the odds p/(1 − p), where p is a probability. It maps probability values from (0, 1) to (−∞, +∞). It is the inverse of the sigmoidal "logistic" function or logistic transform used in mathematics, especially in statistics. In deep learning, the term "logits layer" is popularly used for the last layer of a neural network for a classification task, which produces raw prediction values as real numbers ranging over (−∞, +∞). (Taken from Wikipedia.)
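As a small, self-contained illustration of that definition, the sketch below implements the logit, its inverse (the sigmoid), and a softmax that turns a list of raw logits into probabilities. The function names and the example numbers are chosen for illustration only and are not taken from run_squad.py.

```python
import math


def logit(p):
    """Log-odds of a probability p in the open interval (0, 1)."""
    return math.log(p / (1.0 - p))


def sigmoid(x):
    """Inverse of logit: maps any real number back into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Round trip: sigmoid undoes logit (up to floating-point error).
p_back = sigmoid(logit(0.8))

# Made-up per-token scores; a higher logit yields a higher probability.
probs = softmax([2.0, -1.0, 0.5])
print(p_back, probs)
```

The point for reading the model output is that logits are unnormalized scores: only their relative size matters until a softmax is applied.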

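Basically, the start and end logits are used to determine the span of the answer (start index to end index): each token gets a start logit and an end logit, which score how likely that token is to begin or end the answer.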

As you can see in run_squad.py at line 821, the predictions are sorted in descending order by the sum of start logit and end logit before the n-best predictions are selected:

prelim_predictions = sorted(
    prelim_predictions,
    key=lambda x: (x.start_logit + x.end_logit),
    reverse=True)
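A simplified sketch of that ranking idea, using made-up logits, is given below. The Candidate tuple, the rank_spans helper, and the max_answer_length default are hypothetical names and values for illustration; the actual script applies further filtering that is omitted here.

```python
import collections

# Hypothetical record for one candidate answer span.
Candidate = collections.namedtuple(
    "Candidate", ["start_index", "end_index", "start_logit", "end_logit"])


def rank_spans(start_logits, end_logits, max_answer_length=3):
    """Enumerate valid (start, end) pairs and sort them by summed logits."""
    candidates = []
    for s, s_logit in enumerate(start_logits):
        for e, e_logit in enumerate(end_logits):
            # Skip spans that end before they start or that are too long.
            if e < s or e - s + 1 > max_answer_length:
                continue
            candidates.append(Candidate(s, e, s_logit, e_logit))
    # Same sort key as in the thread: highest start_logit + end_logit first.
    return sorted(
        candidates,
        key=lambda x: x.start_logit + x.end_logit,
        reverse=True)


# Made-up per-token scores for a four-token paragraph.
start_logits = [0.1, 2.5, 0.3, -1.0]
end_logits = [-0.5, 0.2, 3.0, 0.4]

best = rank_spans(start_logits, end_logits)[0]
print(best.start_index, best.end_index)
```

Here token 1 has the highest start logit and token 2 the highest end logit, so the span from 1 to 2 ranks first.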