Open renatoviolin opened 5 years ago
@renatoviolin you need to write your own method, similar to get_text_span
, for that. The prediction gives only the start and end indexes. Keep the tokens_dict
after the prediction; you can use it to get the answer text.
@graviraja I was working on this and created a function to do it. I'm trying to predict on my own dataset, and at prediction time I don't have the long_answers_candidate. In this case the predictions are always 0 for the start/end logits.
I tried creating various spans from the "context" and passing them as candidates, but the predictions are very poor. I put the code here (with all the adjustments needed to run it on Python 3): https://github.com/renatoviolin/bert-nq-python3
What is the correct way to make predictions?
@renatoviolin you need to store the tokens_map
to convert predictions to text. Strictly speaking, tokens_map
is not required for prediction itself, so it is not maintained during the predictions. But to get the answer text you need tokens_map
: keep it in a variable and use it later.
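To make this concrete, here is a minimal sketch of the conversion step. It assumes a `tokens_map` list where entry `i` is the index of the original whitespace-separated document token that produced WordPiece `i` (mirroring the NQ baseline's token-to-original mapping); the function name `get_answer_text` is hypothetical, not part of the repo's API.

```python
def get_answer_text(start_idx, end_idx, tokens_map, doc_tokens):
    """Map predicted WordPiece start/end indexes back to the original text.

    tokens_map: tokens_map[i] is the index of the original document token
    that WordPiece i came from (assumed structure, for illustration).
    doc_tokens: the original whitespace-separated document tokens.
    """
    orig_start = tokens_map[start_idx]
    orig_end = tokens_map[end_idx]
    return " ".join(doc_tokens[orig_start:orig_end + 1])


doc_tokens = ["The", "cat", "sat", "on", "the", "mat"]
# Suppose "sat" was split into two WordPieces ("sa", "##t"),
# so WordPiece positions 2 and 3 both map back to token 2:
tokens_map = [0, 1, 2, 2, 3, 4, 5]
print(get_answer_text(2, 3, tokens_map, doc_tokens))  # sat
```

The key point is that the model's indexes refer to WordPiece positions, so you must translate them back through the map before slicing the original tokens.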
You need to create dummy long-answer candidates so that the answer can be predicted from them. For example, if you have a paragraph from which you need to predict the answer, add the appropriate HTML tokens, tokenize it, and make that a long-answer candidate, which can then be passed to the prediction function.
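The dummy-candidate idea above can be sketched as follows. This is an illustration, not the repo's actual code: the `<P>`/`</P>` wrapping follows the NQ data convention, `tokenize` stands in for whatever tokenizer you already use, and the candidate field names are hypothetical stand-ins for the NQ long_answer_candidates format.

```python
def make_dummy_candidate(paragraph, tokenize):
    """Build a single dummy long-answer candidate from a raw paragraph.

    Wraps the text in <P> ... </P> HTML tokens (as in the NQ data) and
    returns the token list plus one candidate spanning the whole
    paragraph. Field names are illustrative, not the exact NQ schema.
    """
    doc_tokens = ["<P>"] + tokenize(paragraph) + ["</P>"]
    candidate = {
        "start_token": 0,
        "end_token": len(doc_tokens),
        "top_level": True,
    }
    return doc_tokens, [candidate]


tokens, candidates = make_dummy_candidate("The cat sat on the mat.", str.split)
print(tokens[0], tokens[-1])              # <P> </P>
print(candidates[0]["end_token"])         # 8
```

With a candidate covering the whole paragraph, the model is no longer forced to predict 0/0 for lack of candidates, which matches the behaviour renatoviolin described.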
How can I get the predicted text from the start/end tokens?