Closed by Arjunsankarlal 5 years ago
No, you don't need to have answers or supporting facts in your JSON. Simply prepare a file with the same format as hotpot_dev_distractor_v1.json, but without the answers and supporting facts. Let's say the file is my_test.json. Then you preprocess it:
python main.py --mode prepro --data_file my_test.json --para_limit 2250 --data_split test
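As a minimal sketch of what such a my_test.json might look like: the released HotpotQA files are a JSON list of entries with `_id`, `question`, and `context` fields, where each context item is a `[title, [sentences]]` pair. The example entry below is hypothetical; double-check the field names against hotpot_dev_distractor_v1.json itself before relying on them.

```python
import json

# Hypothetical test entry mirroring the hotpot_dev_distractor_v1.json schema,
# minus the "answer" and "supporting_facts" fields. All content is made up
# for illustration; verify field names against the real released file.
entry = {
    "_id": "my-question-0001",  # unique question id
    "question": "Which city hosted the 1996 Summer Olympics?",
    "context": [
        # each context item is [paragraph_title, [sentence_0, sentence_1, ...]]
        ["Atlanta", ["Atlanta is the capital of Georgia.",
                     "It hosted the 1996 Summer Olympics."]],
        ["Athens", ["Athens is the capital of Greece."]],
    ],
}

# The file itself is a JSON list of such entries.
with open("my_test.json", "w") as f:
    json.dump([entry], f)
```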
and then make predictions:
CUDA_VISIBLE_DEVICES=0 python main.py --mode test --data_split test --para_limit 2250 --batch_size 24 --init_lr 0.1 \
--keep_prob 1.0 --sp_lambda 1.0 --save HOTPOT-20180924-160521 --prediction_file dev_distractor_pred.json
@kimiyoung Thanks for the timely reply. So the format of my_test.json should contain just the id, question, and context fields?
And a small clarification: the second command, for prediction, contains the argument
--prediction_file dev_distractor_pred.json
Is this my_test.json, or some other file that we have to generate or prepare?
@Arjunsankarlal I think this is the file to write predictions to. See: https://github.com/hotpotqa/hotpot/blob/master/run.py#L273 and https://github.com/hotpotqa/hotpot/blob/master/run.py#L222
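For reference, here is a small sketch of what that prediction file holds once written. Based on the official evaluation script, it is a dict with an "answer" map (question id to answer string) and an "sp" map (question id to a list of [paragraph_title, sentence_index] pairs); the ids and values below are invented for illustration, so verify the schema against hotpot_evaluate_v1.py before relying on it.

```python
import json

# Hypothetical sample of the file written via --prediction_file.
# Assumed schema: {"answer": {id: str}, "sp": {id: [[title, sent_idx], ...]}}.
sample_pred = {
    "answer": {"my-question-0001": "Atlanta"},
    "sp": {"my-question-0001": [["Atlanta", 1]]},
}

# Round-trip through JSON exactly as the model's output file would be read.
pred = json.loads(json.dumps(sample_pred))

for qid, answer in pred["answer"].items():
    print(qid, "->", answer, "| supporting facts:", pred["sp"].get(qid, []))
```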
Closing this issue for now, feel free to reopen should you have further questions!
After training a model, how do I make predictions on my own dataset (say, a query and a list of paragraphs that might contain the answer)?
From what I read in the README, the prediction file format is given. But prediction should mean the model predicts the answer to the query from the supporting facts or data it has; as far as I understood, however, it seemed I needed a file containing the answer, query, and supporting facts just to evaluate the model.
Also, it would be helpful if you could upload the JSON and pickle files (generated during preprocessing) and the trained model, since not everyone has sufficient hardware to generate them or train the model.