TheAtticusProject / cuad

CUAD (NeurIPS 2021)
https://www.atticusprojectai.org/cuad

Why is test dataset (test.json) labeled? #10

Open ShuJackson opened 3 years ago

ShuJackson commented 3 years ago

The file passed as "--predict_file ./data/test.json" is labeled with questions and answers, and in train.py it's passed directly into compute_predictions_logits() to produce the predictions.

If I want to use your model to make predictions on my own dataset, do I also need to label it in the same JSON format? Doesn't that defeat the purpose? Let me know if I'm misunderstanding, but shouldn't the model predict on an unlabeled, raw text file?

Thanks!

berikohen commented 2 years ago

@ShuJackson I'm also facing this issue. Were you able to figure it out?
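
One possible workaround, sketched under assumptions rather than confirmed by the maintainers: if the predict file follows a SQuAD 2.0-style layout (a "data" list of documents, each with "paragraphs" holding a "context" and a list of "qas"), then raw contract text could be wrapped in that layout with empty "answers" lists, so no labeling is needed. The helper name build_unlabeled_predict_file, the id scheme, and the "version" value below are hypothetical. It is also an assumption that the evaluation code reads the answers only for scoring; any metrics computed against these empty labels would be meaningless, and only the predicted spans would be of interest.

```python
import json


def build_unlabeled_predict_file(contracts, questions):
    """Wrap raw contract text in an assumed SQuAD 2.0-style structure.

    contracts: dict mapping a contract title to its raw text.
    questions: list of question strings to ask of every contract.
    Both the function name and the id scheme are illustrative only.
    """
    data = []
    for title, text in contracts.items():
        qas = []
        for i, question in enumerate(questions):
            qas.append({
                "id": f"{title}__{i}",    # hypothetical unique id per (contract, question)
                "question": question,
                "answers": [],            # no gold labels available
                "is_impossible": True,    # placeholder, not a real label
            })
        data.append({
            "title": title,
            "paragraphs": [{"context": text, "qas": qas}],
        })
    return {"version": "unlabeled", "data": data}


if __name__ == "__main__":
    predict_file = build_unlabeled_predict_file(
        {"my_contract": "This Agreement is made between ..."},
        ["Highlight the parts (if any) related to \"Governing Law\"."],
    )
    # Write the result where --predict_file can point to it.
    with open("my_predict.json", "w", encoding="utf-8") as f:
        json.dump(predict_file, f)
```

The questions would presumably need to match the wording the model was trained on, which can be copied from the "question" fields in the existing test.json.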