Closed · xienian87 closed this 2 years ago

Hello,
I followed the instructions and trained a tapex.base model on the WikiSQL dataset. "run_model.py eval" reports a denotation accuracy of 0.879. However, when I use the "run_model.py predict" command to perform online prediction, the results are very poor. Is there anything I can try to solve this problem?
Thanks,
Nian
I tried the "run_model.py predict" command with the tapex.large model downloaded from the instructions. The result is fine.
@xienian87 Thanks for your interest in our work. Could you confirm that the checkpoint is successfully loaded when predicting? You can find the log and paste it here, and I will help check it.
@SivilTaram Hello, I have figured out the problem but don't know how to solve it. I tested all the saved checkpoints and found that some of them give answers that are very close and some give completely wrong answers. None of them gives the correct answer ("2004" in the demo). It seems the models are overfitted. Could you provide some suggestions on how to train the model, please? Thanks very much.
One thing I don't understand is that the model works fine with the "run_model.py eval" command, but not with the "run_model.py predict" command. If the model is overfitted, how can it work fine in eval?
Below is the log message from the terminal:

2022-08-16 11:20:24 | INFO | fairseq.file_utils | loading archive file tapex.base
2022-08-16 11:20:26 | INFO | fairseq.tasks.translation | [src] dictionary: 51200 types
2022-08-16 11:20:26 | INFO | fairseq.tasks.translation | [tgt] dictionary: 51200 types
2022-08-16 11:20:58 | INFO | __main__ | Receive question as : Greece held its last Summer Olympics in which year?
2022-08-16 11:20:58 | INFO | __main__ | The answer should be : 9
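For reference, my understanding is that the model consumes the question followed by a flattened table as a single input string, roughly as in the sketch below (based on the TAPEX paper's linearization; the delimiters, casing, and the toy table here are illustrative, not the repo's actual preprocessing code):

```python
# Sketch of a TAPEX-style flattened input (illustrative): the table is
# linearized as "col : h1 | h2 ... row 1 : v1 | v2 ..." after the question.
def linearize(question, headers, rows):
    parts = ["col : " + " | ".join(headers)]
    for i, row in enumerate(rows, start=1):
        parts.append(f"row {i} : " + " | ".join(str(cell) for cell in row))
    return (question + " " + " ".join(parts)).lower()

# Hypothetical mini-table for the demo question.
print(linearize(
    "Greece held its last Summer Olympics in which year?",
    ["Year", "City", "Country"],
    [[1896, "Athens", "Greece"], [2004, "Athens", "Greece"]],
))
```

If predict built this string differently from what eval reads out of the preprocessed files, that could explain why the two commands disagree.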
@xienian87 If I understand correctly, you directly used the pre-trained tapex.base on downstream questions, instead of the fine-tuned models? If the evaluation result looks promising, how about predicting the answers for all evaluated examples, and then checking whether the model meets your expectations (rather than on the demo question only)? Thanks!
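Here is a rough sketch of what I mean, assuming the fairseq BART hub interface that TAPEX builds on; the data file names and the answer-normalization helper are illustrative, and the repo's own eval script remains the reference for denotation matching:

```python
# Sketch: batch-predict answers for all evaluated examples and compare
# them against the gold answers, instead of spot-checking one demo question.
from fairseq.models.bart import BARTModel

bart = BARTModel.from_pretrained(
    "tapex.base",                    # directory holding the checkpoint (illustrative)
    checkpoint_file="model.pt",
    data_name_or_path="tapex.base",  # directory holding the dictionaries (illustrative)
)
bart.eval()

def normalize(answer: str) -> str:
    # Naive string match; the repo's eval script is the authoritative metric.
    return answer.strip().lower()

correct = total = 0
# "valid.src" / "valid.tgt" are assumed names for the flattened inputs
# and gold answers produced during preprocessing.
with open("valid.src") as src, open("valid.tgt") as tgt:
    for flattened_input, gold in zip(src, tgt):
        prediction = bart.sample([flattened_input.strip()], beam=5)[0]
        correct += int(normalize(prediction) == normalize(gold))
        total += 1

print(f"Denotation accuracy over {total} examples: {correct / total:.3f}")
```

If this number roughly matches the eval result but interactive predict still fails, the gap is more likely in how the interactive input is constructed than in the checkpoint itself.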
@SivilTaram I used the models fine-tuned on the WikiSQL dataset. I will predict the answers for all evaluated examples, and also train a large model to check its prediction performance.
@xienian87 Hi, how is the quality of the predictions? Are the results as expected?
Closing since there is no further activity. Feel free to re-open if you have more questions.