Open: 311dada opened this issue 3 years ago
In a realistic setting, the context should be generated.
However, something strange happens when I evaluate with your provided checkpoint. In the Response Generation setup, I get the same result as in issue #4. The weird thing is that I get a somewhat lower result after giving the model the gold context (gold response). Specifically, I use the following command:
python train.py -mode test -cfg eval_load_path=$path use_true_prev_bspn=True use_true_prev_aspn=True use_true_db_pointer=True use_true_prev_resp=True use_true_curr_bspn=True use_true_curr_aspn=True use_all_previous_context=True cuda_device=0
On the test split, I get the following result: match: 96.10 success: 90.80 bleu: 22.06 score: 115.51
Am I doing something wrong? Could you explain this? Thanks for your help!
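As a sanity check on the numbers above, here is a minimal sketch that recomputes the score, assuming it is the combined metric commonly reported for MultiWOZ, 0.5 * (match + success) + BLEU. The function name `combined_score` is hypothetical and not taken from the repository.

```python
def combined_score(match: float, success: float, bleu: float) -> float:
    """Combined score, assuming the usual definition:
    0.5 * (match + success) + BLEU.
    This is an illustrative helper, not the repository's own code."""
    return 0.5 * (match + success) + bleu


# Values reported in this issue for the test split.
score = combined_score(match=96.10, success=90.80, bleu=22.06)
print(f"{score:.2f}")  # prints 115.51
```

Under that assumption the reported score is internally consistent, so the question is only about which context settings (gold or generated) produce it.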
I have the same question. @TonyNemo
Should the dialogue context be oracle (gold) or generated?