Using world-logs while evaluating a model with more than one task only keeps the world log for the last task.

facebookresearch / ParlAI

A framework for training and evaluating AI models on a variety of openly available dialogue datasets.

MIT License

While evaluating one model on two tasks and saving the model outputs to world logs, I noticed that the world log contained no results for one of the tasks. To reproduce, run:

 parlai eval_model -t wizard_of_wikipedia,babi \
--world-logs /some/path/world-log \
--num-examples 1 --model repeat_label

After running this, the world-log.json file contains only a single line. That line has "id": "babi:Task1k:1", which suggests that ParlAI generates a separate world log for each task but uses the same file name for all of them, so each log overwrites the previous one.
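
If that guess is right, one possible workaround until the behavior is fixed is to evaluate each task in its own run and give each run its own world-log path. The sketch below is a workaround, not a fix. The helper name `world_log_for_task` and the `<base>_<task>` naming scheme are made up for illustration. Only the flags from the command above are used. The `echo` makes the loop a dry run that prints the commands; remove it to actually run them.

```shell
#!/bin/sh
# Hypothetical helper (not part of ParlAI): derive a distinct world-log path
# for a task by appending the task name to a base path.
# ':' and '/' in the task name are replaced with '_' to keep the path simple.
world_log_for_task() {
  printf '%s_%s' "$1" "$(printf '%s' "$2" | tr ':/' '__')"
}

# Evaluate one task per run so that no run can overwrite another run's log.
for task in wizard_of_wikipedia babi; do
  log=$(world_log_for_task /some/path/world-log "$task")
  # Dry run: print the command instead of executing it.
  echo parlai eval_model -t "$task" \
    --world-logs "$log" \
    --num-examples 1 --model repeat_label
done
```

This gives up the single combined evaluation in exchange for one log per task, so any aggregate metrics across tasks would have to be computed separately.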

Using world-logs while evaluating a model with more than one task only keeps the world log for the last task. #3648