Closed isaacmg closed 5 years ago
Looks like they bumped the version of ms_marco from 1.1 to 2.1. The URLs in parlai/tasks/ms_marco/build.py need to be updated.
Okay, I updated the URLs to 2.1. However, now I'm getting the error below. I'm guessing there is a bad line in the new dataset that raises the exception, since the code seems to run for a while before failing.
Traceback (most recent call last):
  File "examples/train_model.py", line 18, in <module>
    TrainLoop(opt).train()
  File "/content/ParlAI/parlai/scripts/train_model.py", line 185, in __init__
    self.world = create_task(opt, self.agent)
  File "/content/ParlAI/parlai/core/worlds.py", line 1003, in create_task
    world = create_task_world(opt, user_agents, default_world=default_world)
  File "/content/ParlAI/parlai/core/worlds.py", line 973, in create_task_world
    opt, user_agents, default_world=default_world)
  File "/content/ParlAI/parlai/core/worlds.py", line 938, in _get_task_world
    task_agents = _create_task_agents(opt)
  File "/content/ParlAI/parlai/core/agents.py", line 635, in _create_task_agents
    return create_task_agent_from_taskname(opt)
  File "/content/ParlAI/parlai/core/agents.py", line 589, in create_task_agent_from_taskname
    task_agents = teacher_class(opt)
  File "/content/ParlAI/parlai/tasks/ms_marco/agents.py", line 40, in __init__
    opt['datafile'] = _path(opt, is_passage=False)
  File "/content/ParlAI/parlai/tasks/ms_marco/agents.py", line 18, in _path
    build(opt)
  File "/content/ParlAI/parlai/tasks/ms_marco/build.py", line 81, in build
    create_fb_format(dpath, "train", os.path.join(dpath, 'train.gz'))
  File "/content/ParlAI/parlai/tasks/ms_marco/build.py", line 42, in create_fb_format
    d["passage_text"] for d in dic["passages"] if d["is_selected"] == 1
  File "/content/ParlAI/parlai/tasks/ms_marco/build.py", line 42, in <listcomp>
    d["passage_text"] for d in dic["passages"] if d["is_selected"] == 1
TypeError: string indices must be integers
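For anyone hitting the same traceback, here is a rough sketch of one way this TypeError can arise. It assumes, without having confirmed it against the real 2.1 files, that `dic["passages"]` is no longer a list of passage dicts but a dict keyed by strings, so iterating over it yields string keys instead of dicts. The sample data, the keyed layout, and the helper names `texts_as_in_traceback` and `selected_passage_texts` are all made up for illustration; this is not the change made in the actual fix.

```python
def texts_as_in_traceback(dic):
    # Same comprehension shape as the line shown in the traceback above.
    return [d["passage_text"] for d in dic["passages"] if d["is_selected"] == 1]


def selected_passage_texts(passages):
    # Hypothetical defensive variant: fail early with a readable message
    # when the container is not the expected list of passage dicts.
    if not isinstance(passages, list):
        raise ValueError(
            "expected a list of passage dicts, got " + type(passages).__name__
        )
    return [
        p["passage_text"]
        for p in passages
        if isinstance(p, dict) and p.get("is_selected") == 1
    ]


# Layout the comprehension expects: one record holding a list of passage dicts.
per_record = {
    "passages": [
        {"passage_text": "first passage", "is_selected": 1},
        {"passage_text": "second passage", "is_selected": 0},
    ]
}

# ASSUMED layout (illustration only): passages keyed by a string id.
keyed = {
    "passages": {
        "0": [{"passage_text": "first passage", "is_selected": 1}],
    }
}

print(texts_as_in_traceback(per_record))

try:
    texts_as_in_traceback(keyed)
except TypeError as err:
    # Iterating a dict yields its keys, so d is the string "0" here.
    print("TypeError:", err)
```

If that is what is happening, the problem would be the overall file layout rather than a single bad line, and checking the container type up front (as in `selected_passage_texts`) would surface it immediately instead of partway through the build.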
Fixed by #1395
Hi, I'm getting an error when trying to build the ms_marco dataset. I experienced the problem on both OS X and Ubuntu.
Command run
Error message