facebookresearch / ParlAI

A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
https://parl.ai
MIT License

Training BB2 on wizard_of_internet and msc: index 1 is out of bounds for dimension 0 with size 1 #3940

Closed jytime closed 3 years ago

jytime commented 3 years ago

Hi,

I was trying to reproduce the BlenderBot2 project. I used the following script to train a BB2 model, initialized from BB1. I used 'dpr' as the rag-retriever-type here because my server does not have internet access.

parlai train_model --model projects.blenderbot2.agents.blenderbot2:BlenderBot2FidAgent \
--task blended_skill_talk:all,msc,wizard_of_internet --include-last-session True \
--multitask-weights 3.0,1.0,1.0,1.0,1.0,1.0,1.0 \
--init-model zoo:blender/blender_3B/model --dict-file zoo:blender/blender_3B/model.dict --fp16 True \
--datatype train:stream --batchsize 16 --embedding-size 2560 --ffn-size 10240 --dropout 0.0 \
--attention-dropout 0.0 --n-heads 32 --learn-positional-embeddings False --embeddings-scale True \
--n-positions 128 --variant prelayernorm --activation relu --n-encoder-layers 2 \
--n-decoder-layers 24 --model-parallel True --generation-model transformer/generator \
--query-model bert_from_parlai_rag \
--rag-model-type token --rag-retriever-type dpr \
--max-doc-token-length 64 --dpr-model-file zoo:hallucination/bart_rag_token/model --beam-size 10 \
--beam-min-length 20 --beam-context-block-ngram 3 --beam-block-ngram 3 --beam-block-full-context False \
--inference beam --fp16 True --fp16-impl mem_efficient --optimizer mem_eff_adam --learningrate 1e-05 \
--truncate 128 --text-truncate 128 --label-truncate 128 --delimiter '  ' \
--history-add-global-end-token end --dict-tokenizer bytelevelbpe \
--bpe-vocab /home/ParlAI/data/models/blender/blender_3B/model.dict-vocab.json \
--bpe-merge /home/ParlAI/data/models/blender/blender_3B/model.dict-merges.txt \
--warmup-updates 100 --search-query-generator-model-file zoo:blenderbot2/query_generator/model \
--search-query-generator-beam-min-length 2 --memory-key personas \
--gold-document-titles-key select-docs-titles --insert-gold-docs True \
--model-file /home/ParlAI/parlai_exps/init_bb2 

However, something seems to go wrong in the memory reading and writing process. The error log is:

loading: /home/ParlAI/data/msc/msc/msc_dialogue/session_2
Traceback (most recent call last):
  File "/home/anaconda3/envs/new_ble/bin/parlai", line 33, in <module>
    sys.exit(load_entry_point('parlai', 'console_scripts', 'parlai')())
  File "/home/ParlAI/parlai/__main__.py", line 14, in main
    superscript_main()
  File "/home/ParlAI/parlai/core/script.py", line 325, in superscript_main
    return SCRIPT_REGISTRY[cmd].klass._run_from_parser_and_opt(opt, parser)
  File "/home/ParlAI/parlai/core/script.py", line 108, in _run_from_parser_and_opt
    return script.run()
  File "/home/ParlAI/parlai/scripts/train_model.py", line 933, in run
    return self.train_loop.train()
  File "/home/ParlAI/parlai/scripts/train_model.py", line 897, in train
    for _train_log in self.train_steps():
  File "/home/ParlAI/parlai/scripts/train_model.py", line 804, in train_steps
    world.parley()
  File "/home/ParlAI/parlai/core/worlds.py", line 865, in parley
    batch_act = self.batch_act(agent_idx, batch_observations[agent_idx])
  File "/home/ParlAI/parlai/core/worlds.py", line 833, in batch_act
    batch_actions = a.batch_act(batch_observation)
  File "/home/ParlAI/parlai/core/torch_agent.py", line 2234, in batch_act
    output = self.train_step(batch)
  File "/home/ParlAI/parlai/core/torch_generator_agent.py", line 734, in train_step
    loss = self.compute_loss(batch)
  File "/home/ParlAI/projects/blenderbot2/agents/blenderbot2.py", line 833, in compute_loss
    loss, output = super().compute_loss(batch, return_output=True)
  File "/home/ParlAI/parlai/agents/rag/rag.py", line 885, in compute_loss
    model_output = self.get_model_output(batch)
  File "/home/ParlAI/parlai/agents/rag/rag.py", line 858, in get_model_output
    *self._model_input(batch), ys=batch.label_vec
  File "/home/anaconda3/envs/new_ble/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ParlAI/parlai/core/torch_generator_agent.py", line 309, in forward
    encoder_states = prev_enc if prev_enc is not None else self.encoder(*xs)
  File "/home/ParlAI/projects/blenderbot2/agents/modules.py", line 816, in encoder
    segments,
  File "/home/ParlAI/projects/blenderbot2/agents/modules.py", line 223, in encoder
    num_memory_decoder_vecs,
  File "/home/ParlAI/projects/blenderbot2/agents/modules.py", line 382, in retrieve_and_concat
    generated_memories,
  File "/home/ParlAI/projects/blenderbot2/agents/modules.py", line 556, in access_long_term_memory
    for batch_id, mem_id in enumerate(indices)
  File "/home/ParlAI/projects/blenderbot2/agents/modules.py", line 556, in <dictcomp>
    for batch_id, mem_id in enumerate(indices)
IndexError: index 1 is out of bounds for dimension 0 with size 1
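The failure mode in the dictcomp above can be illustrated with a minimal, hypothetical sketch (the tensor shapes and variable names here are assumptions for illustration, not the actual BB2 internals): memory vectors appear to be computed for only one batch element while `indices` enumerates more than one, so indexing row 1 of a size-1 dimension raises exactly this error.

```python
import torch

# Hypothetical reproduction: a memory tensor with only one row
# (dimension 0 has size 1), but two batch ids to look up.
memory_vec = torch.zeros(1, 4)
indices = [0, 1]  # enumerate(indices) yields batch_ids 0 and 1

err = None
try:
    # Mirrors the shape of the dictcomp in modules.py line 556:
    # memory_vec[1] is out of bounds for a (1, 4) tensor.
    {batch_id: memory_vec[batch_id] for batch_id, mem_id in enumerate(indices)}
except IndexError as e:
    err = str(e)
print(err)
```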

May I ask how to specify the correct BB2 training command? Any suggestion would be much appreciated!

klshuster commented 3 years ago

Hi there - the BB2 model is not meant to be trained with a memory decoder (it is usually only used for inference/interactive). Try setting --memory-decoder-model-file '' and training again.
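For reference, a sketch of the suggested change: append this flag to the training command above, leaving every other flag unchanged (this is a fragment of the command, not a standalone invocation).

```shell
# Disable the memory decoder during training (it is only needed for
# inference/interactive sessions), as suggested above:
  --memory-decoder-model-file ''
```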