zhulifengsheng / fairseq_mmt

MIT License

not enough values to unpack (expected 3, got 2) #1

Open memory4963 opened 2 years ago

memory4963 commented 2 years ago

Hi, thank you very much for the excellent work!

However, I am facing an error when I try to train the network with train_mmt.sh:

Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/tiger/.local/share/code-server/extensions/ms-python.python-2020.10.332292344/pythonFiles/lib/python/debugpy/__main__.py", line 45, in <module>
    cli.main()
  File "/home/tiger/.local/share/code-server/extensions/ms-python.python-2020.10.332292344/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 430, in main
    run()
  File "/home/tiger/.local/share/code-server/extensions/ms-python.python-2020.10.332292344/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 267, in run_file
    runpy.run_path(options.target, run_name=compat.force_str("__main__"))
  File "/usr/lib/python3.7/runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/usr/lib/python3.7/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "fairseq_cli/train.py", line 358, in <module>
    cli_main()
  File "fairseq_cli/train.py", line 354, in cli_main
    distributed_utils.call_main(args, main)
  File "/opt/tiger/luoao/fairseq_mmt/fairseq/distributed_utils.py", line 301, in call_main
    main(args, **kwargs)
  File "fairseq_cli/train.py", line 125, in main
    valid_losses, should_stop = train(args, trainer, task, epoch_itr)
  File "/usr/lib/python3.7/contextlib.py", line 74, in inner
    return func(*args, **kwds)
  File "fairseq_cli/train.py", line 209, in train
    log_output = trainer.train_step(samples)
  File "/usr/lib/python3.7/contextlib.py", line 74, in inner
    return func(*args, **kwds)
  File "/opt/tiger/luoao/fairseq_mmt/fairseq/trainer.py", line 486, in train_step
    ignore_grad=is_dummy_batch,
  File "/opt/tiger/luoao/fairseq_mmt/fairseq/tasks/fairseq_task.py", line 417, in train_step
    loss, sample_size, logging_output = criterion(model, sample)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1112, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/tiger/luoao/fairseq_mmt/fairseq/criterions/label_smoothed_cross_entropy.py", line 69, in forward
    net_output = model(**sample["net_input"])
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1112, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/tiger/luoao/fairseq_mmt/fairseq/models/image_multimodal_transformer_SA.py", line 286, in forward
    img_masks_list=img_masks_list, imgs_list=imgs_list,
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1112, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/tiger/luoao/fairseq_mmt/fairseq/models/image_multimodal_transformer_SA.py", line 520, in forward
    xs.append(self.fuse_img_feat(x, idx, img, img_mask, text_mask=src_tokens.ne(self.padding_idx)))
  File "/opt/tiger/luoao/fairseq_mmt/fairseq/models/image_multimodal_transformer_SA.py", line 424, in fuse_img_feat
    output, _map = self.selective_attns[idx](query=text, key=image, value=image, key_padding_mask=image_mask)   # t, b, c
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1112, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/tiger/luoao/fairseq_mmt/fairseq/modules/selective_attention.py", line 41, in forward
    Tk, Bk, Ck = key.shape
ValueError: not enough values to unpack (expected 3, got 2)

I followed your instructions for preprocessing the data, and the only things I changed are in train_mmt.sh:

image_feat=vit_tiny_patch16_384 -> image_feat=vit_base_patch16_384
mask_data=mask0 -> mask_data=mask1

My data tree is shown below:

data
├── dict.en2de_mask1.txt
├── dict.en2de_mask2.txt
├── dict.en2de_mask3.txt
├── dict.en2de_mask4.txt
├── dict.en2de_maskc.txt
├── dict.en2de_maskp.txt
├── masking
│   ├── create_masking_multi30k.py
│   ├── data
│   │   ├── en-de
│   │   │   ├── multi30k.color.bpe.position
│   │   │   ├── multi30k.noun.bpe.position
│   │   │   ├── multi30k.nouns.bpe.position
│   │   │   ├── multi30k.people.bpe.position
│   │   │   └── origin2bpe.en-de.match
│   │   ├── en-fr
│   │   │   ├── multi30k.color.bpe.position
│   │   │   ├── multi30k.noun.bpe.position
│   │   │   ├── multi30k.nouns.bpe.position
│   │   │   ├── multi30k.people.bpe.position
│   │   │   └── origin2bpe.en-fr.match
│   │   ├── multi30k.color.position
│   │   ├── multi30k.noun.position
│   │   ├── multi30k.nouns.position
│   │   ├── multi30k.people.position
│   │   ├── noun.en
│   │   └── nouns.en
│   └── match_origin2bpe_position.py
├── multi30k
│   ├── multi30k.en
│   ├── multi30k-en-de.bpe.en
│   ├── multi30k-en-fr.bpe.en
│   ├── test.2016.de
│   ├── test.2016.en
│   ├── test.2016.fr
│   ├── test.2017.de
│   ├── test.2017.en
│   ├── test.2017.fr
│   ├── test.coco.de
│   ├── test.coco.en
│   ├── test.coco.fr
│   ├── train.de
│   ├── train.en
│   ├── train.fr
│   ├── valid.de
│   ├── valid.en
│   └── valid.fr
├── multi30k-en-de
│   ├── code
│   ├── test.2016.de
│   ├── test.2016.en
│   ├── test.2017.de
│   ├── test.2017.en
│   ├── test.coco.de
│   ├── test.coco.en
│   ├── train.de
│   ├── train.en
│   ├── valid.de
│   └── valid.en
├── multi30k-en-de.mask1
│   ├── test.2016.de
│   ├── test.2016.en
│   ├── test.2017.de
│   ├── test.2017.en
│   ├── test.coco.de
│   ├── test.coco.en
│   ├── train.de
│   ├── train.en
│   ├── valid.de
│   └── valid.en
└── vit_base_patch16_384
    ├── test1.pth
    ├── test.pth
    ├── train.pth
    └── valid.pth
data-bin/
├── multi30k.en-de.mask1
│   ├── dict.de.txt
│   ├── dict.en.txt
│   ├── preprocess.log
│   ├── test1.en-de.de.bin
│   ├── test1.en-de.de.idx
│   ├── test1.en-de.en.bin
│   ├── test1.en-de.en.idx
│   ├── test2.en-de.de.bin
│   ├── test2.en-de.de.idx
│   ├── test2.en-de.en.bin
│   ├── test2.en-de.en.idx
│   ├── test.en-de.de.bin
│   ├── test.en-de.de.idx
│   ├── test.en-de.en.bin
│   ├── test.en-de.en.idx
│   ├── train.en-de.de.bin
│   ├── train.en-de.de.idx
│   ├── train.en-de.en.bin
│   ├── train.en-de.en.idx
│   ├── valid.en-de.de.bin
│   ├── valid.en-de.de.idx
│   ├── valid.en-de.en.bin
│   └── valid.en-de.en.idx

For simplicity, I have omitted the entries for mask2-4, maskc, maskp, and fr.

Could you tell me how to fix it?

By the way, I have two more questions I would like to confirm with you:

  1. I only found data/dict.en2de_mask*.txt but no data/dict.en2fr_mask*.txt. Should I use the en2de files to preprocess the fr text too?
  2. The mask_data in train_mmt.sh is set to mask0 by default, but the mask choices listed in the README do not include mask0. Could you tell me what mask0 means and how to generate its data?

zhulifengsheng commented 2 years ago

I'm sorry, the README.md is too brief.

The reason for the error is that the image feature tensor is not 3-dimensional. You can see the correct feature shape in image_feat_shape.png. Please also read scripts/README.md before extracting the image features.
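
To illustrate the failure, here is a minimal sketch that mimics the `Tk, Bk, Ck = key.shape` line from the traceback using plain tuples. The helper name `try_unpack` and the shapes are hypothetical and only for illustration; they are not taken from the repository.

```python
def try_unpack(shape):
    """Mimic `Tk, Bk, Ck = key.shape` and report the outcome as a string."""
    try:
        Tk, Bk, Ck = shape
    except ValueError as err:
        return f"ValueError: {err}"
    return f"ok: {Tk} x {Bk} x {Ck}"


# Hypothetical shapes, for illustration only.
print(try_unpack((4, 768)))      # 2-D shape: fails in the same way as the traceback
print(try_unpack((4, 10, 768)))  # 3-D shape: unpacks cleanly

# To inspect a real feature file (assumes torch is installed and that the
# file holds a single tensor, which is an assumption, not a confirmed fact):
#   import torch
#   print(torch.load("data/vit_base_patch16_384/train.pth").shape)
```

If the printed shape of the extracted features has only two entries, the extraction step probably needs to be rerun as described in scripts/README.md.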

Question 1: data/dict.en2fr_mask*.txt will be updated.
Question 2: mask0 means training models on the original Multi30k text (no mask token).

memory4963 commented 2 years ago

Thank you very much and sorry for my late reply.

Actually, I did follow scripts/README.md, so maybe I did something wrong. I will check my code again and let you know if the error persists.

BTW, regarding data/dict.en2fr_mask*.txt: maybe README.md and preprocess_mmt.sh should also be changed? The last line of the script currently reads:

  --srcdict data/dict.en2de_$mask.txt
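
One possible way to avoid hard-coding `en2de` is to derive the dictionary path from the target language. The sketch below assumes the `data/dict.en2{lang}_{mask}.txt` naming pattern mentioned in this thread; the function name `srcdict_path` is hypothetical, and the en2fr files may not exist in the repository yet.

```python
def srcdict_path(tgt_lang, mask):
    """Build the source dictionary path for a target language and mask setting.

    Assumes the naming pattern data/dict.en2<tgt>_<mask>.txt from this thread.
    """
    return f"data/dict.en2{tgt_lang}_{mask}.txt"


print(srcdict_path("de", "mask1"))
print(srcdict_path("fr", "mask1"))
```

The resulting string could then be passed to `--srcdict` in place of the fixed `data/dict.en2de_$mask.txt`.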