Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Error about prompt tuning with stage2 finetuning #397
Hi, OFA team~ When using prompt tuning with stage2 finetuning for image captioning, the following error occurs:

```
......
  File "/home/home/ofa/models/ofa/unify_multihead_attention.py", line 342, in forward
    assert key_padding_mask.size(1) == k.size(1)
AssertionError
```
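For context, this assertion checks that the padding mask and the attention keys agree on the sequence dimension. The sketch below (illustrative shapes and names such as `prompt_k`; not OFA's actual code) shows how prepending `encoder_prompt_length` learned prefix keys grows `k.size(1)` while an unextended `key_padding_mask` keeps the original length, which trips exactly this assert:

```python
import torch

# Illustrative repro of the failing shape check, assuming prefix tuning
# prepends encoder_prompt_length learned key vectors per attention layer.
# Shapes and names (prompt_k, src_len, ...) are hypothetical, not OFA's code.
bsz, src_len, prompt_len, head_dim = 2, 900, 100, 64

k = torch.randn(bsz, src_len, head_dim)                         # keys for real tokens
key_padding_mask = torch.zeros(bsz, src_len, dtype=torch.bool)  # False = keep

# Prefix tuning concatenates the learned prompt keys in front of k ...
prompt_k = torch.randn(bsz, prompt_len, head_dim)
k = torch.cat([prompt_k, k], dim=1)                             # seq len: 900 -> 1000

# ... but the padding mask was never extended to match:
print(key_padding_mask.size(1), k.size(1))                      # 900 1000
assert key_padding_mask.size(1) == k.size(1)                    # AssertionError
```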
Could you give some advice? My training script, which is similar to refcoco/train_refcoco_prefix.sh, is shown below.
```bash
#!/usr/bin/env bash

# The port for communication. Note that if you want to run multiple tasks on the same machine,
# you need to specify different port numbers.
export MASTER_PORT=1052

log_dir=../../train/stage2_logs
save_dir=../../train/stage2_checkpoints
mkdir -p $log_dir $save_dir

bpe_dir=../../utils/BPE
user_dir=../../ofa_module

data_dir=../../dataset/caption_data
data=${data_dir}/caption_stage2_train.tsv,${data_dir}/caption_val.tsv
restore_file=../../checkpoints/caption_stage1_best.pt
selected_cols=1,4,2

prompt_type_method=prefix
encoder_prompt_length=100
decoder_prompt_length=100

task=caption
arch=ofa_large
criterion=scst_reward_criterion
label_smoothing=0.1
lr=1e-5
max_epoch=5
warmup_ratio=0.06
batch_size=1 #2
update_freq=4
resnet_drop_path_rate=0.0
encoder_drop_path_rate=0.0
decoder_drop_path_rate=0.0
dropout=0.0
attention_dropout=0.0
max_src_length=80
max_tgt_length=20
num_bins=1000
patch_image_size=480
eval_cider_cached=${data_dir}/cider_cached_tokens/coco-valid-words.p
scst_cider_cached=${data_dir}/cider_cached_tokens/coco-train-words.p

for lr in 1e-5; do
  echo "lr "${lr}
  for max_epoch in 3; do
    echo "max_epoch "${max_epoch}
  done
done
```
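Not a verified fix, but one direction to check: if the prompt keys get prepended somewhere along the generation path used by `scst_reward_criterion` without the mask being updated, extending `key_padding_mask` with not-padded entries for the prompt positions restores the shape invariant the assert enforces. A minimal sketch (the helper name `extend_key_padding_mask` is hypothetical):

```python
import torch

# Hypothetical helper, not part of OFA: prepend `prompt_len` not-padded
# positions to a (bsz, src_len) boolean padding mask so it matches keys
# that had prompt vectors prepended.
def extend_key_padding_mask(key_padding_mask: torch.Tensor,
                            prompt_len: int) -> torch.Tensor:
    prompt_pad = key_padding_mask.new_zeros(key_padding_mask.size(0), prompt_len)
    return torch.cat([prompt_pad, key_padding_mask], dim=1)

mask = torch.zeros(2, 900, dtype=torch.bool)
print(extend_key_padding_mask(mask, prompt_len=100).shape)  # torch.Size([2, 1000])
```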