jiasenlu / vilbert_beta

subprocess.CalledProcessError #50

Open 821736960 opened 4 years ago

821736960 commented 4 years ago

Hi, I want to fine-tune the pretrained model for VQA, so I ran the command you provide:

python -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --node_rank=0 train_tasks.py --bert_model bert-base-uncased --from_pretrained save/bert_base_6_layer_6_connect_freeze_0/pytorch_model_8.bin --config_file config/bert_base_6layer_6conect.json --learning_rate 4e-5 --num_workers 16 --tasks 0 --save_name pretrained

but an error appears:

Traceback (most recent call last):
  File "/cluster/home/chenjinjie/.conda/envs/vilbert/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/cluster/home/chenjinjie/.conda/envs/vilbert/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/cluster/home/chenjinjie/.conda/envs/vilbert/lib/python3.6/site-packages/torch/distributed/launch.py", line 263, in <module>
    main()
  File "/cluster/home/chenjinjie/.conda/envs/vilbert/lib/python3.6/site-packages/torch/distributed/launch.py", line 259, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/cluster/home/chenjinjie/.conda/envs/vilbert/bin/python', '-u', 'train_tasks.py', '--local_rank=0', '--bert_model', 'bert-base-uncased', '--from_pretrained', 'save/bert_base_6_layer_6_connect_freeze_0/pytorch_model_8.bin', '--config_file', 'config/bert_base_6layer_6conect.json', '--learning_rate', '4e-5', '--num_workers', '16', '--tasks', '0', '--save_name', 'pretrained']' died with <Signals.SIGABRT: 6>.

Could you help? Thanks.