facebookresearch / mmf

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
https://mmf.sh/

Exception: process 0 terminated with signal SIGKILL #1262

Open Huangzhw0221 opened 2 years ago

Huangzhw0221 commented 2 years ago

Hi,

While running the training code with the m4c_captioner model, I am getting the error below.
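
For reproducibility, the command line (reconstructed from the overrides recorded in the log, so the option spellings are presumably as shown) was:

```sh
mmf_run config=projects/m4c_captioner/configs/m4c_captioner/textcaps/defaults.yaml \
    datasets=textcaps \
    model=m4c_captioner \
    run_type=train_val \
    env.save_dir=./save/m4c_captioner/defaults
```

The full log output: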

/home/root1/anaconda3/envs/mmf/lib/python3.7/site-packages/omegaconf/grammar_visitor.py:257: UserWarning: In the sequence MMF_USER_DIR, some elements are missing: please replace them with empty quoted strings. See https://github.com/omry/omegaconf/issues/572 for details.
  category=UserWarning,
[the same UserWarning is also printed for MMF_LOG_DIR, MMF_REPORT_DIR, MMF_TENSORBOARD_LOGDIR, and MMF_WANDB_LOGDIR, and repeats for MMF_USER_DIR on every rank]
2022-09-08T19:36:20 | mmf.utils.configuration: Overriding option datasets to textcaps
2022-09-08T19:36:20 | mmf.utils.configuration: Overriding option model to m4c_captioner
2022-09-08T19:36:20 | mmf.utils.configuration: Overriding option config to projects/m4c_captioner/configs/m4c_captioner/textcaps/defaults.yaml
2022-09-08T19:36:20 | mmf.utils.configuration: Overriding option env.save_dir to ./save/m4c_captioner/defaults
2022-09-08T19:36:20 | mmf.utils.configuration: Overriding option run_type to train_val
2022-09-08T19:36:25 | mmf.utils.distributed: XLA Mode:False
2022-09-08T19:36:25 | mmf.utils.distributed: Distributed Init (Rank 0): tcp://localhost:14337
2022-09-08T19:36:25 | mmf.utils.distributed: Distributed Init (Rank 3): tcp://localhost:14337
2022-09-08T19:36:25 | mmf.utils.distributed: Initialized Host root1-Super-Server as Rank 3
2022-09-08T19:36:26 | mmf.utils.distributed: XLA Mode:False
2022-09-08T19:36:26 | mmf.utils.distributed: Distributed Init (Rank 1): tcp://localhost:14337
2022-09-08T19:36:26 | mmf.utils.distributed: Distributed Init (Rank 2): tcp://localhost:14337
2022-09-08T19:36:26 | mmf.utils.distributed: Initialized Host root1-Super-Server as Rank 1
2022-09-08T19:36:26 | mmf.utils.distributed: Initialized Host root1-Super-Server as Rank 2
2022-09-08T19:36:26 | mmf.utils.distributed: Initialized Host root1-Super-Server as Rank 0
2022-09-08T19:36:30 | mmf: Logging to: ./save/m4c_captioner/defaults/train.log
2022-09-08T19:36:30 | mmf_cli.run: Namespace(config_override=None, local_rank=None, opts=['datasets=textcaps', 'model=m4c_captioner', 'config=projects/m4c_captioner/configs/m4c_captioner/textcaps/defaults.yaml', 'env.save_dir=./save/m4c_captioner/defaults', 'run_type=train_val'])
2022-09-08T19:36:30 | mmf_cli.run: Torch version: 1.6.0
2022-09-08T19:36:30 | mmf.utils.general: CUDA Device 0 is: NVIDIA GeForce GTX 1080 Ti
2022-09-08T19:36:30 | mmf_cli.run: Using seed 30613796
2022-09-08T19:36:30 | mmf.trainers.mmf_trainer: Loading datasets
loading configuration file https://huggingface.co/bert-base-uncased/resolve/main/config.json from cache at /home/root1/.cache/huggingface/transformers/3c61d016573b14f7f008c02c4e51a366c67ab274726fe2910691e2a761acf43e.37395cee442ab11005bcd270f3c34464dc1704b715b5d7d52b1a461abe3b9e4e
Model config BertConfig {
  "architectures": ["BertForMaskedLM"],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "transformers_version": "4.10.1",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}

loading file https://huggingface.co/bert-base-uncased/resolve/main/vocab.txt from cache at /home/root1/.cache/huggingface/transformers/45c3f7a79a80e1cf0a489e5c62b43f173c15db47864303a55d623bb3c96f72a5.d789d64ebfe299b0e416afc4a169632f903f693095b4629a7ea271d5a0cf2c99
loading file https://huggingface.co/bert-base-uncased/resolve/main/tokenizer.json from cache at /home/root1/.cache/huggingface/transformers/534479488c54aeaf9c3406f647aa2ec13648c06771ffe269edabebd4c412da1d.7f2721073f19841be16f41b0a70b600ca6b880c8f3df6f3535cbc704371bdfa4
loading file https://huggingface.co/bert-base-uncased/resolve/main/added_tokens.json from cache at None
loading file https://huggingface.co/bert-base-uncased/resolve/main/special_tokens_map.json from cache at None
loading file https://huggingface.co/bert-base-uncased/resolve/main/tokenizer_config.json from cache at /home/root1/.cache/huggingface/transformers/c1d7f0a763fb63861cc08553866f1fc3e5a6f4f07621be277452d26d71303b7e.20430bd8e10ef77a7d2977accefe796051e01bc2fc4aa146bc862997a1a15e79
[the "loading configuration file", "Model config BertConfig", and "loading file" messages above are each printed once per rank; the duplicates are omitted here]
2022-09-08T19:36:52 | mmf.datasets.multi_datamodule: Multitasking disabled by default for single dataset training
2022-09-08T19:36:52 | mmf.trainers.mmf_trainer: Loading model
loading weights file https://huggingface.co/bert-base-uncased/resolve/main/pytorch_model.bin from cache at /home/root1/.cache/huggingface/transformers/a8041bf617d7f94ea26d15e218abd04afc2004805632abc0ed2066aa16d50d04.faf6ea826ae9c5867d12b22257f9877e6b8367890837bd60f7c54a29633f7f2f
Some weights of the model checkpoint at bert-base-uncased were not used when initializing TextBert: [all attention, intermediate, and output weights and biases for bert.encoder layers 3 through 11, bert.pooler.dense.*, and the cls.predictions.* / cls.seq_relationship.* heads; the full list is omitted here]

WARNING 2022-09-08T19:36:58 | py.warnings: /media/root1/2f4dfbfa-d286-46c0-bdd8-c0caec6858d9/hzw/mmf/mmf/utils/distributed.py:412: UserWarning: No type for scheduler specified even though lr_scheduler is True, setting default to 'Pythia'
  builtin_warn(*args, **kwargs)

WARNING 2022-09-08T19:36:58 | py.warnings: /media/root1/2f4dfbfa-d286-46c0-bdd8-c0caec6858d9/hzw/mmf/mmf/utils/distributed.py:412: UserWarning: scheduler attributes has no params defined, defaulting to {}.
  builtin_warn(*args, **kwargs)

loading weights file https://huggingface.co/bert-base-uncased/resolve/main/pytorch_model.bin from cache at /home/root1/.cache/huggingface/transformers/a8041bf617d7f94ea26d15e218abd04afc2004805632abc0ed2066aa16d50d04.faf6ea826ae9c5867d12b22257f9877e6b8367890837bd60f7c54a29633f7f2f
[the same "loading weights file" and "Some weights of the model checkpoint at bert-base-uncased were not used when initializing TextBert" messages are repeated for the remaining ranks]

Traceback (most recent call last):
  File "/home/root1/anaconda3/envs/mmf/bin/mmf_run", line 8, in <module>
    sys.exit(run())
  File "/media/root1/2f4dfbfa-d286-46c0-bdd8-c0caec6858d9/hzw/mmf/mmf_cli/run.py", line 129, in run
    nprocs=config.distributed.world_size,
  File "/home/root1/anaconda3/envs/mmf/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 200, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/root1/anaconda3/envs/mmf/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 158, in start_processes
    while not context.join():
  File "/home/root1/anaconda3/envs/mmf/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 108, in join
    (error_index, name)
Exception: process 0 terminated with signal SIGKILL
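
From what I have read, a training process that dies with SIGKILL (rather than raising a Python exception) is usually being terminated by the Linux out-of-memory (OOM) killer once system RAM is exhausted. A minimal check, assuming a Linux host where the kernel log is readable:

```sh
# Look for OOM-killer entries around the time the training process died
dmesg | grep -i -E "out of memory|killed process"
```

If the OOM killer shows up there, the problem is the memory footprint of the run rather than a bug in the training loop itself.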

I followed some tips to address this: I increased the system memory to 64 GB, set num_workers to 0, and reduced the batch_size to 64, but the process is still killed. Kindly help me resolve this issue.
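
Concretely, the mitigations were command-line overrides along these lines (the training.batch_size and training.num_workers keys are my reading of MMF's standard training config section; adjust if your config nests them differently):

```sh
# Same run as above, with memory-reducing overrides appended
mmf_run config=projects/m4c_captioner/configs/m4c_captioner/textcaps/defaults.yaml \
    datasets=textcaps \
    model=m4c_captioner \
    run_type=train_val \
    env.save_dir=./save/m4c_captioner/defaults \
    training.batch_size=64 \
    training.num_workers=0
```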

Zhang-Henry commented 1 year ago

I have the same problem. Have you solved it?