Hello,

I am very interested in your research and am currently trying to run some experiments based on it. However, I encountered an issue while running the program from the HuggingFace_EncDec directory after loading the BERT model, and I would greatly appreciate your assistance.

Below is the script I defined according to your specifications:
```shell
#!/usr/bin/env bash
model_path=/root/autodl-tmp/models/models--google-bert--bert-base-uncased/snapshots/86b5e0934494bd15c9632b12f734a8a67f723594
save_model_path=/root/PISA-main/PISA-main/SG_Pretrained_BERT/
predict_results_path=G_Pretrained_BERT_pred/

python3 run_hf_enc_dec_train.py \
    --model_name_or_path $model_path \
    --do_train \
    --seed=88 \
    --save_total_limit=1 \
    --train_file /root/PISA-main/PISA-main/Dataset/PISA-prompt/SG/train.json \
    --validation_file /root/PISA-main/PISA-main/Dataset/PISA-prompt/SG/val.json \
    --output_dir $save_model_path \
    --rouge_path dummy_path \
    --per_device_train_batch_size=4 \
    --overwrite_output_dir \
    --predict_with_generate \
    --num_train_epochs 30 \
    --max_source_length 1024 \
    --max_target_length 128 \
    --learning_rate 3e-5
```
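In case environment details help, here is a small standalone snippet (my own, not from your repo) that I can run to report the PyTorch/CUDA setup on this machine; I am happy to post its output:

```python
import torch

def cuda_env_report():
    """Gather PyTorch/CUDA build details; safe to call on a CPU-only box."""
    info = {
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,        # CUDA toolkit PyTorch was built against
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        info["device"] = torch.cuda.get_device_name(0)
        free_bytes, total_bytes = torch.cuda.mem_get_info()  # bytes
        info["free_gib"] = round(free_bytes / 2**30, 2)
    return info

print(cuda_env_report())
```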
Unfortunately, running the model gives me the following error:

```
[INFO|trainer.py:1279] 2024-11-03 23:25:02,211 >> ***** Running training *****
[INFO|trainer.py:1280] 2024-11-03 23:25:02,211 >>   Num examples = 96552
[INFO|trainer.py:1281] 2024-11-03 23:25:02,211 >>   Num Epochs = 30
[INFO|trainer.py:1282] 2024-11-03 23:25:02,211 >>   Instantaneous batch size per device = 4
[INFO|trainer.py:1283] 2024-11-03 23:25:02,211 >>   Total train batch size (w. parallel, distributed & accumulation) = 4
[INFO|trainer.py:1284] 2024-11-03 23:25:02,211 >>   Gradient Accumulation steps = 1
[INFO|trainer.py:1285] 2024-11-03 23:25:02,211 >>   Total optimization steps = 724140
  0%|          | 0/724140 [00:00<?, ?it/s]Traceback (most recent call last):
  File "run_hf_enc_dec_train.py", line 632, in <module>
    main()
  File "run_hf_enc_dec_train.py", line 551, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/root/miniconda3/envs/myenv/lib/python3.8/site-packages/transformers/trainer.py", line 1400, in train
    tr_loss_step = self.training_step(model, inputs)
  File "/root/miniconda3/envs/myenv/lib/python3.8/site-packages/transformers/trainer.py", line 1984, in training_step
    loss = self.compute_loss(model, inputs)
  File "/root/miniconda3/envs/myenv/lib/python3.8/site-packages/transformers/trainer.py", line 2016, in compute_loss
    outputs = model(**inputs)
  File "/root/miniconda3/envs/myenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/myenv/lib/python3.8/site-packages/transformers/models/encoder_decoder/modeling_encoder_decoder.py", line 489, in forward
    encoder_outputs = self.encoder(
  File "/root/miniconda3/envs/myenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/myenv/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 996, in forward
    encoder_outputs = self.encoder(
  File "/root/miniconda3/envs/myenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/myenv/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 585, in forward
    layer_outputs = layer_module(
  File "/root/miniconda3/envs/myenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/myenv/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 472, in forward
    self_attention_outputs = self.attention(
  File "/root/miniconda3/envs/myenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/myenv/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 402, in forward
    self_outputs = self.self(
  File "/root/miniconda3/envs/myenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/myenv/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 268, in forward
    mixed_query_layer = self.query(hidden_states)
  File "/root/miniconda3/envs/myenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/myenv/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 103, in forward
    return F.linear(input, self.weight, self.bias)
  File "/root/miniconda3/envs/myenv/lib/python3.8/site-packages/torch/nn/functional.py", line 1848, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)`
  0%|          | 0/724140 [00:00<?, ?it/s]
```
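For completeness, here is a minimal standalone check I can run on the same machine (my own snippet, not from your code): a tiny matmul is typically enough to trigger `cublasCreate`, so it should tell us whether cuBLAS fails outside your training script as well.

```python
import torch

def cublas_smoke_test():
    """Force cuBLAS initialization with a tiny GEMM, outside any training loop.

    CUBLAS_STATUS_NOT_INITIALIZED usually surfaces on the first matmul and is
    often an out-of-GPU-memory or driver/toolkit mismatch in disguise, so a
    4x4 matmul in isolation helps narrow down the cause.
    """
    if not torch.cuda.is_available():
        return "no-cuda"
    a = torch.randn(4, 4, device="cuda")
    (a @ a).sum().item()       # first GEMM -> cublasCreate under the hood
    torch.cuda.synchronize()   # surface any asynchronous CUDA error here
    return "ok"

print(cublas_smoke_test())
```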