Hello,
I have tried to run bert with --with_cuda False, but the model keeps running the "forward" function on CUDA. These are my command line and the error message I got.
bert -c corpus.small -v vocab.small -o bert.model --with_cuda False -e 5
Loading Vocab vocab.small
Vocab Size: 262
Loading Train Dataset corpus.small
Loading Dataset: 113it [00:00, 560232.09it/s]
Loading Test Dataset None
Creating Dataloader
Building BERT model
Creating BERT Trainer
Total Parameters: 6453768
Training Start
EP_train:0: 0%|| 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/yuni/anaconda3/envs/py3/bin/bert", line 8, in <module>
sys.exit(train())
File "/home/yuni/anaconda3/envs/py3/lib/python3.6/site-packages/bert_pytorch/main.py", line 67, in train
trainer.train(epoch)
File "/home/yuni/anaconda3/envs/py3/lib/python3.6/site-packages/bert_pytorch/trainer/pretrain.py", line 69, in train
self.iteration(epoch, self.train_data)
File "/home/yuni/anaconda3/envs/py3/lib/python3.6/site-packages/bert_pytorch/trainer/pretrain.py", line 102, in iteration
next_sent_output, mask_lm_output = self.model.forward(data["bert_input"], data["segment_label"])
File "/home/yuni/anaconda3/envs/py3/lib/python3.6/site-packages/bert_pytorch/model/language_model.py", line 24, in forward
x = self.bert(x, segment_label)
File "/home/yuni/anaconda3/envs/py3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/yuni/anaconda3/envs/py3/lib/python3.6/site-packages/bert_pytorch/model/bert.py", line 46, in forward
x = transformer.forward(x, mask)
File "/home/yuni/anaconda3/envs/py3/lib/python3.6/site-packages/bert_pytorch/model/transformer.py", line 29, in forward
x = self.input_sublayer(x, lambda _x: self.attention.forward(_x, _x, _x, mask=mask))
File "/home/yuni/anaconda3/envs/py3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/yuni/anaconda3/envs/py3/lib/python3.6/site-packages/bert_pytorch/model/utils/sublayer.py", line 18, in forward
return x + self.dropout(sublayer(self.norm(x)))
File "/home/yuni/anaconda3/envs/py3/lib/python3.6/site-packages/bert_pytorch/model/transformer.py", line 29, in <lambda>
x = self.input_sublayer(x, lambda _x: self.attention.forward(_x, _x, _x, mask=mask))
File "/home/yuni/anaconda3/envs/py3/lib/python3.6/site-packages/bert_pytorch/model/attention/multi_head.py", line 32, in forward
x, attn = self.attention(query, key, value, mask=mask, dropout=self.dropout)
File "/home/yuni/anaconda3/envs/py3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/yuni/anaconda3/envs/py3/lib/python3.6/site-packages/bert_pytorch/model/attention/single.py", line 25, in forward
return torch.matmul(p_attn, value), p_attn
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 1.95 GiB total capacity; 309.18 MiB already allocated; 125.62 MiB free; 312.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
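In case it helps narrow this down: if the --with_cuda option is declared with argparse's type=bool (an assumption on my part, I have not checked the package source), then passing the string "False" on the command line still evaluates to True, because bool() of any non-empty string is True. A minimal sketch of that Python behavior:

```python
import argparse

# Hypothetical parser mirroring a "--with_cuda" flag declared with type=bool.
parser = argparse.ArgumentParser()
parser.add_argument("--with_cuda", type=bool, default=True)

# argparse applies bool() to the raw string "False";
# bool("False") is True, since the string is non-empty.
args = parser.parse_args(["--with_cuda", "False"])
print(args.with_cuda)  # True
```

If that is the cause, hiding the GPU from PyTorch entirely (e.g. running with the environment variable CUDA_VISIBLE_DEVICES set to an empty string) might confirm it, since torch.cuda.is_available() would then return False regardless of the flag.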