qianxliu closed this issue 4 years ago
10/03/2020 21:40:14 - INFO - root - Loading features from cached file E:/nlp/EasyBert/ner/datasets/cluener/cached_span-train_bert-base_128_cluener
10/03/2020 21:40:16 - INFO - root - ***** Running training *****
10/03/2020 21:40:16 - INFO - root - Num examples = 10748
10/03/2020 21:40:16 - INFO - root - Num Epochs = 4
10/03/2020 21:40:16 - INFO - root - Instantaneous batch size per GPU = 24
10/03/2020 21:40:16 - INFO - root - Total train batch size (w. parallel, distributed & accumulation) = 24
10/03/2020 21:40:16 - INFO - root - Gradient Accumulation steps = 1
10/03/2020 21:40:16 - INFO - root - Total optimization steps = 1792
Traceback (most recent call last):
  File "run_ner_span.py", line 507, in <module>
    main()
  File "run_ner_span.py", line 448, in main
    global_step, tr_loss = train(args, train_dataset, model, tokenizer)
  File "run_ner_span.py", line 141, in train
    outputs = model(**inputs)
  File "E:\Tools\Miniconda\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "E:\nlp\EasyBert\ner\models\bert_for_ner.py", line 115, in forward
    active_loss = attention_mask.view(-1) == 1
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
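@Neteraxe If you are training on multiple GPUs, make the tensor contiguous before flattening it: attention_mask.contiguous().view(-1) == 1
Thanks, training now succeeds.
Training on a 4-core CPU raises the error above, and I don't know how to fix it. @lonePatient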
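For anyone hitting the same RuntimeError, here is a minimal sketch of what the suggested change does. It is not taken from bert_for_ner.py: the tensor `full` and the column slice are assumptions, used only to produce a non-contiguous mask. The real mask in the training script may become non-contiguous for a different reason.

```python
import torch

# Hypothetical stand-in for a padded batch of attention masks.
full = torch.ones(4, 8, dtype=torch.long)

# Slicing columns keeps the original row stride (8) while each row now has
# only 5 elements, so the result is a non-contiguous view of `full`.
attention_mask = full[:, :5]
assert not attention_mask.is_contiguous()

# .view(-1) cannot flatten this layout without copying, so it raises.
try:
    attention_mask.view(-1)
    view_failed = False
except RuntimeError:
    view_failed = True

# Fix from the comment above: copy into contiguous memory, then flatten.
active_loss = attention_mask.contiguous().view(-1) == 1

# Alternative named in the error message: reshape copies only when needed.
active_loss_alt = attention_mask.reshape(-1) == 1
```

Both variants produce the same flattened boolean mask. reshape(-1) is the shorter change, since it behaves like view(-1) on contiguous input and falls back to a copy otherwise.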