Some weights of XLMRobertaModel were not initialized from the model checkpoint at model/ and are newly initialized: ['roberta.pooler.dense.weight', 'roberta.pooler.dense.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
  File "tncc_finetune.py", line 175, in <module>
    main()
  File "tncc_finetune.py", line 172, in main
    trainer.run_finetune()
  File "tncc_finetune.py", line 155, in run_finetune
    self.train(model, train_loader, dev_loader, optimizer, schedule)
  File "tncc_finetune.py", line 120, in train
    loss.backward()
  File "/data/anbo/anaconda3/envs/transformer/lib/python3.7/site-packages/torch/tensor.py", line 221, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/data/anbo/anaconda3/envs/transformer/lib/python3.7/site-packages/torch/autograd/__init__.py", line 132, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: CUDA error: device-side assert triggered
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [1,0,0] Assertion `t >= 0 && t < n_classes` failed.
The versions of torch and the other packages also match the requirements:
sacremoses==0.0.53
scikit-learn==0.24.2
scipy==1.7.3
sentencepiece==0.1.97
six @ file:///tmp/build/80754af9/six_1644875935023/work
threadpoolctl==3.1.0
tokenizers==0.8.1rc2
torch==1.7.1
torchaudio==0.12.1
torchvision==0.13.1
tqdm==4.64.1
transformers==3.1.0
How can I fix this?
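The assertion `t >= 0 && t < n_classes` in the traceback is raised by the NLL loss kernel when a target label falls outside the range [0, n_classes). A minimal sketch for checking the labels before they reach the loss is shown here; the helper name `find_out_of_range_labels` and the sample labels are hypothetical and not taken from `tncc_finetune.py`.

```python
from typing import Iterable, List, Tuple


def find_out_of_range_labels(labels: Iterable[int], n_classes: int) -> List[Tuple[int, int]]:
    """Return (position, label) pairs for labels outside [0, n_classes)."""
    return [
        (i, int(t))
        for i, t in enumerate(labels)
        if not 0 <= int(t) < n_classes
    ]


# Hypothetical example: a 3-class classifier fed 1-based labels.
bad = find_out_of_range_labels([1, 2, 3, 1], n_classes=3)
print(bad)  # [(2, 3)]
```

For a label tensor, calling `.tolist()` on it first should let the same check run on a batch. If the check reports nothing, rerunning on CPU or with the environment variable `CUDA_LAUNCH_BLOCKING=1` usually gives a traceback that points at the failing operation more precisely.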