iflytek / cino

CINO: Pre-trained Language Models for Chinese Minority (少数民族语言预训练模型)
http://cino.hfl-rc.com
Apache License 2.0
212 stars 28 forks

error with TCNN example #27

Open anbo724 opened 2 years ago

anbo724 commented 2 years ago

Some weights of XLMRobertaModel were not initialized from the model checkpoint at model/ and are newly initialized: ['roberta.pooler.dense.weight', 'roberta.pooler.dense.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
  File "tncc_finetune.py", line 175, in <module>
    main()
  File "tncc_finetune.py", line 172, in main
    trainer.run_finetune()
  File "tncc_finetune.py", line 155, in run_finetune
    self.train(model, train_loader, dev_loader, optimizer, schedule)
  File "tncc_finetune.py", line 120, in train
    loss.backward()
  File "/data/anbo/anaconda3/envs/transformer/lib/python3.7/site-packages/torch/tensor.py", line 221, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/data/anbo/anaconda3/envs/transformer/lib/python3.7/site-packages/torch/autograd/__init__.py", line 132, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: CUDA error: device-side assert triggered
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [1,0,0] Assertion `t >= 0 && t < n_classes` failed.

My torch and other package versions also match the requirements:
sacremoses==0.0.53
scikit-learn==0.24.2
scipy==1.7.3
sentencepiece==0.1.97
six @ file:///tmp/build/80754af9/six_1644875935023/work
threadpoolctl==3.1.0
tokenizers==0.8.1rc2
torch==1.7.1
torchaudio==0.12.1
torchvision==0.13.1
tqdm==4.64.1
transformers==3.1.0

How can I fix this?

GeekDream-x commented 2 years ago

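Hello, this happens because the class labels in the dataset do not start from 0, so they fall outside the range of classes the model predicts. This has been adjusted; please use the latest example. #28 @anbo724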
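For readers who hit the same assertion with their own data, here is a minimal sketch of the kind of check and remapping described above. It is not the code from #28: `find_out_of_range`, `remap_labels`, and the sample labels are hypothetical names and values chosen for illustration, and it assumes the labels are plain integers.

```python
def find_out_of_range(labels, n_classes):
    """Return the distinct labels that violate 0 <= t < n_classes,
    i.e. the condition checked by the failed CUDA assertion."""
    return sorted({t for t in labels if not 0 <= t < n_classes})


def remap_labels(labels):
    """Map arbitrary integer labels onto 0..n-1, preserving sorted order.

    Returns the remapped labels and the mapping, so predictions can be
    translated back to the original label values later.
    """
    label_to_index = {label: i for i, label in enumerate(sorted(set(labels)))}
    return [label_to_index[t] for t in labels], label_to_index


# Hypothetical labels that start at 1 instead of 0.
raw_labels = [1, 3, 2, 3, 1]
n_classes = len(set(raw_labels))

bad = find_out_of_range(raw_labels, n_classes)    # labels the loss would reject
remapped, mapping = remap_labels(raw_labels)      # labels shifted into 0..n-1

print("out of range before remapping:", bad)
print("remapped labels:", remapped)
print("mapping:", mapping)
```

Running a check like this on the CPU before training gives a readable error instead of a device-side assert. The same mapping should be applied to the train, dev, and test splits so that class indices stay consistent.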