PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
https://paddlenlp.readthedocs.io
Apache License 2.0
11.71k stars, 2.86k forks

[Bug]: 'NoneType' object has no attribute 'from_pretrained' - training fails on the long-text classification task #8621

Open jidechao opened 1 week ago

jidechao commented 1 week ago

Software environment

- paddlepaddle:
- paddlepaddle-gpu: 
- paddlenlp:

Duplicate issues

Error description

[2024-06-18 17:39:15,317] [    INFO] - tokenizer config file saved in /root/.paddlenlp/models/ernie-doc-base-zh/tokenizer_config.json
[2024-06-18 17:39:15,317] [    INFO] - Special tokens file saved in /root/.paddlenlp/models/ernie-doc-base-zh/special_tokens_map.json
Traceback (most recent call last):
  File "/data/llms/github_code/PaddleNLP-release-2.8/examples/text_classification/ernie_doc/train.py", line 345, in <module>
    do_train(args)
  File "/data/llms/github_code/PaddleNLP-release-2.8/examples/text_classification/ernie_doc/train.py", line 181, in do_train
    model = ErnieDocForSequenceClassification.from_pretrained(args.model_name_or_path, num_classes=num_classes)
  File "/data/llms/github_code/PaddleNLP-release-2.8/paddlenlp/transformers/model_utils.py", line 2087, in from_pretrained
    config, model_kwargs = cls.config_class.from_pretrained(
AttributeError: 'NoneType' object has no attribute 'from_pretrained'
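
The last two frames suggest that `cls.config_class` was `None` when `from_pretrained` ran on `ErnieDocForSequenceClassification`. The sketch below reproduces only that failure mode in plain Python. `BaseModelSketch`, `ConfigSketch`, `ModelWithoutConfig`, `ModelWithConfig` and `try_load` are hypothetical stand-ins for illustration, not PaddleNLP classes, and the sketch does not claim to show how PaddleNLP is implemented or how it should be fixed.

```python
# Hypothetical stand-ins; none of these names exist in PaddleNLP.
class ConfigSketch:
    """Minimal config object that can be built from a model name."""

    def __init__(self, name, **kwargs):
        self.name = name
        self.extra = kwargs

    @classmethod
    def from_pretrained(cls, name, **kwargs):
        return cls(name, **kwargs)


class BaseModelSketch:
    # Assumption: subclasses are expected to override this attribute.
    config_class = None

    def __init__(self, config):
        self.config = config

    @classmethod
    def from_pretrained(cls, name, **kwargs):
        # Same call shape as the failing line in the traceback:
        # attribute access on None raises AttributeError.
        config = cls.config_class.from_pretrained(name, **kwargs)
        return cls(config)


class ModelWithoutConfig(BaseModelSketch):
    """Leaves config_class as None, like the failing case."""


class ModelWithConfig(BaseModelSketch):
    """Overrides config_class, so loading succeeds."""
    config_class = ConfigSketch


def try_load(model_cls, name):
    """Return 'ok' on success, otherwise the AttributeError message."""
    try:
        model_cls.from_pretrained(name, num_classes=2)
        return "ok"
    except AttributeError as exc:
        return str(exc)


if __name__ == "__main__":
    print(try_load(ModelWithoutConfig, "ernie-doc-base-zh"))
    print(try_load(ModelWithConfig, "ernie-doc-base-zh"))
```

If this reading is right, a quick check in the real environment would be to print `ErnieDocForSequenceClassification.config_class` before the `from_pretrained` call in `train.py`.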

Steps to reproduce & code

python train.py --batch_size 4 \
--model_name_or_path ernie-doc-base-zh \
--epoch 5 \
--output_dir ./checkpoints/