hankcs / HanLP

Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic textual similarity, new word discovery, keyword and keyphrase extraction, automatic summarization, text classification and clustering, pinyin and Simplified/Traditional Chinese conversion, natural language processing
https://hanlp.hankcs.com/
Apache License 2.0

Error when loading the NER model #1830

Closed MrChocol closed 1 year ago

MrChocol commented 1 year ago

Describe the bug: Calling NER with the MSRA_NER_BERT_BASE_ZH model triggers a model download, even though the model files already exist under the load path.

Code to reproduce the issue:

import hanlp

ner = hanlp.load(hanlp.pretrained.ner.MSRA_NER_BERT_BASE_ZH)
sen_ner = ner(sentence)  # the original snippet called tok(sentence); the loaded component is ner
print(sen_ner)
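
As a side note on the reproduction snippet: to my reading of the HanLP docs, this TensorFlow NER component expects pre-tokenized input, i.e. lists of characters for this MSRA model, rather than a raw string. A hedged usage sketch (the sentence is illustrative, not taken from this issue):

import hanlp

ner = hanlp.load(hanlp.pretrained.ner.MSRA_NER_BERT_BASE_ZH)
# Pass a batch of character lists; each inner list is one sentence.
print(ner([list('萨哈夫说，伊拉克将同联合国销毁伊拉克大规模杀伤性武器特别委员会继续保持合作。')]))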



System information
OS: Windows-10-10.0.22621-SP0
Python: 3.8.12
PyTorch: 2.0.1+cpu
TensorFlow: 2.13.0
HanLP: 2.1.0-beta.50

Other info / logs

Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 843, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "C:\PowerChen\work\workspace\Idea\python\hanlp-server\test\server\hanlp_api.py", line 18, in <module>
    ner_handle = hanlp.load(hanlp.pretrained.ner.MSRA_NER_BERT_BASE_ZH)
  File "C:\PowerChen\work\environment\miniconda3\lib\site-packages\hanlp\__init__.py", line 43, in load
    return load_from_meta_file(save_dir, 'meta.json', verbose=verbose, **kwargs)
  File "C:\PowerChen\work\environment\miniconda3\lib\site-packages\hanlp\utils\component_util.py", line 186, in load_from_meta_file
    raise e from None
  File "C:\PowerChen\work\environment\miniconda3\lib\site-packages\hanlp\utils\component_util.py", line 106, in load_from_meta_file
    obj.load(save_dir, verbose=verbose, **kwargs)
  File "C:\PowerChen\work\environment\miniconda3\lib\site-packages\hanlp\common\keras_component.py", line 215, in load
    self.build(**merge_dict(self.config, training=False, logger=logger, **kwargs, overwrite=True, inplace=True))
  File "C:\PowerChen\work\environment\miniconda3\lib\site-packages\hanlp\common\keras_component.py", line 225, in build
    self.model = self.build_model(**merge_dict(self.config, training=kwargs.get('training', None),
  File "C:\PowerChen\work\environment\miniconda3\lib\site-packages\hanlp\components\taggers\transformers\transformer_tagger_tf.py", line 34, in build_model
    model, tokenizer = build_transformer(transformer, max_seq_length, len(self.transform.tag_vocab), tagging=True)
  File "C:\PowerChen\work\environment\miniconda3\lib\site-packages\hanlp\layers\transformers\loader_tf.py", line 11, in build_transformer
    tokenizer = AutoTokenizer.from_pretrained(transformer)
  File "C:\PowerChen\work\environment\miniconda3\lib\site-packages\hanlp\layers\transformers\pt_imports.py", line 65, in from_pretrained
    tokenizer = cls.from_pretrained(get_tokenizer_mirror(transformer), use_fast=use_fast,
  File "C:\PowerChen\work\environment\miniconda3\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 658, in from_pretrained
    config = AutoConfig.from_pretrained(
  File "C:\PowerChen\work\environment\miniconda3\lib\site-packages\transformers\models\auto\configuration_auto.py", line 944, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "C:\PowerChen\work\environment\miniconda3\lib\site-packages\transformers\configuration_utils.py", line 574, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "C:\PowerChen\work\environment\miniconda3\lib\site-packages\transformers\configuration_utils.py", line 629, in _get_config_dict
    resolved_config_file = cached_file(
  File "C:\PowerChen\work\environment\miniconda3\lib\site-packages\transformers\utils\hub.py", line 452, in cached_file
    raise EnvironmentError(
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like bert-base-chinese is not the path to a directory containing a file named config.json. Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.

hankcs commented 1 year ago
  1. The model files under the load path are only part of the resources a component needs at runtime. This is documented at: https://hanlp.hankcs.com/docs/install.html#download-error
  2. If a resource hosted on Hugging Face fails to download, please follow the log and ask them for help (a minimal offline-mode sketch follows the quoted error below):

OSError: We couldn't connect to 'https://huggingface.co/' to load this file, couldn't find it in the cached files and it looks like bert-base-chinese is not the path to a directory containing a file named config.json. Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
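
For reference, here is a minimal sketch of the offline-mode workaround described in the linked pages, assuming bert-base-chinese can be fetched once from a machine with internet access. snapshot_download comes from the huggingface_hub package that transformers depends on, the environment variables are read by transformers/huggingface_hub, and the HanLP call mirrors the snippet above:

# Step 1: run once where huggingface.co is reachable, to fill the local HF cache:
#   python -c "from huggingface_hub import snapshot_download; snapshot_download('bert-base-chinese')"

# Step 2: in the environment where HanLP runs (after copying the HF cache over if needed).
# The offline flags must be set before transformers is imported.
import os
os.environ["TRANSFORMERS_OFFLINE"] = "1"
os.environ["HF_HUB_OFFLINE"] = "1"

import hanlp

ner = hanlp.load(hanlp.pretrained.ner.MSRA_NER_BERT_BASE_ZH)

If the machine can reach a Hugging Face mirror instead, recent huggingface_hub versions should also honor the HF_ENDPOINT environment variable, which avoids contacting huggingface.co directly.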