urchade / GLiNER

Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024
https://arxiv.org/abs/2311.08526
Apache License 2.0

GLiNER.from_pretrained raises OSError: config.json file not found in Hugging Face model repository #173

Open pirchi1 opened 3 months ago

pirchi1 commented 3 months ago

I encountered an issue when trying to load the urchade/gliner_large-v1 model using the GLiNER.from_pretrained method. The process fails with an OSError, indicating that the config.json file is not found in the expected location within the Hugging Face model repository.

Steps to Reproduce:

1. Install the required dependencies:

    torch>=2.0.0
    transformers>=4.38.2
    huggingface_hub>=0.21.4
    onnxruntime
    sentencepiece

2. Run the following code in a Python script:

    from gliner import GLiNER

    model_path = "urchade/gliner_large-v1"
    model = GLiNER.from_pretrained(model_path, load_tokenizer=True).to("cuda:0")

The error trace:

    Fetching 4 files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 68759.08it/s]
    Traceback (most recent call last):
      File "", line 1, in
      File "/opt/conda/envs/whisper_ner/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
        return fn(*args, **kwargs)
      File "/opt/conda/envs/whisper_ner/lib/python3.10/site-packages/huggingface_hub/hub_mixin.py", line 569, in from_pretrained
        instance = cls._from_pretrained(
      File "/home/ec2-user/GLiNER/gliner/model.py", line 543, in _from_pretrained
        tokenizer = AutoTokenizer.from_pretrained(model_dir)
      File "/opt/conda/envs/whisper_ner/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 853, in from_pretrained
        config = AutoConfig.from_pretrained(
      File "/opt/conda/envs/whisper_ner/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 972, in from_pretrained
        config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
      File "/opt/conda/envs/whisper_ner/lib/python3.10/site-packages/transformers/configuration_utils.py", line 632, in get_config_dict
        config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
      File "/opt/conda/envs/whisper_ner/lib/python3.10/site-packages/transformers/configuration_utils.py", line 689, in _get_config_dict
        resolved_config_file = cached_file(
      File "/opt/conda/envs/whisper_ner/lib/python3.10/site-packages/transformers/utils/hub.py", line 373, in cached_file
        raise EnvironmentError(
    OSError: /home/ec2-user/.cache/huggingface/hub/models--urchade--gliner_large-v1/snapshots/1f55b526b24c7576857d4eb2b047cc77b0143594 does not appear to have a file named config.json. Checkout 'https://huggingface.co//home/ec2-user/.cache/huggingface/hub/models--urchade--gliner_large-v1/snapshots/1f55b526b24c7576857d4eb2b047cc77b0143594/tree/None' for available files.
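The trace shows AutoTokenizer.from_pretrained being pointed at the local snapshot directory and failing because no config.json is present there. A quick way to confirm this is to check the directory directly. The helper below is only a sketch: the name missing_files and its required parameter are hypothetical and are not part of GLiNER, transformers, or huggingface_hub.

```python
from pathlib import Path


def missing_files(model_dir, required=("config.json",)):
    """Return the names in `required` that are not regular files in model_dir.

    Hypothetical diagnostic helper: it only inspects the local directory
    and makes no network calls.
    """
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]
```

Calling it with the snapshot path quoted in the OSError message should return a non-empty list if the file really is absent, which would match the error above.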

hari-ag00 commented 3 months ago

Try removing the load_tokenizer=True argument.
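A minimal sketch of that suggestion, reusing the model name from the report. The wrapper load_gliner and its device parameter are hypothetical names for illustration, and whether dropping the argument resolves the error for a given GLiNER version is untested here.

```python
def load_gliner(model_path="urchade/gliner_large-v1", device="cpu"):
    """Load a GLiNER model without passing load_tokenizer=True.

    Hypothetical wrapper around the call from the report. gliner is
    imported inside the function so the module can be imported even
    where gliner is not installed.
    """
    from gliner import GLiNER

    # Same call as in the report, minus the load_tokenizer=True argument.
    model = GLiNER.from_pretrained(model_path)
    return model.to(device)
```

For the setup in the report, this would be called as load_gliner(device="cuda:0").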