huggingface / huggingface_hub

The official Python client for the Huggingface Hub.
https://huggingface.co/docs/huggingface_hub
Apache License 2.0

`TypeError: string indices must be integers` when iterating over models from `list_models` #1817

Closed: ademait closed this issue 11 months ago

ademait commented 11 months ago

Describe the bug

Hi, this looks like an internal error. When I iterate over the full result of `list_models` (now returned as a Python iterator), it fails at some point and raises a `TypeError`.

It also prints `Invalid model-index. Not loading eval results into CardData.` for some models, which I guess is a warning rather than an exception. I've seen this message reported in the archived nateraw/modelcards repo. What does this warning mean? It appeared 978 times before the crash, out of 358114 models iterated (the value of `i` in the reproduction code).

Reproduction

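from huggingface_hub import HfApi

api = HfApi()
# list_models returns a lazy iterator; models are fetched as the loop advances.
models = api.list_models(full=True, cardData=True, fetch_config=True, sort="lastModified", direction=-1)
i = 0
for d in models:
    i += 1  # count every model that was yielded successfully
print(i)  # not reached when the iterator raises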

This code simply iterates over all models and counts them.
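Because the exception is raised inside the iterator, the final `print(i)` never runs when the loop crashes. A minimal sketch of a helper that still reports how far iteration got is shown here. The name `count_until_failure` is hypothetical and not part of `huggingface_hub`; it works on any iterable, so it can be tried without network access.

```python
from typing import Iterable, Optional, Tuple


def count_until_failure(items: Iterable) -> Tuple[int, Optional[Exception]]:
    """Consume `items` and return (number of items yielded, error that stopped iteration or None)."""
    count = 0
    iterator = iter(items)
    while True:
        try:
            next(iterator)
        except StopIteration:
            # The iterator finished normally.
            return count, None
        except Exception as exc:
            # The iterator itself raised, e.g. a TypeError while building an item.
            return count, exc
        count += 1


def failing_items():
    """Stand-in for an iterator that breaks partway through (illustrative only)."""
    yield "model-a"
    yield "model-b"
    raise TypeError("string indices must be integers")


count, error = count_until_failure(failing_items())
print(count, type(error).__name__)
```

Passing the `models` iterator from the reproduction to this helper should give the count at the point of failure. Note that a generator that has raised is finished, so catching the error does not allow skipping the bad item and continuing.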

Logs

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/test_api/testAPI.ipynb Cell 16 line 2
      1 i = 0
----> 2 for d in models:
      3     i +=1
      4 print(i)

File ~/anaconda3/envs/hf/lib/python3.9/site-packages/huggingface_hub/hf_api.py:1328, in HfApi.list_models(self, filter, author, search, emissions_thresholds, sort, direction, limit, full, cardData, fetch_config, token)
   1326 if "siblings" not in item:
   1327     item["siblings"] = None
-> 1328 model_info = ModelInfo(**item)
   1329 if emissions_thresholds is None or _is_emission_within_treshold(model_info, *emissions_thresholds):
   1330     yield model_info

File ~/anaconda3/envs/hf/lib/python3.9/site-packages/huggingface_hub/hf_api.py:527, in ModelInfo.__init__(self, **kwargs)
    524 self.mask_token = kwargs.pop("mask_token", None)
    525 card_data = kwargs.pop("cardData", None) or kwargs.pop("card_data", None)
    526 self.card_data = (
--> 527     ModelCardData(**card_data, ignore_metadata_errors=True) if isinstance(card_data, dict) else card_data
    528 )
    530 self.widget_data = kwargs.pop("widget_data", None)
    531 self.model_index = kwargs.pop("model-index", None) or kwargs.pop("model_index", None)

File ~/anaconda3/envs/hf/lib/python3.9/site-packages/huggingface_hub/repocard_data.py:305, in ModelCardData.__init__(self, language, license, library_name, tags, datasets, metrics, eval_results, model_name, ignore_metadata_errors, **kwargs)
...
--> 552     name = elem["name"]
    553     results = elem["results"]
    554     for result in results:

TypeError: string indices must be integers

System info

- huggingface_hub version: 0.19.0
- Platform: Linux-6.2.16-4-pve-x86_64-with-glibc2.28
- Python version: 3.9.12
- Running in iPython ?: Yes
- iPython shell: ZMQInteractiveShell
- Running in notebook ?: Yes
- Running in Google Colab ?: No
- Token path ?: /root/.cache/huggingface/token
- Has saved token ?: False
- Configured git credential helpers: 
- FastAI: N/A
- Tensorflow: N/A
- Torch: N/A
- Jinja2: 3.1.2
- Graphviz: N/A
- Pydot: N/A
- Pillow: N/A
- hf_transfer: N/A
- gradio: N/A
- tensorboard: N/A
- numpy: N/A
- pydantic: N/A
- aiohttp: N/A
- ENDPOINT: https://huggingface.co
- HF_HUB_CACHE: /root/.cache/huggingface/hub
- HF_ASSETS_CACHE: /root/.cache/huggingface/assets
- HF_TOKEN_PATH: /root/.cache/huggingface/token
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False
- HF_HUB_ETAG_TIMEOUT: 10
- HF_HUB_DOWNLOAD_TIMEOUT: 10

ademait commented 11 months ago

I checked a few things. The crash is not tied to a single repo: with sorting (`sort="lastModified", direction=-1`) it crashes on the pidanr/pegasus-bbcnews repo, and without sorting it crashes on Waynehillsdev/Wayne_NLP_mT5.

I guess it has to do with the evaluation results values, but I can't see the exact failure point.
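For what it's worth, the last traceback frame (`name = elem["name"]`) suggests that `elem` was a string rather than a dict. Here is a minimal sketch of one way that can happen, assuming the `model-index` metadata is a single dict instead of a list of dicts. This shape is an assumption: the actual metadata of the two repos was not inspected. `extract_names` and `safe_extract_names` are hypothetical helpers, not the library's code and not the fix from the PR.

```python
def extract_names(model_index):
    """Hypothetical parser that mirrors the `elem["name"]` access seen in the traceback."""
    return [elem["name"] for elem in model_index]


def safe_extract_names(model_index):
    """Defensive variant: ignore anything that is not a list of dicts."""
    if not isinstance(model_index, list):
        return []
    return [elem["name"] for elem in model_index if isinstance(elem, dict) and "name" in elem]


well_formed = [{"name": "my-model", "results": []}]
malformed = {"name": "my-model", "results": []}  # assumed shape: dict, not list of dicts

print(extract_names(well_formed))

try:
    # Iterating a dict yields its keys (strings), so elem["name"] indexes a str.
    extract_names(malformed)
except TypeError as exc:
    print(f"TypeError: {exc}")

print(safe_extract_names(malformed))
```

The exact wording of the `TypeError` message differs between Python versions, so the sketch only relies on the exception type.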

Wauplin commented 11 months ago

Hey @ademait, thanks for reporting! And thanks for pinpointing exactly which models were failing; it helped a lot in fixing the issue. I just opened a PR (see https://github.com/huggingface/huggingface_hub/pull/1821) that fixes it. Once it's merged, I'll make a hot-fix release, as this was working in v0.18.0.

Wauplin commented 11 months ago

PR is merged. I shipped a hot-fix release (v0.19.1) with the fix: https://github.com/huggingface/huggingface_hub/releases/tag/v0.19.1 Thanks again for the report!
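For anyone landing here later, a small sketch for checking whether the installed version is the affected one. It assumes, based only on this thread, that v0.19.0 is the broken release (v0.18.0 worked and v0.19.1 contains the fix). The helper name `is_affected` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def is_affected(installed: str) -> bool:
    """Return True only for 0.19.0, the release reported as broken in this thread."""
    return installed.strip() == "0.19.0"


try:
    installed = version("huggingface_hub")
except PackageNotFoundError:
    installed = None

if installed is None:
    print("huggingface_hub is not installed")
elif is_affected(installed):
    print(f"huggingface_hub {installed} is affected; upgrading to 0.19.1 or later should help")
else:
    print(f"huggingface_hub {installed} is not the release reported in this issue")
```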