huggingface / huggingface_hub

The official Python client for the Huggingface Hub.
https://huggingface.co/docs/huggingface_hub
Apache License 2.0

KeyError: 'safe' when calling HfFileSystem.ls() #2523

Closed LiteralGenie closed 1 month ago

LiteralGenie commented 1 month ago

Describe the bug

Calling HfFileSystem().ls(some_repo_id) raises a KeyError for every repository I've tried. This wasn't happening yesterday, and switching to an older huggingface_hub version does not seem to help.

The object that triggers the KeyError seems to come from the security field in this API response: https://huggingface.co/api/models/tiiuae/falcon-7b-instruct/tree/main?expand=True (the code doesn't seem to expect the scan fields to be wrapped in an hf key).

As a workaround, changing this line in hf_api.py seems to fix things.

# before
safe=security["safe"], av_scan=security["avScan"], pickle_import_scan=security["pickleImportScan"]

# after
safe=security['hf']["safe"], av_scan=security['hf']["avScan"], pickle_import_scan=security['hf']["pickleImportScan"]
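
A more defensive variant of this workaround would accept both payload shapes, so the client keeps working whether or not the server wraps the fields in hf. The sketch below is only an illustration: unwrap_security is a hypothetical helper, not part of huggingface_hub, and it assumes the only two shapes are the flat one the client expects and the hf-wrapped one in the API response.

```python
from typing import Any, Dict, Optional


def unwrap_security(security: Optional[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the scan fields whether or not they are nested under "hf".

    Hypothetical helper for illustration; not part of huggingface_hub.
    """
    if security is None:
        return None
    # Wrapped shape: {"hf": {"safe": ..., "avScan": ..., "pickleImportScan": ...}}
    inner = security.get("hf")
    if isinstance(inner, dict):
        security = inner
    # .get() avoids a KeyError if a field is missing in either shape.
    return {
        "safe": security.get("safe"),
        "av_scan": security.get("avScan"),
        "pickle_import_scan": security.get("pickleImportScan"),
    }


# Both shapes should normalize to the same result.
flat = {
    "safe": True,
    "avScan": {"virusFound": False, "virusNames": None},
    "pickleImportScan": None,
}
wrapped = {"hf": flat}
print(unwrap_security(flat) == unwrap_security(wrapped))
```

Using .get() instead of indexing trades a hard failure for None values, which may or may not be what a real client wants; a stricter version could raise a descriptive error instead.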

Reproduction

# test.py
from huggingface_hub import HfFileSystem

id_repo = "tiiuae/falcon-7b-instruct"
files = HfFileSystem().ls(id_repo)

# shell
python3 -m venv venv
./venv/bin/python -m pip install huggingface-hub
./venv/bin/python test.py

Logs

Traceback (most recent call last):
  File "/tmp/test.py", line 4, in <module>
    files = HfFileSystem().ls(id_repo)
  File "/tmp/venv/lib/python3.10/site-packages/huggingface_hub/hf_file_system.py", line 292, in ls
    out = self._ls_tree(path, refresh=refresh, revision=revision, **kwargs)
  File "/tmp/venv/lib/python3.10/site-packages/huggingface_hub/hf_file_system.py", line 383, in _ls_tree
    for path_info in tree:
  File "/tmp/venv/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 2913, in list_repo_tree
    yield (RepoFile(**path_info) if path_info["type"] == "file" else RepoFolder(**path_info))
  File "/tmp/venv/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 638, in __init__
    safe=security["safe"], av_scan=security["avScan"], pickle_import_scan=security["pickleImportScan"]
KeyError: 'safe'

Response from https://huggingface.co/api/models/tiiuae/falcon-7b-instruct/tree/main?expand=True

[
    {
        "type": "directory",
        "oid": "8c72a9a6bbb748da3b33f372c24c4c89fb87c5d6",
        "size": 0,
        "path": "coreml",
        "lastCommit": {
            "id": "983c105219393d8678ead83d5eddbc040ba4d976",
            "title": "coreml-weights (#8)",
            "date": "2023-05-30T06:14:13.000Z"
        }
    },
    {
        "type": "directory",
        "oid": "49c706563552e42944d982668ea573f7c72b4abe",
        "size": 0,
        "path": "coreml/text-generation",
        "lastCommit": {
            "id": "983c105219393d8678ead83d5eddbc040ba4d976",
            "title": "coreml-weights (#8)",
            "date": "2023-05-30T06:14:13.000Z"
        }
    },
    {
        "type": "directory",
        "oid": "e41c2c2d6eab4eb55fe8c9a775aeb63f7e080b77",
        "size": 0,
        "path": "coreml/text-generation/falcon-7b-64-float32.mlpackage",
        "lastCommit": {
            "id": "983c105219393d8678ead83d5eddbc040ba4d976",
            "title": "coreml-weights (#8)",
            "date": "2023-05-30T06:14:13.000Z"
        }
    },
    {
        "type": "directory",
        "oid": "f30e81adebc25f38b8a1d2a1787f92df47a858e0",
        "size": 0,
        "path": "coreml/text-generation/falcon-7b-64-float32.mlpackage/Data",
        "lastCommit": {
            "id": "983c105219393d8678ead83d5eddbc040ba4d976",
            "title": "coreml-weights (#8)",
            "date": "2023-05-30T06:14:13.000Z"
        }
    },
    {
        "type": "directory",
        "oid": "6b378bbd9619260945dec11fb7a42ea378ca9e67",
        "size": 0,
        "path": "coreml/text-generation/falcon-7b-64-float32.mlpackage/Data/com.apple.CoreML",
        "lastCommit": {
            "id": "983c105219393d8678ead83d5eddbc040ba4d976",
            "title": "coreml-weights (#8)",
            "date": "2023-05-30T06:14:13.000Z"
        }
    },
    {
        "type": "directory",
        "oid": "25b3009c1920af56e3397dfb6918496eab0a313c",
        "size": 0,
        "path": "coreml/text-generation/falcon-7b-64-float32.mlpackage/Data/com.apple.CoreML/weights",
        "lastCommit": {
            "id": "983c105219393d8678ead83d5eddbc040ba4d976",
            "title": "coreml-weights (#8)",
            "date": "2023-05-30T06:14:13.000Z"
        }
    },
    {
        "type": "file",
        "oid": "c7d9f3332a950355d5a77d85000f05e6f45435ea",
        "size": 1477,
        "path": ".gitattributes",
        "lastCommit": {
            "id": "7972440d80cedb03e4d9a0bc81e79e47979e8aff",
            "title": "initial commit",
            "date": "2023-04-25T06:21:01.000Z"
        },
        "security": {
            "hf": {
                "blobId": "c7d9f3332a950355d5a77d85000f05e6f45435ea",
                "name": ".gitattributes",
                "safe": true,
                "indexed": false,
                "avScan": {
                    "virusFound": false,
                    "virusNames": null
                },
                "pickleImportScan": null
            }
        }
    },
    {
        "type": "file",
        "oid": "8f3a985412d0a985c7f00396e8047c971ad4f5a4",
        "size": 9821,
        "path": "README.md",
        "lastCommit": {
            "id": "eb410fb6ffa9028e97adb801f0d6ec46d02f8b07",
            "title": "Revert in-library PR (#63)",
            "date": "2023-07-13T13:52:22.000Z"
        },
        "security": {
            "hf": {
                "blobId": "8f3a985412d0a985c7f00396e8047c971ad4f5a4",
                "name": "README.md",
                "safe": true,
                "indexed": false,
                "avScan": {
                    "virusFound": false,
                    "virusNames": null
                },
                "pickleImportScan": null
            }
        }
    },
    {
        "type": "file",
        "oid": "84d8843072cbc300692c6bccff5b9c08c430498e",
        "size": 1048,
        "path": "config.json",
        "lastCommit": {
            "id": "cf4b3c42ce2fdfe24f753f0f0d179202fea59c99",
            "title": "Move to in-libary checkpoint (for real this time) (#88)",
            "date": "2023-09-29T14:32:23.000Z"
        },
        "security": {
            "hf": {
                "blobId": "84d8843072cbc300692c6bccff5b9c08c430498e",
                "name": "config.json",
                "safe": true,
                "indexed": false,
                "avScan": {
                    "virusFound": false,
                    "virusNames": null
                },
                "pickleImportScan": null
            }
        }
    },
    {
        "type": "file",
        "oid": "def8c2bf7e0f85be115be9e6a79dd3c5aa50a99d",
        "size": 7163,
        "path": "configuration_falcon.py",
        "lastCommit": {
            "id": "cf4b3c42ce2fdfe24f753f0f0d179202fea59c99",
            "title": "Move to in-libary checkpoint (for real this time) (#88)",
            "date": "2023-09-29T14:32:23.000Z"
        },
        "security": {
            "hf": {
                "blobId": "def8c2bf7e0f85be115be9e6a79dd3c5aa50a99d",
                "name": "configuration_falcon.py",
                "safe": true,
                "indexed": false,
                "avScan": {
                    "virusFound": false,
                    "virusNames": null
                },
                "pickleImportScan": null
            }
        }
    },
    {
        "type": "file",
        "oid": "78f6438b82d40644bb5195d2aefee91ad5d5ff41",
        "size": 396524,
        "lfs": {
            "oid": "b12b1d5cab8d237975a831477e3cf5997eef5e932636a0654ef1695b04eb9412",
            "size": 396524,
            "pointerSize": 131
        },
        "path": "coreml/text-generation/falcon-7b-64-float32.mlpackage/Data/com.apple.CoreML/model.mlmodel",
        "lastCommit": {
            "id": "983c105219393d8678ead83d5eddbc040ba4d976",
            "title": "coreml-weights (#8)",
            "date": "2023-05-30T06:14:13.000Z"
        },
        "security": {
            "hf": {
                "blobId": "78f6438b82d40644bb5195d2aefee91ad5d5ff41",
                "name": "model.mlmodel",
                "safe": true,
                "indexed": false,
                "avScan": {
                    "virusFound": false,
                    "virusNames": null
                },
                "pickleImportScan": null
            }
        }
    },
    {
        "type": "file",
        "oid": "65145e28d9d7ff9c89ca18cf8ac50b9005db9251",
        "size": 27693883200,
        "lfs": {
            "oid": "bc5be03ba082315bb432dbe51f3f80f9d6b1ee1ca05c40d072bad8b0f91c4f0f",
            "size": 27693883200,
            "pointerSize": 136
        },
        "path": "coreml/text-generation/falcon-7b-64-float32.mlpackage/Data/com.apple.CoreML/weights/weight.bin",
        "lastCommit": {
            "id": "983c105219393d8678ead83d5eddbc040ba4d976",
            "title": "coreml-weights (#8)",
            "date": "2023-05-30T06:14:13.000Z"
        },
        "security": {
            "hf": {
                "blobId": "65145e28d9d7ff9c89ca18cf8ac50b9005db9251",
                "name": "coreml/text-generation/falcon-7b-64-float32.mlpackage/Data/com.apple.CoreML/weights/weight.bin",
                "safe": true,
                "indexed": false,
                "avScan": {
                    "virusFound": false,
                    "virusNames": null
                },
                "pickleImportScan": null
            }
        }
    },
    {
        "type": "file",
        "oid": "d65a448c77eec0559db9db013a44eca0c9b0d795",
        "size": 617,
        "path": "coreml/text-generation/falcon-7b-64-float32.mlpackage/Manifest.json",
        "lastCommit": {
            "id": "983c105219393d8678ead83d5eddbc040ba4d976",
            "title": "coreml-weights (#8)",
            "date": "2023-05-30T06:14:13.000Z"
        },
        "security": {
            "hf": {
                "blobId": "d65a448c77eec0559db9db013a44eca0c9b0d795",
                "name": "Manifest.json",
                "safe": true,
                "indexed": false,
                "avScan": {
                    "virusFound": false,
                    "virusNames": null
                },
                "pickleImportScan": null
            }
        }
    },
    {
        "type": "file",
        "oid": "02b145e38790e52c2161b8d5ed97ee967bc3307e",
        "size": 117,
        "path": "generation_config.json",
        "lastCommit": {
            "id": "cf4b3c42ce2fdfe24f753f0f0d179202fea59c99",
            "title": "Move to in-libary checkpoint (for real this time) (#88)",
            "date": "2023-09-29T14:32:23.000Z"
        },
        "security": {
            "hf": {
                "blobId": "02b145e38790e52c2161b8d5ed97ee967bc3307e",
                "name": "generation_config.json",
                "safe": true,
                "indexed": false,
                "avScan": {
                    "virusFound": false,
                    "virusNames": null
                },
                "pickleImportScan": null
            }
        }
    },
    {
        "type": "file",
        "oid": "70c059b13f10f3e20d34a31174dd7da53cfb93ad",
        "size": 1154,
        "path": "handler.py",
        "lastCommit": {
            "id": "7f5eb0f424a861cab1ce0c8a4cff1c264b3ceae9",
            "title": "Add hf endpoint handler.py (#24)",
            "date": "2023-06-05T10:59:47.000Z"
        },
        "security": {
            "hf": {
                "blobId": "70c059b13f10f3e20d34a31174dd7da53cfb93ad",
                "name": "handler.py",
                "safe": true,
                "indexed": false,
                "avScan": {
                    "virusFound": false,
                    "virusNames": null
                },
                "pickleImportScan": null
            }
        }
    },
    {
        "type": "file",
        "oid": "834822cbe81585262b727c3fdbe520a34fd24ad4",
        "size": 56920,
        "path": "modeling_falcon.py",
        "lastCommit": {
            "id": "cf4b3c42ce2fdfe24f753f0f0d179202fea59c99",
            "title": "Move to in-libary checkpoint (for real this time) (#88)",
            "date": "2023-09-29T14:32:23.000Z"
        },
        "security": {
            "hf": {
                "blobId": "834822cbe81585262b727c3fdbe520a34fd24ad4",
                "name": "modeling_falcon.py",
                "safe": true,
                "indexed": false,
                "avScan": {
                    "virusFound": false,
                    "virusNames": null
                },
                "pickleImportScan": null
            }
        }
    },
    {
        "type": "file",
        "oid": "fc373b9b20a661af0d2e3671ade2b963f5a5fef5",
        "size": 9951028193,
        "lfs": {
            "oid": "66acf4bebb68593952a51575cb02dbf258a606e236c6b82b6b60c3b1e9089e66",
            "size": 9951028193,
            "pointerSize": 135
        },
        "path": "pytorch_model-00001-of-00002.bin",
        "lastCommit": {
            "id": "06337a928f7443faa9957b52ed4c46113654130c",
            "title": "Upload RWForCausalLM",
            "date": "2023-04-25T06:24:28.000Z"
        },
        "security": {
            "hf": {
                "blobId": "fc373b9b20a661af0d2e3671ade2b963f5a5fef5",
                "name": "pytorch_model-00001-of-00002.bin",
                "safe": true,
                "indexed": false,
                "avScan": {
                    "virusFound": false,
                    "virusNames": null
                },
                "pickleImportScan": {
                    "highestSafetyLevel": "innocuous",
                    "imports": [
                        {
                            "module": "torch._utils",
                            "name": "_rebuild_tensor_v2",
                            "safety": "innocuous"
                        },
                        {
                            "module": "collections",
                            "name": "OrderedDict",
                            "safety": "innocuous"
                        },
                        {
                            "module": "torch",
                            "name": "BFloat16Storage",
                            "safety": "innocuous"
                        }
                    ]
                }
            }
        }
    },
    {
        "type": "file",
        "oid": "ed32c6ba8d523d31ccfc88975912ba7ca7f8af17",
        "size": 4483421659,
        "lfs": {
            "oid": "1de823c84b1c8b9889ac2a6c670ec6002a71776abd42cdf51bb3acd4c9938b29",
            "size": 4483421659,
            "pointerSize": 135
        },
        "path": "pytorch_model-00002-of-00002.bin",
        "lastCommit": {
            "id": "06337a928f7443faa9957b52ed4c46113654130c",
            "title": "Upload RWForCausalLM",
            "date": "2023-04-25T06:24:28.000Z"
        },
        "security": {
            "hf": {
                "blobId": "ed32c6ba8d523d31ccfc88975912ba7ca7f8af17",
                "name": "pytorch_model-00002-of-00002.bin",
                "safe": true,
                "indexed": false,
                "avScan": {
                    "virusFound": false,
                    "virusNames": null
                },
                "pickleImportScan": {
                    "highestSafetyLevel": "innocuous",
                    "imports": [
                        {
                            "module": "torch._utils",
                            "name": "_rebuild_tensor_v2",
                            "safety": "innocuous"
                        },
                        {
                            "module": "torch",
                            "name": "BFloat16Storage",
                            "safety": "innocuous"
                        },
                        {
                            "module": "collections",
                            "name": "OrderedDict",
                            "safety": "innocuous"
                        }
                    ]
                }
            }
        }
    },
    {
        "type": "file",
        "oid": "1d92decce70fb4d5e840694c6947d2abfef27fdf",
        "size": 16924,
        "path": "pytorch_model.bin.index.json",
        "lastCommit": {
            "id": "06337a928f7443faa9957b52ed4c46113654130c",
            "title": "Upload RWForCausalLM",
            "date": "2023-04-25T06:24:28.000Z"
        },
        "security": {
            "hf": {
                "blobId": "1d92decce70fb4d5e840694c6947d2abfef27fdf",
                "name": "pytorch_model.bin.index.json",
                "safe": true,
                "indexed": false,
                "avScan": {
                    "virusFound": false,
                    "virusNames": null
                },
                "pickleImportScan": null
            }
        }
    },
    {
        "type": "file",
        "oid": "24f43d813328da380b3d684c019f9c6d84df6b50",
        "size": 281,
        "path": "special_tokens_map.json",
        "lastCommit": {
            "id": "bacaa8a883d7eced821ab877344eaa840f39004f",
            "title": "Upload tokenizer",
            "date": "2023-05-24T05:36:55.000Z"
        },
        "security": {
            "hf": {
                "blobId": "24f43d813328da380b3d684c019f9c6d84df6b50",
                "name": "special_tokens_map.json",
                "safe": true,
                "indexed": false,
                "avScan": {
                    "virusFound": false,
                    "virusNames": null
                },
                "pickleImportScan": null
            }
        }
    },
    {
        "type": "file",
        "oid": "24f2d2e20d26ae4f0729da0d008e0ee6d81fc560",
        "size": 2734130,
        "path": "tokenizer.json",
        "lastCommit": {
            "id": "bacaa8a883d7eced821ab877344eaa840f39004f",
            "title": "Upload tokenizer",
            "date": "2023-05-24T05:36:55.000Z"
        },
        "security": {
            "hf": {
                "blobId": "24f2d2e20d26ae4f0729da0d008e0ee6d81fc560",
                "name": "tokenizer.json",
                "safe": true,
                "indexed": false,
                "avScan": {
                    "virusFound": false,
                    "virusNames": null
                },
                "pickleImportScan": null
            }
        }
    },
    {
        "type": "file",
        "oid": "4aa644a0eca5b539ec8703d62d4b957c74a54963",
        "size": 287,
        "path": "tokenizer_config.json",
        "lastCommit": {
            "id": "cf4b3c42ce2fdfe24f753f0f0d179202fea59c99",
            "title": "Move to in-libary checkpoint (for real this time) (#88)",
            "date": "2023-09-29T14:32:23.000Z"
        },
        "security": {
            "hf": {
                "blobId": "4aa644a0eca5b539ec8703d62d4b957c74a54963",
                "name": "tokenizer_config.json",
                "safe": true,
                "indexed": false,
                "avScan": {
                    "virusFound": false,
                    "virusNames": null
                },
                "pickleImportScan": null
            }
        }
    }
]
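
To check a saved copy of this response for the changed shape, a small script can walk the entries and report which files carry their scan result under security.hf. This is a sketch under stated assumptions: summarize_tree is a hypothetical name, and SAMPLE is a trimmed copy of two entries from the response above (some fields omitted), not a live API call.

```python
import json
from typing import Any, Dict, List, Optional, Tuple


def summarize_tree(entries: List[Dict[str, Any]]) -> List[Tuple[str, bool, Optional[bool]]]:
    """Return (path, wrapped_in_hf, safe) for every file entry.

    Hypothetical helper for illustration only.
    """
    rows = []
    for entry in entries:
        if entry.get("type") != "file":
            continue  # directory entries in the response have no "security" field
        security = entry.get("security") or {}
        wrapped = isinstance(security.get("hf"), dict)
        scan = security["hf"] if wrapped else security
        rows.append((entry["path"], wrapped, scan.get("safe")))
    return rows


# Trimmed sample: one directory and one file from the response above.
SAMPLE = json.loads("""
[
    {"type": "directory", "oid": "8c72a9a6bbb748da3b33f372c24c4c89fb87c5d6",
     "size": 0, "path": "coreml"},
    {"type": "file", "oid": "84d8843072cbc300692c6bccff5b9c08c430498e",
     "size": 1048, "path": "config.json",
     "security": {"hf": {"safe": true,
                         "avScan": {"virusFound": false, "virusNames": null},
                         "pickleImportScan": null}}}
]
""")

for path, wrapped, safe in summarize_tree(SAMPLE):
    print(f"{path}: wrapped_in_hf={wrapped} safe={safe}")
```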

System info

- huggingface_hub version: 0.24.6
- Platform: Linux-5.15.0-119-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Token path ?: /home/amy/.cache/huggingface/token
- Has saved token ?: False
- Configured git credential helpers: 
- FastAI: N/A
- Tensorflow: N/A
- Torch: 2.3.0
- Jinja2: 3.1.4
- Graphviz: N/A
- keras: N/A
- Pydot: N/A
- Pillow: 9.0.1
- hf_transfer: N/A
- gradio: N/A
- tensorboard: N/A
- numpy: 1.24.1
- pydantic: N/A
- aiohttp: 3.10.3
- ENDPOINT: https://huggingface.co
- HF_HUB_CACHE: /home/amy/.cache/huggingface/hub
- HF_ASSETS_CACHE: /home/amy/.cache/huggingface/assets
- HF_TOKEN_PATH: /home/amy/.cache/huggingface/token
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False
- HF_HUB_ETAG_TIMEOUT: 10
- HF_HUB_DOWNLOAD_TIMEOUT: 10

binarycrayon commented 1 month ago

I just filed a similar bug report. It looks like they deployed a breaking change.

XciD commented 1 month ago

Rollback done.

Thanks for your report.