huggingface / huggingface_hub

The official Python client for the Huggingface Hub.
https://huggingface.co/docs/huggingface_hub
Apache License 2.0

Documentation of API Client stale #2348

Closed · LuisBlanche closed 1 week ago

LuisBlanche commented 1 week ago

Describe the bug

When using HfApi.create_inference_endpoint, the docstring example gives this instance configuration: https://github.com/huggingface/huggingface_hub/blob/4c7aa33bac0b4fb59796e76fcc9cd7f5ff0fd426/src/huggingface_hub/hf_api.py#L7214C1-L7215C48

It seems, though, that the API has changed and that the correct arguments for the same machine are now:

instance_size="x1"
instance_type="nvidia-a10g"
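For reference, a minimal sketch of the same call with the updated arguments (assuming the remaining parameters from the reproduction below are still valid):

```python
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> endpoint = api.create_inference_endpoint(
...     "aws-zephyr-7b-beta-0486",
...     repository="HuggingFaceH4/zephyr-7b-beta",
...     framework="pytorch",
...     task="text-generation",
...     accelerator="gpu",
...     vendor="aws",
...     region="us-east-1",
...     type="protected",
...     instance_size="x1",  # previously documented as "medium"
...     instance_type="nvidia-a10g",  # previously documented as "g5.2xlarge"
... )
```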

Reproduction

```python
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> api.create_inference_endpoint(
...     "aws-zephyr-7b-beta-0486",
...     repository="HuggingFaceH4/zephyr-7b-beta",
...     framework="pytorch",
...     task="text-generation",
...     accelerator="gpu",
...     vendor="aws",
...     region="us-east-1",
...     type="protected",
...     instance_size="medium",
...     instance_type="g5.2xlarge",
...     custom_image={
...         "health_route": "/health",
...         "env": {
...             "MAX_BATCH_PREFILL_TOKENS": "2048",
...             "MAX_INPUT_LENGTH": "1024",
...             "MAX_TOTAL_TOKENS": "1512",
...             "MODEL_ID": "/repository"
...         },
...         "url": "ghcr.io/huggingface/text-generation-inference:1.1.0",
...     },
... )
```

Logs

self.endpoint = self.create_or_update_endpoint(
  File "/Users/luis.blanche/Documents/Code/ds-uc2/uc2/common/huggingface/endpoint.py", line 173, in create_or_update_endpoint
    endpoint = create_inference_endpoint(
  File "/Users/luis.blanche/.pyenv/versions/3.10.13/envs/product/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 7264, in create_inference_endpoint
    hf_raise_for_status(response)
  File "/Users/luis.blanche/.pyenv/versions/3.10.13/envs/product/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 358, in hf_raise_for_status
    raise BadRequestError(message, response=response) from e
huggingface_hub.utils._errors.BadRequestError:  (Request ID: aSgN8f)

Bad request:
Bad Request: Instance compute 'Gpu' - 'g5.2xlarge' - 'medium' in 'aws' - 'us-east-1' not found

System info

- huggingface_hub version: 0.23.1
- Platform: macOS-14.5-arm64-arm-64bit
- Python version: 3.10.13
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Token path ?: /Users/luis.blanche/.cache/huggingface/token
- Has saved token ?: True
- Who am I ?: LuisBlancheMirakl
- Configured git credential helpers: osxkeychain
- FastAI: N/A
- Tensorflow: 2.13.0
- Torch: 2.1.1
- Jinja2: 3.1.2
- Graphviz: N/A
- keras: 2.13.1
- Pydot: N/A
- Pillow: 10.1.0
- hf_transfer: N/A
- gradio: N/A
- tensorboard: N/A
- numpy: 1.26.1
- pydantic: 2.6.0
- aiohttp: 3.8.5
- ENDPOINT: https://huggingface.co
- HF_HUB_CACHE: /Users/luis.blanche/.cache/huggingface/hub
- HF_ASSETS_CACHE: /Users/luis.blanche/.cache/huggingface/assets
- HF_TOKEN_PATH: /Users/luis.blanche/.cache/huggingface/token
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False
- HF_HUB_ETAG_TIMEOUT: 10
- HF_HUB_DOWNLOAD_TIMEOUT: 10
Wauplin commented 1 week ago

Hi @LuisBlanche, thanks for reporting! The documentation has been fixed in https://github.com/huggingface/huggingface_hub/pull/2282 and the corrected version can be found in the main documentation: https://huggingface.co/docs/huggingface_hub/main/en/package_reference/hf_api#huggingface_hub.HfApi.create_inference_endpoint. It will be shipped in the next release :)