huggingface / Google-Cloud-Containers

Hugging Face Deep Learning Containers (DLCs) for Google Cloud
https://hf.co/docs/google-cloud
Apache License 2.0
131 stars 18 forks

The API `https://huggingface.co/api/integrations/tgi/v1/provider/gcp/recommend` does not include new models such as datagemma and shieldgemma #101

Closed weigary closed 2 months ago

weigary commented 2 months ago

Hi,

We currently make an HTTP request to https://huggingface.co/api/integrations/tgi/v1/provider/gcp/recommend?model_id=google/shieldgemma-27b to get the recommended deployment config. However, it returns {"error":"No recommendation found"} for a number of new models. Could you refresh the model list?

Thanks!
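For reference, the lookup described above can be sketched roughly as follows. This is a minimal sketch, not the caller's actual code: the function names are hypothetical, and treating any body with an "error" key (or any non-JSON body) as "no recommendation" is an assumption based only on the response quoted above. The HTTP status that accompanies the error is not known, so the sketch reads the body in either case.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

# Endpoint taken from the issue; everything else here is illustrative.
RECOMMEND_URL = "https://huggingface.co/api/integrations/tgi/v1/provider/gcp/recommend"


def build_recommend_url(model_id: str) -> str:
    """Build the query URL; safe="/" keeps the slash in "org/model" readable."""
    query = urllib.parse.urlencode({"model_id": model_id}, safe="/")
    return f"{RECOMMEND_URL}?{query}"


def parse_recommendation(body: str):
    """Return the recommendation dict, or None if the API reported an error."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Assumption: a non-JSON body means no usable recommendation.
        return None
    if not isinstance(payload, dict) or "error" in payload:
        return None
    return payload


def fetch_recommendation(model_id: str, timeout: float = 10.0):
    """Query the endpoint and return the parsed recommendation, or None."""
    try:
        with urllib.request.urlopen(build_recommend_url(model_id), timeout=timeout) as resp:
            body = resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # The error payload may arrive with a non-2xx status; read it anyway.
        body = err.read().decode("utf-8")
    return parse_recommendation(body)
```

Calling `fetch_recommendation("google/shieldgemma-27b")` would return `None` while the model is missing from the list, which lets the caller fall back to a manually chosen config.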

philschmid commented 2 months ago

Hello,

Thank you for opening this issue. We'll take a look.

philschmid commented 2 months ago

We were not using the latest version and are updating it. Here is the config for ShieldGemma:

{
  "model_id": "google/shieldgemma-27b",
  "instance": "a2-ultragpu-1g",
  "configuration": {
    "model_id": "google/shieldgemma-27b",
    "max_batch_prefill_tokens": 8192,
    "max_input_length": 6000,
    "max_total_tokens": 6144,
    "num_shard": 1,
    "quantize": null,
    "estimated_memory_in_gigabytes": 66.96
  }
}
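Until the endpoint is refreshed, a config like the one above could be turned into container settings by hand. The sketch below assumes that the keys under "configuration" map onto TGI launcher environment variables by uppercasing them (for example `max_total_tokens` becomes `MAX_TOTAL_TOKENS`), and that `estimated_memory_in_gigabytes` is informational only and is not passed to the launcher. Both points are assumptions; the helper name `to_env` is hypothetical.

```python
import json

# The recommendation posted above, copied verbatim.
RESPONSE = """
{
  "model_id": "google/shieldgemma-27b",
  "instance": "a2-ultragpu-1g",
  "configuration": {
    "model_id": "google/shieldgemma-27b",
    "max_batch_prefill_tokens": 8192,
    "max_input_length": 6000,
    "max_total_tokens": 6144,
    "num_shard": 1,
    "quantize": null,
    "estimated_memory_in_gigabytes": 66.96
  }
}
"""

# Assumption: this key describes the model footprint and is not a launcher option.
NON_LAUNCHER_KEYS = {"estimated_memory_in_gigabytes"}


def to_env(configuration: dict) -> dict:
    """Map config keys to uppercase env var names, skipping nulls and extras."""
    env = {}
    for key, value in configuration.items():
        if key in NON_LAUNCHER_KEYS or value is None:
            continue  # e.g. "quantize": null means no quantization is requested
        env[key.upper()] = str(value)
    return env


recommendation = json.loads(RESPONSE)
config = recommendation["configuration"]
env = to_env(config)

# Tokens left for generation once the prompt uses its full allowance.
generation_budget = config["max_total_tokens"] - config["max_input_length"]

for name, value in env.items():
    print(f"{name}={value}")
print(f"instance: {recommendation['instance']}, generation budget: {generation_budget} tokens")
```

Note that with these values a prompt of the full 6000 input tokens leaves only 6144 - 6000 = 144 tokens for the response.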