huggingface / Google-Cloud-Containers

Hugging Face Deep Learning Containers (DLCs) for Google Cloud
https://hf.co/docs/google-cloud
Apache License 2.0

small updates to the TGI Cloud Run readme #105

Closed saraford closed 1 month ago

saraford commented 1 month ago

A couple of additional suggestions:

  1. When using Cloud Workstations or Cloud Shell, the (bash) terminal doesn't substitute the contents of $ACCESS_TOKEN before sending the cURL request. The result is always Your client does not have permission to the requested URL <code>/v1/chat/completions</code>. This can be difficult to debug, since the access token itself is valid. I believe using double quotes here works fine for other shells too.
  2. The minimum requirement for GPUs is 4 CPUs and 16 GiB of memory. I've updated the gcloud command and clarified the description to call out that this is the minimum requirement.
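
To illustrate the first point, here is a minimal sketch of the quoting behavior in bash, assuming the permission error comes from the header being wrapped in single quotes. The token value and the variable names `single_quoted` and `double_quoted` are placeholders for illustration only, and no request is sent.

```shell
#!/usr/bin/env bash
# Placeholder value; in practice this would hold a real access token.
ACCESS_TOKEN="example-token"

# Single quotes: bash keeps the text literal, so the header would carry
# the string "$ACCESS_TOKEN" instead of the token itself.
single_quoted='Authorization: Bearer $ACCESS_TOKEN'

# Double quotes: bash expands the variable before cURL ever sees it.
double_quoted="Authorization: Bearer $ACCESS_TOKEN"

printf '%s\n' "$single_quoted"
printf '%s\n' "$double_quoted"
```

Passing the double-quoted form to cURL's `-H` option sends the actual token. With the single-quoted form the server receives the literal variable name, which would explain a permission error even when the token is valid.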