huggingface / Google-Cloud-Containers

Hugging Face Deep Learning Containers (DLCs) for Google Cloud
https://hf.co/docs/google-cloud
Apache License 2.0
127 stars, 16 forks

Add `examples/cloud-run` on preview #82

alvarobartt closed this 1 month ago

alvarobartt commented 2 months ago

Description

This PR adds an example for Google Cloud Run, which recently gained GPU support for NVIDIA L4 (see [announcement]()); that GPU support is still in preview.

The example shows how to set up Cloud Run and how to deploy Meta Llama 3.1 8B AWQ (INT4) from the Hugging Face Hub using the latest Hugging Face DLC for Text Generation Inference (TGI).
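For reviewers who want a rough picture of what such a deployment might look like, here is a minimal sketch that assembles a `gcloud beta run deploy` invocation in Python. It is not the command from the example itself. The service name, the `IMAGE_URI` placeholder, the region, the CPU/memory sizes, and the model ID are all assumptions for illustration; `MODEL_ID` and `QUANTIZE` are the environment variables the TGI launcher reads, and port 8080 is assumed to match the container's listening port.

```python
import shlex


def build_deploy_command(service, image, model_id, region="us-central1"):
    """Return an argv list for a Cloud Run GPU deployment of TGI (sketch only)."""
    return [
        "gcloud", "beta", "run", "deploy", service,
        f"--image={image}",          # TGI DLC image URI (placeholder, not filled in here)
        f"--region={region}",        # assumed region; L4 availability varies by region
        "--port=8080",               # assumed container port
        "--cpu=8",                   # assumed sizing
        "--memory=32Gi",             # assumed sizing
        "--no-cpu-throttling",       # keep CPU always allocated for the GPU service
        "--gpu=1",
        "--gpu-type=nvidia-l4",
        "--max-instances=1",         # assumed; keeps the sketch within a small GPU quota
        f"--set-env-vars=MODEL_ID={model_id},QUANTIZE=awq",
    ]


cmd = build_deploy_command(
    service="tgi-llama-example",     # hypothetical service name
    image="IMAGE_URI",               # placeholder: substitute the TGI DLC image URI
    # Assumed Hub repository for the AWQ INT4 checkpoint; check the example for the actual ID.
    model_id="hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4",
)

# Print a copy-pasteable shell command instead of executing it.
print(shlex.join(cmd))
```

Building the command as a list and printing it with `shlex.join` keeps the sketch side-effect free: nothing is deployed until the printed command is reviewed and run manually.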

Closes #75