huggingface / Google-Cloud-Containers

Including Hugging Face Deep learning Containers for Google Cloud
Apache License 2.0
112 stars 10 forks source link

Port `deploy-llama-3-1-405-on-vertex-ai/` example #87

Closed alvarobartt closed 6 days ago

alvarobartt commented 6 days ago

Description

This PR ports the example on how to deploy Meta Llama 3.1 405B Instruct FP8 developed for this blog post, and previously hosted in alvarobartt/meta-llama-3-1-on-vertex-ai.

The example now follows the same formatting as the rest of the Vertex AI example Jupyter Notebooks within this repository, keeping the section for the quota increase request, adding some disclaimer messages when needed (deployment time, gated access, or MESSAGES_API_ENABLED upcoming support).