Closed purnasanyal closed 3 months ago
As the error in the log says, you have to accept the terms and conditions on Hugging Face for these models.
I0806 21:52:45.863589 1 pb_stub.cc:366] "Failed to initialize Python stub: OSError: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct.
403 Client Error. (Request ID: Root=1-66b29b2d-360bea0d76518b604809e08c;c38d10d6-674e-466b-9a56-a7b3f7c07a4e)

Cannot access gated repo for url https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/resolve/main/config.json.
Access to model meta-llama/Meta-Llama-3-8B-Instruct is restricted and you are not in the authorized list. Visit https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct to ask for access.
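For context, a gated Hugging Face repo returns 403 until the account behind the request's token has accepted the model's license terms on the model page. A minimal sketch of what an authorized download request needs; the helper name and the token value are placeholders, not part of the actual deployment:

```python
# Sketch: downloads from a gated Hugging Face repo must carry a bearer
# token for an account that has accepted the model's license terms.
# The helper name and the token value below are placeholders.

def hf_auth_header(token: str) -> dict:
    """Build the Authorization header Hugging Face expects."""
    return {"Authorization": f"Bearer {token}"}

# The URL from the error log; without a token from an authorized
# account, the server answers 403 even though the file exists.
CONFIG_URL = ("https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct"
              "/resolve/main/config.json")

headers = hf_auth_header("hf_xxx")  # placeholder token, not a real one
```

In a Kubernetes deployment like this one, the token typically reaches the pod as an environment variable (commonly `HF_TOKEN`) sourced from a secret, so the stub can authenticate when it fetches the model at startup.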
Description
Hi, I am creating a demo for "Deploying Multiple Large Language Models with NVIDIA Triton Server and vLLM" from my Isengard account using Cloud9. However, the nvidia-triton-server-triton-inference-server-54546fdb86-wh7tb pod is crashing. The pod log is attached.
I believe I have access on Hugging Face to the Llama and Mistral models.
Terraform v1.9.3
pod.log