Closed: sorenmat closed this issue 1 month ago
It looks like you're encountering a FailedPrecondition error while trying to deploy the LLaMA model with vLLM. This typically indicates that the model server failed to start up or run properly. A few things to check:
1. Check model logs: As the error message suggests, inspect the model server logs for specific error messages that indicate what went wrong.
2. Model compatibility: Ensure the model you're deploying is compatible with your environment and GPU (an NVIDIA L4 in this case), and check whether LLaMA 3 requires any specific dependencies or configuration.
3. Memory and resource allocation: Verify that the g2-standard-12 instance has enough memory and resources allocated; insufficient resources can cause the server to fail during initialization.
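For the memory point, a quick back-of-the-envelope check can tell you whether the model weights even fit on the GPU before the KV cache is accounted for. The sketch below assumes an 8B-parameter LLaMA 3 served in bf16 on a single L4 (24 GB) and uses vLLM's default `gpu_memory_utilization` of 0.9; adjust the numbers for your actual model and dtype.

```python
# Rough VRAM budget for serving an 8B model in bf16 on one NVIDIA L4.
# All figures are approximations for a sanity check, not exact vLLM accounting.

def weight_gb(n_params_billions: float, bytes_per_param: int = 2) -> float:
    """Approximate weight memory in GB (2 bytes/param for fp16/bf16)."""
    return n_params_billions * bytes_per_param

L4_VRAM_GB = 24.0
GPU_MEMORY_UTILIZATION = 0.9  # vLLM default fraction of VRAM it will claim

weights = weight_gb(8)                          # ~16 GB for an 8B model
budget = L4_VRAM_GB * GPU_MEMORY_UTILIZATION    # ~21.6 GB usable by vLLM
kv_cache = budget - weights                     # what remains for KV cache

print(f"weights ~{weights:.1f} GB, budget ~{budget:.1f} GB, "
      f"KV cache ~{kv_cache:.1f} GB")
```

If the remaining KV-cache headroom is only a few GB, long contexts or concurrent requests can still exhaust memory during initialization even though the weights fit, which matches the failure mode described above.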
Thanks for reporting the issue. Please see the suggestions above.
Expected Behavior
To be able to deploy the example notebook without modifications
Actual Behavior
All the links to logs in the output show an empty log.
Steps to Reproduce the Problem
Specifications