awslabs / data-on-eks

DoEKS is a tool to build, deploy and scale Data & ML Platforms on Amazon EKS
https://awslabs.github.io/data-on-eks/
Apache License 2.0
617 stars 210 forks

fix: Fix Triton server model loading error #584

Closed ratnopamc closed 2 months ago

ratnopamc commented 2 months ago

What does this PR do?

One of the recent PRs introduced an error (see below) that causes model loading to fail when the Triton server is deployed.

Error Log:

I0715 04:06:56.609115 1 server.cc:674] 
+-----------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Model     | Version | Status                                                                                                                                                                                                        |
+-----------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| llama2    | 1       | READY                                                                                                                                                                                                         |
| llama3    | 1       | UNAVAILABLE: Invalid argument: instance group llama3_0 of model llama3 specifies invalid or unsupported gpu id 2. GPUs with at least the minimum required CUDA compute compatibility of 6.000000 are: 0       |
| mistral7b | 1       | UNAVAILABLE: Invalid argument: instance group mistral7b_0 of model mistral7b specifies invalid or unsupported gpu id 1. GPUs with at least the minimum required CUDA compute compatibility of 6.000000 are: 0 |
+-----------+---------

This PR fixes the above issue.
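
Both UNAVAILABLE rows in the log share one message shape: an instance group requests a GPU id (2 for llama3, 1 for mistral7b) that is not among the GPUs Triton reports as usable (only 0). The following is a minimal Python sketch for pulling that mismatch out of a server log. It assumes the message wording stays exactly as quoted above and that multiple usable GPUs would be listed as comma-separated ids. The function name and regular expression are illustrative and are not part of this repository or of Triton.

```python
import re

# Matches the Triton status message quoted in the error log above.
# Assumption: the wording is stable and usable GPU ids are digits
# separated by commas and/or spaces.
INVALID_GPU = re.compile(
    r"instance group (?P<group>\S+) of model (?P<model>\S+) specifies "
    r"invalid or unsupported gpu id (?P<gpu>\d+)\. .* are: (?P<available>[\d, ]*)"
)


def find_invalid_gpu_assignments(log_text):
    """Return one dict per model whose instance group names an unusable GPU."""
    findings = []
    for line in log_text.splitlines():
        match = INVALID_GPU.search(line)
        if match is None:
            continue  # READY rows and unrelated log lines are skipped
        findings.append(
            {
                "model": match.group("model"),
                "instance_group": match.group("group"),
                "requested_gpu": int(match.group("gpu")),
                "available_gpus": [
                    int(gpu_id)
                    for gpu_id in re.findall(r"\d+", match.group("available"))
                ],
            }
        )
    return findings


if __name__ == "__main__":
    import sys

    # Usage: pipe the Triton server log into this script.
    for finding in find_invalid_gpu_assignments(sys.stdin.read()):
        print(finding)
```

Run against the log above, this would flag llama3 and mistral7b and skip llama2, which makes it easy to see which instance groups name a GPU that the node does not expose.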

🛑 Please open an issue first to discuss any significant work and flesh out details/direction - we would hate for your time to be wasted. Consult the CONTRIBUTING guide for submitting pull-requests.

Motivation

More

For Moderators

Additional Notes