awslabs / data-on-eks

DoEKS is a tool to build, deploy and scale Data & ML Platforms on Amazon EKS
https://awslabs.github.io/data-on-eks/
Apache License 2.0
617 stars 210 forks

feat: RayServe with vLLM using AWS Neuron on Amazon EKS #607

ratnopamc closed 1 month ago

ratnopamc commented 1 month ago

What does this PR do?

Adds the capability to deploy LLMs for inference on AWS Inferentia with Ray and vLLM.

Motivation

#591

vara-bonthu commented 1 month ago

Next Steps once this PR is merged

1/ Add the HF token to the deployment YAML and the ConfigMap serving script to handle gated models
2/ Add a website doc for the deployment
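
For step 1/, a minimal sketch of how the serving script might pick up the token, assuming the deployment YAML injects it as an environment variable (for example from a Kubernetes Secret). The helper name `resolve_hf_token` is hypothetical and not part of this PR; `HUGGING_FACE_HUB_TOKEN` and `HF_TOKEN` are the variable names the Hugging Face libraries read, but which one the deployment will actually set is an assumption here.

```python
import os
from typing import Mapping, Optional

# Assumed variable names; the PR does not say which one the deployment will set.
TOKEN_ENV_VARS = ("HUGGING_FACE_HUB_TOKEN", "HF_TOKEN")


def resolve_hf_token(
    env: Optional[Mapping[str, str]] = None,
    required: bool = False,
) -> Optional[str]:
    """Return the first non-empty token found in the environment.

    Returns None when no token is set, unless `required` is True, in which
    case a RuntimeError is raised so a gated-model deployment fails early
    with a clear message instead of failing later during model download.
    """
    env = os.environ if env is None else env
    for name in TOKEN_ENV_VARS:
        token = env.get(name, "").strip()
        if token:
            return token
    if required:
        raise RuntimeError(
            "No Hugging Face token found; set one of: " + ", ".join(TOKEN_ENV_VARS)
        )
    return None
```

In the serving script this would be called once at startup, with `required=True` only when the configured model is gated, so public models keep working without a token.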