kubeflow / arena

A CLI for Kubeflow.
Apache License 2.0

feat: add backend param for triton serving #1039

Closed gujingit closed 6 months ago

gujingit commented 6 months ago

Feature: support the vLLM and TensorRT-LLM backends for Triton serving.

vllm backend

arena serve triton \
 --backend=vllm \
 --name=triton-vllm \
 --namespace=default \
 --cpu=6 \
 --memory=12Gi \
 --gpus=1 \
 --data=model-pvc:/mnt/pvc/ \
 --model-repository=/mnt/pvc/model_repository

trt-llm backend

arena serve triton \
 --backend=trt-llm \
 --name=triton-trt-llm \
 --namespace=default \
 --cpu=6 \
 --memory=12Gi \
 --gpus=1 \
 --data=model-pvc:/mnt/pvc/ \
 --model-repository=/mnt/pvc/model_repository
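The two invocations above differ only in the `--backend` value and the matching `--name` suffix. As a rough sketch of that pattern (not part of this PR), the helper below prints both commands without running them. `build_cmd` is a hypothetical name used only for illustration; every flag and value is copied from the examples above.

```shell
#!/bin/sh
# Hypothetical helper (not part of arena): print the `arena serve triton`
# command for a given backend, using only the flags shown in the examples.
build_cmd() {
  backend="$1"
  printf 'arena serve triton --backend=%s --name=triton-%s --namespace=default --cpu=6 --memory=12Gi --gpus=1 --data=model-pvc:/mnt/pvc/ --model-repository=/mnt/pvc/model_repository\n' \
    "$backend" "$backend"
}

# Dry run: print one command per backend added by this PR.
for backend in vllm trt-llm; do
  build_cmd "$backend"
done
```

Printing instead of executing makes it easy to review the commands first. Piping the output to `sh` would run them, assuming `arena` is installed and the `model-pvc` volume exists.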
gujingit commented 6 months ago

/assign @Syulin7

Syulin7 commented 6 months ago

@gujingit Thanks! Can you submit a new PR to update the documentation?

Syulin7 commented 6 months ago

/lgtm /approve

google-oss-prow[bot] commented 6 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gujingit, Syulin7

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/kubeflow/arena/blob/master/OWNERS)~~ [Syulin7]

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.