
OrionStarAI / Orion

Orion-14B is a family of models that includes a multilingual foundation LLM with 14 billion parameters and a series of derived models: a chat model, a long-context model, a quantized model, a RAG fine-tuned model, and an Agent fine-tuned model.

Apache License 2.0

785 stars, 57 forks

vLLM support? #16

Open lhl opened 9 months ago

lhl commented 9 months ago

The docs mention that you used vLLM for inferencing, but it looks like Orion support hasn't been upstreamed yet: https://github.com/vllm-project/vllm/tree/main/vllm/model_executor/models

Can you share the model file or do you have an ETA for upstreaming the code? HF transformers inferencing is slow enough to make Orion pretty unusable even for running evals.

ZeroYuJie commented 9 months ago

I've been using the Orion branch from https://github.com/dachengai/vllm and it's running, but there might be issues with outputs in different languages

shuiqingliu commented 9 months ago

> I've been using the Orion branch from https://github.com/dachengai/vllm and it's running, but there might be issues with outputs in different languages

Yep, I am trying to translate from Chinese to English, but the output still contains Chinese characters. 😭
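For anyone who wants to measure how often this happens across a batch of translations, here is a minimal sketch of a leftover-character check. It is not part of Orion or vLLM. The helper names `contains_chinese` and `chinese_ratio` are hypothetical, and the code assumes that "Chinese characters" means the CJK Unified Ideographs block plus Extension A. These ideographs are shared with Japanese kanji, so the check cannot tell the two apart.

```python
import re

# Assumption: "Chinese characters" = CJK Unified Ideographs (U+4E00-U+9FFF)
# plus Extension A (U+3400-U+4DBF). Rarer extension blocks are not covered.
_CJK = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff]")


def contains_chinese(text: str) -> bool:
    """Return True if the text contains at least one CJK ideograph."""
    return _CJK.search(text) is not None


def chinese_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters that are CJK ideographs."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(1 for c in chars if _CJK.match(c)) / len(chars)


if __name__ == "__main__":
    # Hypothetical model outputs, used only to illustrate the check.
    outputs = [
        "The weather is nice today.",
        "The weather is 很好 today.",
    ]
    for out in outputs:
        print(contains_chinese(out), round(chinese_ratio(out), 3), out)
```

Running something like this over the outputs from the fork and from HF transformers on the same prompts would show whether the mixed-language output is specific to the vLLM branch or comes from the model itself.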