OrionStarAI / Orion

Orion-14B is a family of models that includes a multilingual foundation LLM with 14 billion parameters and a series of derived models: a chat model, a long-context model, a quantized model, a RAG fine-tuned model, and an Agent fine-tuned model.
Apache License 2.0
785 stars 57 forks

Question about GPU memory size #37

Open caramel678 opened 9 months ago

caramel678 commented 9 months ago

How much GPU memory does Orion-14B-Chat-RAG need to run inference?

chenxingphh commented 9 months ago

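Thanks for your interest. Loading Orion-14B-Chat-RAG takes roughly 28 GB of GPU memory. Inference needs additional memory on top of that, and the amount depends on the sequence length. We recommend using vLLM for inference.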
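
For a rough sanity check on these numbers: 14 billion parameters stored at 2 bytes each (fp16/bf16) come to about 28 GB for the weights alone, and the length-dependent part is dominated by the KV cache. The sketch below is a back-of-the-envelope estimate, not an official formula from the Orion team. The function names are made up for illustration, and the architecture values passed to `kv_cache_memory_gb` (40 layers, 40 KV heads, head dimension 128) are assumptions that should be checked against the model's `config.json`. Activations and framework overhead are not counted.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def kv_cache_memory_gb(
    seq_len: int,
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    batch_size: int = 1,
    bytes_per_value: int = 2,
) -> float:
    """Approximate KV-cache size in GB for a decoder-only transformer."""
    # Factor 2: each layer caches one key tensor and one value tensor.
    total_bytes = (
        2 * num_layers * num_kv_heads * head_dim
        * seq_len * batch_size * bytes_per_value
    )
    return total_bytes / 1e9


# 14e9 parameters in fp16/bf16 (2 bytes per parameter).
weights_gb = weight_memory_gb(14e9)

# ASSUMED architecture values, for illustration only; read the real ones
# from the model's config.json before relying on the result.
kv_gb = kv_cache_memory_gb(seq_len=4096, num_layers=40, num_kv_heads=40, head_dim=128)

print(f"weights:  {weights_gb:.1f} GB")
print(f"KV cache: {kv_gb:.2f} GB for one 4096-token sequence")
print(f"total:    {weights_gb + kv_gb:.1f} GB (excluding activations and overhead)")
```

Under these assumptions the KV cache grows linearly with both sequence length and batch size, which is why the inference requirement can only be stated per length.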