OrionStarAI / Orion

Orion-14B is a family of models comprising a 14B-parameter multilingual foundation LLM and a series of derived models: a chat model, a long-context model, a quantized model, a RAG fine-tuned model, and an Agent fine-tuned model.
Apache License 2.0

What is the technique used to extend the context size to 200,000 tokens? #2

Open aburkov opened 9 months ago

No description provided.

shihanmax commented 9 months ago

+1. What was the maximum context length used during pre-training/SFT, and what extrapolation method was used?

chenxingphh commented 9 months ago


Thanks for your attention. We used a longer context for pre-training as well as some existing extrapolation methods.
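
For readers wondering what such an extrapolation method might look like: one widely used option is linear RoPE position interpolation (Chen et al., 2023), which compresses position indices so that sequences longer than the pre-training window map back into the position range the model saw during training. The sketch below is illustrative only; the maintainers do not say which method Orion-14B actually uses, and the function name, scale factor, and pre-training window length here are hypothetical.

```python
# Minimal sketch of linear RoPE position interpolation.
# NOTE: this is NOT Orion-14B's confirmed method; it illustrates one common
# family of context-extension techniques. All names/values are hypothetical.
import torch


def rope_angles(head_dim: int, max_positions: int, base: float = 10000.0,
                scale: float = 1.0) -> torch.Tensor:
    """Return the RoPE angle table of shape (max_positions, head_dim // 2).

    With scale > 1, position indices are divided by `scale`, so a sequence
    `scale` times longer than the pre-training window is squeezed back into
    the position range seen during training.
    """
    # Standard RoPE inverse frequencies for each pair of dimensions.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    # Linear position interpolation: compress positions by 1/scale.
    positions = torch.arange(max_positions).float() / scale
    return torch.outer(positions, inv_freq)


# Hypothetical example: a model pre-trained with a 4,096-token window,
# extended to 200,000 tokens, needs scale ~= 200_000 / 4_096 ~= 48.8.
angles = rope_angles(head_dim=128, max_positions=200_000, scale=200_000 / 4_096)
print(angles.shape)  # torch.Size([200000, 64])
```

Hugging Face transformers exposes a similar knob for some architectures via the `rope_scaling` field in the model config (e.g., `{"type": "linear", "factor": 8.0}`), though whether Orion-14B-LongChat relies on this mechanism is not stated in this thread.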