C++ support for directly loading HF models of the DeepSeek V2 Lite series

ztxz16 / fastllm

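A pure C++, cross-platform LLM acceleration library that can be called from Python. ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU. Supports GLM, LLaMA, and MOSS base models, and runs smoothly on mobile devices.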

Apache License 2.0

3.33k stars, 341 forks

Closed TylunasLi closed 4 months ago

TylunasLi commented 4 months ago

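Updated the Jinja template parser: it now supports the boolean literals "true" and "false", as well as "is defined" and "elif". With these changes the DeepSeek V2 series templates are supported, so DeepSeek V2 Lite series models can be loaded directly. (V2 236B is too large, so I could not test it.)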

Tested on the following models: deepseek-ai/DeepSeek-V2-Lite-Chat and deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct
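
To make the three template features named in the comment concrete, here is a minimal toy sketch of their semantics in C++. It is not fastllm's parser and does not reflect its internals; `Vars`, `Branch`, `EvalCondition`, and `RenderIfChain` are hypothetical names invented for illustration, and only boolean literals, "is defined", and an if/elif/else chain are modelled.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy model only: NOT fastllm's implementation. All names are hypothetical.
using Vars = std::map<std::string, std::string>;

// Evaluates a tiny subset of Jinja conditions:
//   "true" / "false"      boolean literals
//   "<name> is defined"   true when <name> is present in vars
bool EvalCondition(const std::string& cond, const Vars& vars) {
    if (cond == "true") return true;
    if (cond == "false") return false;
    const std::string suffix = " is defined";
    if (cond.size() > suffix.size() &&
        cond.compare(cond.size() - suffix.size(), suffix.size(), suffix) == 0) {
        const std::string name = cond.substr(0, cond.size() - suffix.size());
        return vars.count(name) > 0;
    }
    return false;  // anything else is out of scope for this sketch
}

// One {% if cond %} or {% elif cond %} arm together with its body text.
struct Branch {
    std::string condition;
    std::string body;
};

// Models {% if %} ... {% elif %} ... {% else %} ... {% endif %}:
// returns the body of the first arm whose condition holds,
// or elseBody when no arm matches.
std::string RenderIfChain(const std::vector<Branch>& branches,
                          const std::string& elseBody, const Vars& vars) {
    for (const Branch& b : branches) {
        if (EvalCondition(b.condition, vars)) return b.body;
    }
    return elseBody;
}
```

For instance, the chain `{{"add_generation_prompt is defined", "A"}, {"true", "B"}}` with else-body `"C"` corresponds to the template `{% if add_generation_prompt is defined %}A{% elif true %}B{% else %}C{% endif %}`: the first arm is taken when the variable is set, and the `elif true` arm is taken otherwise.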