This is an inference framework for the RWKV large language model, implemented purely in native PyTorch. The official native implementation is overly complex and lacks extensibility. Let's join the flexible PyTorch ecosystem and open-source it together!
## Features
## Usage (batch inference, dev branch)

1. Clone the dev branch: `git clone -b dev https://github.com/yuunnn-w/RWKV_Pytorch.git`
2. Enter the repository directory with `cd RWKV_Pytorch`, then run `pip install -r requirements.txt` to install the dependencies.
3. Download the model weights and place them in the `weight` folder.
4. Set the `MODEL_NAME` parameter in the `main.py` file.
5. Run `python main.py` to see the batch inference results.
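The batch inference in `main.py` ultimately has to turn each sequence's logits into a token id. For illustration, here is a minimal temperature + top-p (nucleus) sampling step in pure Python. This is a generic sketch of a common sampling strategy, not necessarily the sampler `main.py` actually implements.

```python
import math
import random

def sample_top_p(logits, temperature=1.0, top_p=0.9, rng=random):
    """Sample a token id from raw logits using temperature scaling and
    nucleus (top-p) filtering. Generic illustration, not the repo's code."""
    # Softmax with temperature (shifted by the max for numerical stability).
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the smallest set of tokens whose cumulative probability >= top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Renormalize over the kept tokens and draw one.
    norm = sum(probs[i] for i in kept)
    r = rng.random() * norm
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

For deterministic (greedy) decoding you would instead simply take the arg-max of the logits.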
## Training (pipeline branch)

1. Clone the pipeline branch: `git clone -b pipeline https://github.com/yuunnn-w/RWKV_Pytorch.git`
2. Enter the repository directory with `cd RWKV_Pytorch`, then run `pip install -r requirements.txt` to install the dependencies.
3. Download the model weights and place them in the `weight` folder.
4. Set the `MODEL_NAME` parameter in the `train/params.json` file.
5. Run `torchrun --nproc-per-node 3 train/train-parallel.py` to start training.

## Usage
1. Clone the repository: `git clone https://github.com/yuunnn-w/RWKV_Pytorch.git`
2. Enter the repository directory with `cd RWKV_Pytorch`, then install the dependencies: `pip install -r requirements.txt`
3. Download the model weights and place them in the `weight` directory.
4. Set the `MODEL_NAME` parameter in the `main.py` file.
5. Run `python main.py` to see the batch inference results.
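Conceptually, batch inference runs the model one step per generated token while carrying a recurrent state for every sequence in the batch. The loop below sketches that with greedy (arg-max) selection; the `forward(tokens, state)` interface is a hypothetical stand-in used for illustration, not the repository's actual API.

```python
def greedy_decode_batch(forward, prompts, steps):
    """Greedily decode `steps` extra tokens for each sequence in the batch.

    `forward(tokens, state)` is a hypothetical one-step model call: it takes
    the current token of every batch row plus the recurrent state and
    returns (logits_per_row, new_state).
    """
    state = None
    seqs = [list(p) for p in prompts]
    tokens = [s[-1] for s in seqs]  # feed the last prompt token of each row
    for _ in range(steps):
        logits, state = forward(tokens, state)
        # Greedy choice: pick the arg-max token for each batch row.
        tokens = [row.index(max(row)) for row in logits]
        for seq, tok in zip(seqs, tokens):
            seq.append(tok)
    return seqs
```

Swapping the arg-max line for a sampling call (temperature, top-p, etc.) turns this into stochastic decoding without changing the surrounding loop.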
## ONNX Export Method
1. Modify the parameters in the `onnx_export.py` file to specify the model you want to export.
2. Run `python onnx_export.py` to export the model to the `./onnx` directory.
3. Run `mkdir ONNX_Simplified` to create a directory for the simplified model.
4. Run `python simplify_large_onnx.py -m onnx/{model name}.onnx -o ONNX_Simplified/{model name}.onnx` to simplify the model. The simplified model will be stored in the `ONNX_Simplified` directory.
5. Modify the model path parameter in the `onnx_infer.py` file, then run `python onnx_infer.py` to perform inference on the ONNX-format model.
## Local Deployment Experience
1. Modify the model configuration parameters in the `openai_api.py` file.
2. Run `python openai_api.py` to start the backend.
3. Use `http://127.0.0.1:8848` as the `API_URL` parameter to experience it.

## Known Issues
Please note that this framework currently supports only RWKV v6 models, specifically version x060.
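For background, what makes a pure-PyTorch (and ONNX-exportable) implementation tractable is that RWKV inference is recurrent: each layer carries a small fixed-size state instead of a growing attention cache. The toy function below illustrates that idea with a single per-channel decaying accumulator; the real v6 time-mixing kernel uses matrix-valued state, data-dependent decay, and gating, so treat this strictly as a conceptual sketch.

```python
def decayed_state_step(state, k, v, w, u):
    """One recurrent step of a per-channel decaying accumulator (toy model).

    state: running accumulator per channel
    k, v : current token's key/value activations
    w    : per-channel decay factor in (0, 1)
    u    : per-channel "bonus" weight applied to the current token
    Returns (output, new_state). Only `new_state` is carried forward, so
    memory stays O(channels) no matter how long the sequence is.
    """
    # Output mixes the accumulated past with the current token's bonus term.
    out = [s + ui * ki * vi for s, ui, ki, vi in zip(state, u, k, v)]
    # State decays by w and absorbs the current token's k*v contribution.
    new_state = [s * wi + ki * vi for s, wi, ki, vi in zip(state, w, k, v)]
    return out, new_state
```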
In the future, we plan to adapt this project for the AI Pro development board launched by Xunlong Orange Pi, enabling inference of the domestic large language model RWKV on the Ascend ecosystem.
Additionally, after testing, the ONNX model exported and optimized from the v6 1.6B model contains the following operators:

| Operator | Count |
| --- | --- |
| Gather | 145 |
| Squeeze | 121 |
| ReduceMean | 148 |
| Sub | 122 |
| Mul | 484 |
| Add | 675 |
| Sqrt | 74 |
| Div | 74 |
| Shape | 240 |
| Expand | 240 |
| Range | 72 |
| Reshape | 384 |
| Equal | 72 |
| Where | 72 |
| Unsqueeze | 192 |
| Concat | 192 |
| ScatterND | 72 |
| MatMul | 337 |
| Tanh | 48 |
| Split | 24 |
| Exp | 48 |
| Neg | 24 |
| Sigmoid | 48 |
| Slice | 24 |
| Flatten | 24 |
| Relu | 24 |

Repository used to optimize the model: onnxsim_large_model
## Contributors

Yuunnn_w | WuTianyi | Zhiyuan Li | Null
Thanks to everyone for their contributions! We warmly invite everyone to contribute to the project by submitting PRs and raising Issues. Your input and contributions are highly valued and play a vital role in improving the project for the entire community. Let's collaborate and make this project even better together!