ZTMIDGO opened this issue 1 year ago
You can save a quantized model as such:
cd path/to/model/folder/
python3 -m rwkvstic --pq
You will be prompted to select a model.
After conversion, you will have a model.pqth file in the folder. You can load this model by calling RWKV("/path/to/model.pqth")
Thanks. How can I use the CPU to run python3 -m rwkvstic --pq?
If you don't have the ability to use your GPU for the conversion, it may be possible to hide your CUDA devices using environment variables.
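One way to do that (a general CUDA trick, not an rwkvstic-specific feature) is to blank out CUDA_VISIBLE_DEVICES before running the conversion, which makes CUDA report no usable GPUs so PyTorch falls back to CPU:

```shell
# Hide all CUDA devices so the conversion falls back to CPU.
export CUDA_VISIBLE_DEVICES=""
# Then run the conversion as usual:
# python3 -m rwkvstic --pq
```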
Thanks, I have successfully converted, but I still have some questions:
from rwkvstic.load import RWKV
from rwkvstic.agnostic.backends import ONNX_EXPORT
import torch
model = RWKV("model.pth", backend=ONNX_EXPORT, dtype=torch.float16)
I want to convert the quantized .pth to ONNX, but there is no code for saving the model as ONNX.
Hey, please upgrade your rwkvstic version. The newest version now exports two files, model.onnx and model.bin, before exiting. You can then load model.onnx using rwkvstic, or another runtime.
Understand, thank you very much
I tried running the exported ONNX model using the mobile platform's ONNX runtime, but it doesn't work. If I export ONNX using https://github.com/AXKuhta/RWKV-LM/tree/onnx instead, it does run on the mobile platform; unfortunately, that exporter can't handle models above 1B5.
It looks like this: https://github.com/josephrocca/rwkv-v4-web
@ZTMIDGO You may need to alter the program to flatten the feeds, as the rwkvstic ONNX export expects a flattened shape.
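Flattening the feeds might look something like this sketch. Note that the state layout, layer count, and hidden size below are illustrative assumptions, not the actual rwkvstic export contract — inspect the exported model's input shapes to get the real dimensions:

```python
import numpy as np

# Hypothetical sizes for illustration: RWKV-v4 keeps 5 state vectors per
# layer; 24 layers with hidden size 768 are assumed here, not taken from
# any particular export.
n_layers, hidden = 24, 768
state = [np.zeros(hidden, dtype=np.float32) for _ in range(5 * n_layers)]

# Concatenate the per-layer state vectors into one contiguous 1-D array,
# matching the flattened shape the rwkvstic export reportedly expects.
flat = np.concatenate(state)
print(flat.shape)  # (92160,)
```

You would then pass `flat` as a single feed to the runtime instead of the list of per-layer tensors.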
import torch
from rwkvstic.load import RWKV
from rwkvstic.agnostic.backends import TORCH_QUANT
runtime_dtype = torch.float32  # or torch.float64, torch.bfloat16
chunksize = 4
useGPU = False
target = 4
model = RWKV("model/RWKV-4-Pile-1B5-EngChn-testNovel-671-ctx2048-20230216.pth", backend=TORCH_QUANT, runtimedtype=runtime_dtype, chunksize=chunksize, useGPU=useGPU, target=target)
torch.save({'model': model.state_dict()}, './model_name.pth')
error: AttributeError: 'RWKVMaster' object has no attribute 'state_dict'
How can I save the quantized model?