This is a minimal C++/CUDA implementation of RWKV with no PyTorch/libtorch dependencies.
Simple examples of how to use it from both C++ and Python are included.
1) Go to the Actions tab
2) Find a green checkmark for your platform
3) Download the executable
4) Download or convert a model (downloads here)
5) Place the model.bin file in the same place as the executable
6) Run the executable
In the top-level source directory:
```
mkdir build
cd build
cmake ..
cmake --build . --config Release
```
Make sure you have already installed the CUDA Toolkit, the HIP development tools, or the Vulkan development tools, depending on which backend you are building.
```
# in example/storygen
build.sh    # Linux/NVIDIA
build.bat   # Windows/NVIDIA
amd.sh      # Linux/AMD
vulkan.sh   # Linux/Vulkan (all vendors)
```
The resulting executable is at build/storygen[.exe] and can be run from the build directory. It expects a 'model.bin' file in the converter folder; see the notes below on downloading and converting the RWKV-4 models.
```
$ cd build
$ ./storygen
```
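The repository also ships a Python example (mentioned at the top). As a rough, non-authoritative sketch of the usual RWKV driving loop, assuming a hypothetical binding called `rwkv_cpp` with `Model` and `Tokenizer` classes (the real names live in the Python example under example/ in this repo):

```python
# Hypothetical sketch only: the module, class, and method names below are illustrative,
# not this repo's actual Python API -- check the shipped Python example for the real one.
from rwkv_cpp import Model, Tokenizer  # hypothetical binding

model = Model("model.bin")             # converted weights (see the converter notes below)
tokenizer = Tokenizer()                # GPT-2 style BPE tokenizer (see the tokenizer note at the end)

# RWKV is recurrent: it consumes one token at a time and carries a hidden state forward.
state = None
for token in tokenizer.encode("Once upon a time"):
    logits, state = model.forward(token, state)

# Greedy continuation: keep feeding the most likely token back into the model.
output = []
for _ in range(32):
    next_token = max(range(len(logits)), key=lambda i: logits[i])
    output.append(next_token)
    logits, state = model.forward(next_token, state)

print(tokenizer.decode(output))
```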
You can download the model weights here: https://huggingface.co/BlinkDL/rwkv-4-raven/tree/main
For conversion to a .bin model, you can choose between two options:
Option 1: Make sure you have Python with the torch, tkinter, tqdm, and Ninja packages installed.
```
> cd converter
> python3 convert_model.py
```
Option 2 (pass the downloaded .pth path as an argument): Make sure you have Python with the torch, tqdm, and Ninja packages installed.
```
> cd converter
> python3 convert_model.py your_downloaded_model.pth
```
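For reference, conversion is essentially loading the PyTorch checkpoint and dumping its tensors into a single flat binary file. The sketch below only illustrates that idea; the actual model.bin layout (header, tensor order, precision) is defined by convert_model.py, not by this snippet:

```python
# Conceptual illustration only: convert_model.py defines the real model.bin layout;
# this just shows the general .pth -> flat binary idea.
import sys
import torch

state_dict = torch.load(sys.argv[1], map_location="cpu")  # the downloaded .pth checkpoint
with open("model_demo.bin", "wb") as out:                 # demo output name, not the real one
    for name, tensor in state_dict.items():
        data = tensor.float().contiguous().numpy()        # widen to float32 for simplicity
        out.write(data.tobytes())                         # raw weight bytes, no metadata
```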
The C++ tokenizer came from this project: https://github.com/gf712/gpt2-cpp/