Open harrisonvanderbyl opened 1 year ago
https://github.com/harrisonvanderbyl/rwkv-cpp-cuda
Includes c++ sampler and tokenizer, as well as 8bit cuda support, with no libtorch dependency
I'm considering supporting GPU inference. Could this project pack the CUDA DLL dependencies into the release file, so that users can run the app without installing the CUDA toolkit?
The project links cudart_static, so it should run without CUDA being installed.
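For reference, a minimal sketch of what static cudart linking looks like in a CMake build (the target and file names `rwkv`, `main.cpp`, `kernels.cu` are hypothetical; the actual project's build files may differ):

```cmake
# Sketch only: CMake >= 3.17 provides the CUDAToolkit package with
# an imported target for the static CUDA runtime.
cmake_minimum_required(VERSION 3.17)
project(rwkv_example LANGUAGES CXX CUDA)

find_package(CUDAToolkit REQUIRED)

add_executable(rwkv main.cpp kernels.cu)
# CUDA::cudart_static links libcudart_static.a, so the resulting binary
# has no runtime dependency on libcudart.so / cudart64_*.dll.
target_link_libraries(rwkv PRIVATE CUDA::cudart_static)
```

Note that statically linking cudart removes the need for the CUDA toolkit on the user's machine, but an NVIDIA driver (which ships the driver-side CUDA libraries) is still required to run the kernels.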
Got it. I'll run a benchmark comparing the libtorch implementation against your CUDA implementation.