harrisonvanderbyl / rwkv-cpp-accelerated

A torchless C++ RWKV implementation using 8-bit quantization, written in CUDA/HIP/Vulkan for maximum compatibility and minimum dependencies
MIT License
303 stars, 19 forks

Add pybind binding #19

Closed (nenkoru closed 1 year ago)

nenkoru commented 1 year ago

WIP, will request a review. Would also add tokenizer & sample bindings.

nenkoru commented 1 year ago

@harrisonvanderbyl, ready for review