RWKV / rwkv.cpp

INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
MIT License

Adding rwkv_eval_array operation #49

Closed by PicoCreator 1 year ago

PicoCreator commented 1 year ago

Adding a variant of rwkv_eval, rwkv_eval_array, that evaluates an array of tokens in a single call.

This is useful for language bindings, where a larger context can be evaluated without switching back and forth between the host language and C for every token. (I am currently working on a Node.js binding.)
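To illustrate the idea, here is a minimal, self-contained sketch of an array variant that loops over a single-token eval on the C side, so a binding crosses the language boundary once per call instead of once per token. The names toy_eval and toy_eval_array, the buffer sizes, and the arithmetic are all hypothetical stand-ins; this is not the actual rwkv.cpp signature or model code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy sizes for illustration only; a real model's buffers are much larger. */
#define STATE_LEN 4
#define LOGITS_LEN 3

/* Hypothetical stand-in for a single-token eval.
 * state_in may be NULL for the first token; state_in == state_out is allowed. */
static int toy_eval(int32_t token, const float *state_in,
                    float *state_out, float *logits_out) {
    for (size_t i = 0; i < STATE_LEN; i++) {
        float prev = state_in ? state_in[i] : 0.0f;
        state_out[i] = prev + (float) token;   /* fake "recurrence" */
    }
    for (size_t i = 0; i < LOGITS_LEN; i++) {
        logits_out[i] = state_out[0] * (float) (i + 1);
    }
    return 1;
}

/* Hypothetical array variant: feeds n tokens through toy_eval in order.
 * Only the final state and the logits of the last token are kept.
 * Returns 0 on an empty input or if any step fails. */
static int toy_eval_array(const int32_t *tokens, size_t n,
                          const float *state_in,
                          float *state_out, float *logits_out) {
    assert(tokens != NULL && state_out != NULL && logits_out != NULL);
    if (n == 0) {
        return 0;
    }
    const float *current = state_in;
    for (size_t i = 0; i < n; i++) {
        if (!toy_eval(tokens[i], current, state_out, logits_out)) {
            return 0;
        }
        current = state_out;   /* later steps update the state in place */
    }
    return 1;
}
```

A binding would pass the whole token array to toy_eval_array once and read back the final state and logits, rather than calling the single-token function from the host language in a loop.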

Later, if you do add support for "transformer" mode, this function should use it (though I don't see the point of doing so for CPU eval).

saharNooby commented 1 year ago

Please add a test case for this new function to tests/test_tiny_rwkv.c. void test_model(...) can probably be extended with a boolean argument that selects between regular eval and eval_sequence.

saharNooby commented 1 year ago

It would also be useful to add a corresponding method to rwkv/rwkv_cpp_shared_library.py, but this is not a blocker for the merge: until we have a really optimized sequence processing mode, the overhead of calling C from Python in a loop seems insignificant.

LoganDark commented 1 year ago

This is no longer necessary after #89