RWKV / rwkv.cpp

INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
MIT License

Make sampling with bias numerically stable #90

Closed: Jmkernes closed this 1 year ago

Jmkernes commented 1 year ago

Remove the slow Python for loop used to apply the logit bias. Make the NumPy re-softmax operation numerically stable.
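
For context, here is a minimal sketch of the two ideas, not the actual diff from this PR. It assumes the bias is given as a dict mapping token id to an additive bias. The names `apply_logit_bias` and `stable_softmax` are hypothetical and are not taken from the rwkv.cpp code. The bias is applied with a single NumPy fancy-indexing update instead of a per-token loop. The softmax subtracts the maximum logit before exponentiating, which leaves the result unchanged mathematically but keeps `np.exp` from overflowing.

```python
import numpy as np


def apply_logit_bias(logits, logit_bias):
    """Add per-token biases in one vectorized step (hypothetical helper).

    logit_bias is assumed to be a dict of {token_id: bias}.
    """
    biased = np.asarray(logits, dtype=np.float64).copy()
    if not logit_bias:
        return biased
    count = len(logit_bias)
    ids = np.fromiter(logit_bias.keys(), dtype=np.int64, count=count)
    values = np.fromiter(logit_bias.values(), dtype=np.float64, count=count)
    # Dict keys are unique, so the in-place fancy-index add touches each id once.
    biased[ids] += values
    return biased


def stable_softmax(logits):
    """Softmax that shifts by the max logit so the largest exponent is exp(0)."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / np.sum(exps)


# Usage: logits this large would overflow a naive exp(logits) / sum(exp(logits)).
logits = np.array([1000.0, 1001.0, 999.0])
probs = stable_softmax(apply_logit_bias(logits, {2: 2.0}))
```

In this usage example the bias raises token 2 to the same logit as token 1, so the two end up with equal probability and all values stay finite.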