RWKV / rwkv.cpp

INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
MIT License

Allow creating multiple contexts per model #83

Closed LoganDark closed 1 year ago

LoganDark commented 1 year ago

This allows for parallel inference. I am also preparing to support sequence mode using a similar method (creating a new kind of graph).

Yes, it should be perfectly safe to mix multiple ggml contexts like this. I looked into the code and there is no requirement that graphs be created from the current context :3

saharNooby commented 1 year ago

Looks almost good, but I think a separate test should be added -- see comment above.

LoganDark commented 1 year ago

I see what is going on, one sec.

saharNooby commented 1 year ago

@LoganDark LGTM now, but why is it closed?