marella / ctransformers

Python bindings for Transformer models implemented in C/C++ using the GGML library.
MIT License
1.79k stars, 136 forks

feature request #141

Open thistleknot opened 12 months ago

thistleknot commented 12 months ago

batch inference

thistleknot commented 12 months ago

I've been trying to track down GGUF batch inference, and apparently it's not supported in ctransformers, llama.cpp, or llama-cpp-python.

yukiarimo commented 9 months ago

Why?