marella / ctransformers

Python bindings for the Transformer models implemented in C/C++ using GGML library.
MIT License
1.79k stars, 135 forks

CLBlast support for gpt-2 types (WizardCoder)? #28

Closed richardr1126 closed 1 year ago

richardr1126 commented 1 year ago

koboldcpp, for example, has CLBlast GPU support for GPT-2 based models, which lets me do prompt processing in GPU VRAM and get fewer prompt batching errors with my 16 GB of system RAM. Does anyone know if this is possible with ctransformers?

marella commented 1 year ago

Currently it doesn't have GPU support for those models, as it is based on the examples from ggml, which don't have GPU support. Only LLaMA and Falcon models have GPU (CUDA) support currently.
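
For illustration, here is a minimal sketch of how a caller might guard GPU offloading based on the reply above. `AutoModelForCausalLM.from_pretrained` with `model_type` and `gpu_layers` is the ctransformers loading call; the helper names (`effective_gpu_layers`, `load_model`), the `GPU_MODEL_TYPES` set, and the model type strings are assumptions made for this example and reflect only the state of support described in this thread, which may since have changed.

```python
# Model types with GPU (CUDA) offload according to the reply above.
# Assumption: this set reflects the thread at the time and may be out of date.
GPU_MODEL_TYPES = {"llama", "falcon"}


def effective_gpu_layers(model_type: str, requested: int) -> int:
    """Return how many layers to offload, or 0 if the model type has no GPU path."""
    if model_type.lower() in GPU_MODEL_TYPES:
        return max(requested, 0)
    return 0


def load_model(path: str, model_type: str, requested_gpu_layers: int = 0):
    """Load a GGML model, offloading layers only where GPU support exists."""
    # Imported lazily so the helper above works without ctransformers installed.
    from ctransformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        path,
        model_type=model_type,
        gpu_layers=effective_gpu_layers(model_type, requested_gpu_layers),
    )
```

With this sketch, `load_model(path, "llama", 50)` would request 50 offloaded layers, while a GPT-2 type model would be loaded with `gpu_layers=0` and run entirely on the CPU.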