bigcode-project / starcoder.cpp

C++ implementation for 💫StarCoder

Interactive chat and gpu offload support #19

Closed RahulVivekNair closed 1 year ago

RahulVivekNair commented 1 year ago

Currently, llama.cpp lets us pass `-i -ins` for an interactive chat session using the Alpaca template, and it also supports GPU offloading via CUDA or OpenCL, which massively improves inference times. Will this be supported anytime soon? The only thing stopping StarCoder from taking off is the huge barrier to entry in the way of inference times. I am very impressed with the model based on testing via the web interface (StarChat).
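For reference, a sketch of the llama.cpp invocation being described (flag names as they existed in llama.cpp at the time; the model path is hypothetical, and starcoder.cpp would need its own equivalents):

```shell
# Interactive instruct-mode chat with partial GPU offload in llama.cpp:
#   -i         interactive mode
#   -ins       instruct mode (Alpaca-style prompt template)
#   -ngl 32    offload 32 layers to the GPU (CUDA/OpenCL builds only)
./main -m models/model.ggml.bin -i -ins -ngl 32
```

This is the feature set the issue asks starcoder.cpp to mirror: an interactive REPL-style loop plus per-layer GPU offload to cut inference latency.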