Open gkvoelkl opened 2 months ago
An 8-bit quantized, CUDA Llama 3 with candle and a tide server - https://medium.com/p/cecadd083aec
Comparing rustformers and candle - https://medium.com/p/88e1bd4c49fe
There are various GitHub links throughout these blog posts for different examples (quantized or not, CUDA or not, serving or not, etc.).
Hello, I think candle is very interesting. I am looking for a minimal example of a chatbot. Is there one? A monthly blog would also be great, as it would make it easier to get started with candle.
Thanks
Best regards, Gerhard