leon-ai / leon

🧠 Leon is your open-source personal assistant.
https://getleon.ai
MIT License
14.95k stars 1.23k forks

Would GPT4All integration provide a performance improvement? #529

Open loren-osborn opened 3 weeks ago

loren-osborn commented 3 weeks ago

In the demos I’ve seen of Leon AI, it appeared rather slow. I don’t know whether that was a hardware limitation or whether there are inefficiencies that could be improved. GPT4All appears to be quite performant, even on systems without CUDA-compatible GPUs, though I don’t know whether it’s any faster than the inference engine you’re already using.
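Since the question here is relative speed, the most direct way to settle it would be to benchmark both engines on the same prompt and compare tokens per second. A minimal sketch of such a harness, where `generate` and `dummy_engine` are hypothetical stand-ins rather than Leon's or GPT4All's actual APIs:

```python
import time

def tokens_per_second(generate, prompt, max_tokens):
    """Time one generation call and return token throughput.

    `generate` is a placeholder for whichever engine is being measured
    (Leon's current one, GPT4All, ...): it takes a prompt and a token
    budget and returns the generated tokens.
    """
    start = time.perf_counter()
    tokens = generate(prompt, max_tokens)
    elapsed = time.perf_counter() - start
    # Guard against a zero-length timing interval on coarse clocks.
    return len(tokens) / max(elapsed, 1e-9)

# Stand-in engine that "generates" instantly, just to exercise the harness:
def dummy_engine(prompt, max_tokens):
    return ["tok"] * max_tokens

rate = tokens_per_second(dummy_engine, "Hello Leon", 32)
```

Running the same prompt and token budget through each real engine would give directly comparable numbers, independent of whether a GPU is present.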

louistiti commented 3 weeks ago

Which demos are you referring to? If it's the recent new-voice video, that's because I don't show the tokens being generated for most of the video. But you can see it from here. Also, it is possible to disable the LLM and use the built-in text classification, which runs in near real time.

loren-osborn commented 3 weeks ago

I based my recommendation on the performance I saw in this video: https://youtu.be/6CInSt6pTVA?si=oIipaG4Rb07EqSet. I know many local LLM inference and training systems rely heavily on Nvidia CUDA GPUs. I mentioned GPT4All because I knew it leverages AVX CPU instructions and Nomic Vulkan to provide efficient LLM inference on both Nvidia and AMD GPUs. I’m not sure whether Leon currently relies on CUDA for performance, but if it does, GPT4All may help you support more hardware.