KnowledgeCanvas / knowledge

Knowledge is a tool for saving, searching, accessing, exploring and chatting with all of your favorite websites, documents and files.
Apache License 2.0

feature request: add llama.cpp support for local inference like privategpt #111

Open x-legion opened 1 year ago

x-legion commented 1 year ago

For people like me who will never use a closed-source tool for anything.

RobRoyce commented 1 year ago

Great idea, I'll look into it 👍

harrellbm commented 1 year ago

https://github.com/Josh-XT/AGiXT

This is another project that has multi-model support. Not sure if you could lift something from here.

harrellbm commented 1 year ago

@RobRoyce I also wouldn't mind helping with those integrations if you have some pointers on where to look in the code.

RobRoyce commented 1 year ago

@harrellbm I would definitely appreciate the collaboration 👍

One of the biggest obstacles is not being able to (easily) run Python given the cross-platform nature of the app. I have a strong suspicion that Docker will be required to expand the model selection and feature set while still remaining local-only.

Currently the single point of entry for OpenAI is this line in the ChatController. The easiest way to test different models would be to replace this line with a different API call.
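
For reference, llama.cpp's bundled server (and llamafile, which comes up later in this thread) exposes an OpenAI-compatible `/v1/chat/completions` endpoint, so one low-effort experiment is to point that single request at a local URL instead. A minimal sketch, assuming a llama.cpp server is already running on `localhost:8080`; the function name and constants below are illustrative, not existing KnowledgeCanvas code:

```typescript
// Sketch of a drop-in replacement for the OpenAI call in ChatController,
// aimed at a local llama.cpp server's OpenAI-compatible endpoint.
// LOCAL_BASE_URL and localChatCompletion are assumed names, not existing code.
const LOCAL_BASE_URL = 'http://localhost:8080/v1';

interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

async function localChatCompletion(messages: ChatMessage[]): Promise<string> {
  const response = await fetch(`${LOCAL_BASE_URL}/chat/completions`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      // llama.cpp's server typically serves whichever model it was launched
      // with, so the model field is effectively a placeholder here.
      model: 'local-model',
      messages,
      temperature: 0.7,
    }),
  });
  if (!response.ok) {
    throw new Error(`Local inference server returned ${response.status}`);
  }
  const data = await response.json();
  // The response shape mirrors the OpenAI chat completions API.
  return data.choices[0].message.content;
}
```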

On the use of Docker: I have already started a repo that includes Apache Tika for text extraction of arbitrary local files. I will work on updating it to include a simple Python server with some LangChain functionality ASAP. That should give us a base image to work with for adding other models.
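
To make the Tika part concrete: Tika's server exposes a `PUT /tika` endpoint that returns extracted plain text, so the app side only needs a small HTTP call. A rough sketch, assuming a Tika server container on its default port 9998 (the function name is illustrative, not existing code):

```typescript
// Sketch: extract plain text from a local file via a Tika server container.
// Assumes a Tika server (e.g. the apache/tika Docker image) on localhost:9998.
import { readFile } from 'node:fs/promises';

async function extractText(filePath: string): Promise<string> {
  const fileBytes = await readFile(filePath);
  // Tika server accepts the raw file body on PUT /tika and returns the
  // extracted text when the Accept header asks for text/plain.
  const response = await fetch('http://localhost:9998/tika', {
    method: 'PUT',
    headers: { Accept: 'text/plain' },
    body: fileBytes,
  });
  if (!response.ok) {
    throw new Error(`Tika extraction failed with status ${response.status}`);
  }
  return response.text();
}
```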

harrellbm commented 1 year ago

@RobRoyce Sounds like a plan! I don't have a ton of time to code anymore with work and family but definitely want to help where I can!

Just to get my head in the right place: the basic backend seems to be an Electron app using Express servers, if I'm correct? I know there is a lot more to it than that, but I just want to get the basic idea of the architecture right to start with. I have worked for a long time on a project called Superalgos. I can see there being some cool integrations between the projects, but I also really want to use Knowledge myself!

severian42 commented 6 months ago

So I've been trying to work on this problem and think I have found a solution, but I'm not nearly skilled enough to pull it off. There is a new project called llamafile (https://github.com/Mozilla-Ocho/llamafile) that serves llama.cpp LLMs across multiple platforms as a single, already-wrapped server package. This got me thinking that it may be possible to run it alongside Knowledge Canvas and use it to handle all chat completions locally. Someone even threw together an Electron-based llamafile starter (https://github.com/swkidd/react-electron-llamafile-starter/tree/main).
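
A rough sketch of how that could be wired into the Electron main process is below; the binary path, flags, and helper names are assumptions based on llamafile's documented server mode, not existing KnowledgeCanvas code. Once the process is up, the same OpenAI-compatible endpoint shown earlier in the thread should work against `localhost:8080`.

```typescript
// Sketch: launch a llamafile binary alongside the Electron main process so
// chat completions stay fully local. Path, flags, and names are assumptions.
import { spawn, ChildProcess } from 'node:child_process';

let llamafileProcess: ChildProcess | null = null;

function startLocalServer(llamafilePath: string): void {
  // llamafile embeds llama.cpp's server; by default it listens on
  // localhost:8080 and exposes an OpenAI-compatible /v1/chat/completions
  // endpoint. --nobrowser keeps it from opening its built-in web UI.
  llamafileProcess = spawn(llamafilePath, ['--server', '--nobrowser'], {
    stdio: 'ignore',
  });
}

function stopLocalServer(): void {
  // Make sure the local server is shut down when the app quits.
  llamafileProcess?.kill();
  llamafileProcess = null;
}
```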

I've been trying to integrate it myself but can't seem to pull it off, as I'm not really familiar with this framework and programming is not my strongest skill. Would anyone here be willing to give it a go and see if they have better success?