ericmjl / llamabot

Pythonic class-based interface to LLMs
https://ericmjl.github.io/llamabot/

Local model via Ollama #7

Closed by 5183nischal 10 months ago

5183nischal commented 10 months ago

Hi @ericmjl, thank you for making this! I'd send a PR, but it's a little beyond me. I was wondering if there is a simple way to use local models, such as those served by Ollama? I think LlamaIndex supports this, and I wonder if it can be easily incorporated here? I am particularly interested in the Zotero bot with a local model.

Thanks!

ericmjl commented 10 months ago

Hi @5183nischal! Thanks for chiming in. I just tried Ollama, and it's cool to be able to get up and running locally so easily!

I think there can be a path to supporting Ollama. The only complication I can see here is undoing a few places where I wrote code against the OpenAI models specifically. Let me mull over this for a moment, but if you'd be open to helping me by investigating the places where code would need to be changed, I would really appreciate it!

ericmjl commented 10 months ago

@5183nischal the latest release of llamabot has support for Ollama models. It's not yet as magical an experience as I'd like, but it gets the ball rolling. Would you be open to trying it out?

bdkech commented 10 months ago

Hi @ericmjl, thanks for the effort in this direction. For a variety of applications, LLaMA, while not as performant, has some added benefits. I can confirm that SimpleBot works against a Docker-hosted Ollama endpoint.

Are you expecting the codebase to be completely decoupled from OpenAI models? Reviewing QueryBot, it still relies on them: https://github.com/ericmjl/llamabot/blob/1f090b8f1188a85aa8fbc7226687840dd9eed17b/llamabot/bot/querybot.py#L263

Using the following from the examples (changing the index to file paths and the model to llama2):

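from IPython.display import Markdown, display  # notebook helpers used on the last line
from llamabot import QueryBot

# file_paths is a list of paths to the documents to index, defined earlier in the notebook.
bot = QueryBot(system_message="You are an older brother teaching a sibling how to play games.", doc_paths=file_paths, model_name='llama2')
result = bot("How do I throw Scorpions spear?", similarity_top_k=5)
display(Markdown(result.response))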

Yields the following error:

ValidationError: 1 validation error for ChatOpenAI root Did not find openai_api_key, please add an environment variable OPENAI_API_KEY which contains it, or pass openai_api_key as a named parameter. (type=value_error)

Would the fix be to change that ChatOpenAI call to a create_model call, similar to what is in the init (https://github.com/ericmjl/llamabot/blob/1f090b8f1188a85aa8fbc7226687840dd9eed17b/llamabot/bot/querybot.py#L100)? I'm not as familiar with the OpenAI API, so I'm not sure if there is a 'special sauce' needed in that call.

ericmjl commented 10 months ago

Hi @bdkech! You found a great bug. This was an oversight of mine: I had duplicated the code for creating a LangChain service context, which resulted in the bug you saw.

> Are you expecting the codebase to be completely decoupled from OpenAI models? Reviewing the QueryBot it still relies on it

I just cut a new release which further decouples LlamaBot's Python architecture from OpenAI's API. Thank you so much for helping me pinpoint the issue by linking directly to the code in the repo! That saved me a ton of debugging time.

> Would the fix be to change that call to use ChatOpenAI to a create_model call, similar to what is in the init?

That was a great hypothesis! In the end, though, when I re-reviewed my own code, the fix turned out to be adding a service context argument to the make_or_load_vector_index function. Regardless, your hypothesis definitely helped me a ton here!
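
For readers following along, here is a minimal sketch of the pattern behind that fix. It is not llamabot's actual code: the classes, the default model name, and the signature of make_or_load_vector_index shown here are all assumptions made for illustration. The idea is that a helper which builds its own default context ignores the caller's model choice, while one that accepts the context as an argument does not.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ServiceContext:
    """Hypothetical stand-in for the object that carries the chosen LLM."""
    model_name: str


@dataclass
class VectorIndex:
    """Hypothetical stand-in for a vector index bound to a service context."""
    doc_paths: List[str]
    service_context: ServiceContext


def default_service_context() -> ServiceContext:
    # Placeholder default (assumed): an OpenAI-backed context,
    # which in real use would require OPENAI_API_KEY.
    return ServiceContext(model_name="gpt-4")


def make_or_load_vector_index(
    doc_paths: List[str],
    service_context: Optional[ServiceContext] = None,
) -> VectorIndex:
    # Assumed signature: use the caller's context when one is passed in,
    # and fall back to the default only when it is missing.
    if service_context is None:
        service_context = default_service_context()
    return VectorIndex(doc_paths=list(doc_paths), service_context=service_context)


index = make_or_load_vector_index(["notes.md"], ServiceContext(model_name="llama2"))
print(index.service_context.model_name)  # prints "llama2"
```

With the context passed in explicitly, there is a single place where the model is chosen, so a local model name no longer falls through to an OpenAI default.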

> I'm not as familiar with the OpenAI API, so I'm not sure if there is a 'special sauce' needed in that call.

Perhaps a meta-level reflection in response -- I'm now thankful we're relying on LangChain's abstractions rather than the raw OpenAI API. It would have been more complicated if I had to figure out how to interface with, say, OpenAI's raw API and HuggingFace models' Python API, which are different. LangChain's ChatOpenAI and ChatOllama interfaces are nearly identical (differing only in the keyword argument for the model: model vs. model_name).
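
A rough sketch of what that near-identical interface means in practice. This is illustrative only: chat_model_spec and OLLAMA_MODELS are hypothetical names, the set of local models is assumed, and the sketch only builds the constructor arguments rather than importing LangChain.

```python
from typing import Dict, Tuple

# Assumption: model names in this set are served by a local Ollama instance.
OLLAMA_MODELS = {"llama2", "mistral"}


def chat_model_spec(model_name: str) -> Tuple[str, Dict[str, str]]:
    """Return the LangChain class name and constructor kwargs for a model.

    The only difference between the two branches is the keyword used
    for the model: `model` for ChatOllama, `model_name` for ChatOpenAI.
    """
    if model_name in OLLAMA_MODELS:
        return "ChatOllama", {"model": model_name}
    return "ChatOpenAI", {"model_name": model_name}


for name in ("llama2", "gpt-4"):
    class_name, kwargs = chat_model_spec(name)
    print(class_name, kwargs)
```

In real use, the class would be looked up in LangChain's chat models module and called with those kwargs; the exact import path depends on the LangChain version installed.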


Thanks again for pointing out the bug! And please try out the latest release (https://github.com/ericmjl/llamabot/releases/tag/v0.0.86) -- pip install -U llamabot==0.0.86!

bdkech commented 10 months ago

I can confirm that after updating, llama2 now works. Thanks for the blazing fast turnaround! I'll hopefully be exploring the QueryBot functionality a bit more.