Abraxas-365 / langchain-rust

🦜️🔗LangChain for Rust, the easiest way to write LLM-based programs in Rust
MIT License

Ollama-rs integration #156

Closed · erhant closed this 3 months ago

erhant commented 3 months ago

This PR integrates Ollama-rs and feature-gates it with the ollama feature.

> [!NOTE]
> @prabirshrestha I'm currently opening this as a draft because I couldn't implement the stream function for the LLM trait, and would like your opinion / help on that part! Ollama-rs has a streaming call itself, and I just have to map the items of its stream to the ones the LLM trait wants, but for some reason I haven't managed it yet.
>
> I will still be working on that, but would love your input if any 🙏🏻

EDIT: Streaming is done.

prabirshrestha commented 3 months ago

Have you tried using async_stream with yield? There are some usage patterns in the codebase already, such as this and this. If you have issues with threading you can use flume, which can be found here.
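
For reference, a minimal sketch of that pattern (not the actual snippets linked above; the item types are placeholders standing in for the ollama-rs chunks and the LLM trait's StreamData):

```rust
use async_stream::stream;
use futures::{Stream, StreamExt};

// Placeholder adapter: pull items off a source stream and re-yield them,
// mapping each one along the way.
fn adapt(
    mut inner: impl Stream<Item = Result<String, String>> + Unpin,
) -> impl Stream<Item = Result<String, String>> {
    stream! {
        while let Some(item) = inner.next().await {
            yield item.map(|content| format!("mapped: {content}"));
        }
    }
}
```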

erhant commented 3 months ago

> Have you tried using async_stream with yield? There are some usage patterns in the codebase already, such as this and this. If you have issues with threading you can use flume, which can be found here.

Looking into it, thanks!

erhant commented 3 months ago

I mapped the stream items using map_ok and the errors with map_err, then returned the resulting stream with Ok(Box::pin(stream)), and it works quite nicely!
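
A self-contained sketch of that map_ok / map_err shaping, with placeholder types standing in for the real ollama-rs and LLM trait ones:

```rust
use futures::{stream, StreamExt, TryStreamExt};

#[tokio::main]
async fn main() {
    // Stand-in for the provider stream of Result<chunk, provider error>.
    let provider = stream::iter(vec![Ok::<_, String>("hello".to_string()), Err("boom".into())]);

    // Shape the success items and the error type separately.
    let mapped = provider
        .map_ok(|content| format!("chunk: {content}"))
        .map_err(|e| format!("llm error: {e}"));

    let collected: Vec<Result<String, String>> = mapped.collect().await;
    println!("{collected:?}");
}
```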

Just one more question: what should the value field of StreamData contain? I'm looking at the codebase but it's still not clear to me; should I perhaps return everything Ollama returns other than the message content as a Value?

EDIT: Below is the stream function in its final form right now:

```rust
async fn stream(
    &self,
    messages: &[Message],
) -> Result<Pin<Box<dyn Stream<Item = Result<StreamData, LLMError>> + Send>>, LLMError> {
    // Build the ollama-rs chat request from the trait's messages.
    let request = self.generate_request(messages);
    let result = self.client.send_chat_messages_stream(request).await?;

    // Map each ollama-rs stream item into the StreamData the LLM trait expects.
    let stream = result.map(|data| match data {
        Ok(data) => match data.message {
            Some(message) => Ok(StreamData::new(
                serde_json::to_value(message.clone()).unwrap_or_default(),
                message.content,
            )),
            None => Err(LLMError::ContentNotFound(
                "No message in response".to_string(),
            )),
        },
        Err(_) => Err(OllamaError::from("Stream error".to_string()).into()),
    });

    Ok(Box::pin(stream))
}
```
prabirshrestha commented 3 months ago

Value is the raw JSON response from the server, in case the user wants to access other properties from the JSON. Most folks will only care about the content.

I'm wondering if the Item should be Result<Option<StreamData>, LLMError> instead of Result<StreamData, LLMError>. I mentioned my concerns in https://github.com/Abraxas-365/langchain-rust/issues/140. Otherwise we need a different error depending on the LLM provider. The other option is to have the same error for all LLMs, but since we are ending the call anyway I think Option is a lot easier to work with.
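
For illustration, on the consumer side the Option variant would look roughly like this (String is a placeholder for both StreamData and LLMError, not the crate's actual types):

```rust
use futures::{Stream, StreamExt};

// Hypothetical consumer of the proposed Result<Option<StreamData>, LLMError> item shape.
async fn consume(mut stream: impl Stream<Item = Result<Option<String>, String>> + Unpin) {
    while let Some(item) = stream.next().await {
        match item {
            // Normal chunk with content.
            Ok(Some(content)) => print!("{content}"),
            // Provider frame without usable content: skip it instead of
            // inventing a provider-specific error.
            Ok(None) => continue,
            // Genuine stream / provider error ends the call.
            Err(e) => {
                eprintln!("stream error: {e}");
                break;
            }
        }
    }
}
```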

erhant commented 3 months ago

EDIT: Yeah, I think returning Option would be better DX than error handling. I guess we keep this PR as-is w.r.t. the returned item, and tackle that issue in a separate PR?

I will update the code to return the entire response as Value 👍🏻
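
Sketching that change with placeholder types (the real ollama-rs response type has more fields), the idea is to serialize the whole chunk into the value while still pulling the content from the message:

```rust
use serde::Serialize;
use serde_json::Value;

// Placeholder stand-ins for the ollama-rs response chunk and message types.
#[derive(Serialize)]
struct ResponseChunk {
    model: String,
    message: Option<ChatMessage>,
}

#[derive(Serialize, Clone)]
struct ChatMessage {
    content: String,
}

fn to_stream_parts(chunk: &ResponseChunk) -> (Value, String) {
    // The entire response chunk becomes the raw Value...
    let value = serde_json::to_value(chunk).unwrap_or_default();
    // ...while the streamed content still comes from the message alone.
    let content = chunk
        .message
        .as_ref()
        .map(|m| m.content.clone())
        .unwrap_or_default();
    (value, content)
}
```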

prabirshrestha commented 3 months ago

I'm ok changing to Option in a different PR.

erhant commented 3 months ago

Updated, also added a TODO note for that Option.

erhant commented 3 months ago

I have no idea why the build is failing btw :o

Abraxas-365 commented 3 months ago

> I have no idea why the build is failing btw :o

I don't know why this happens; I have to delete the Actions cache once in a while and then it works again.

prabirshrestha commented 3 months ago

Merged. Thanks!

erhant commented 3 months ago

Thanks for the merge! Just saw the messages; I was on the road today.

I plan on handling the function-call PR when Ollama-rs is updated as well, and we may have to change a few lines of the token count calculation implemented in this PR for Ollama; nothing big though 🙏🏻