Have you tried using `async_stream` with `yield`? There are some usage patterns in the codebase already, such as this and this. If you have issues with threading you can use flume, which can be found here.
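Something like this (untested sketch; `fetch_chunk` is a hypothetical placeholder for the real Ollama call, not an actual API):

use async_stream::stream;
use futures::Stream;

// Hypothetical source of chunks; stands in for the real Ollama call.
async fn fetch_chunk() -> Result<Option<String>, LLMError> {
    Ok(None)
}

// Sketch of the async_stream + yield pattern: wrap the polling loop in
// a stream! block and yield each chunk as it arrives.
fn response_stream() -> impl Stream<Item = Result<String, LLMError>> {
    stream! {
        loop {
            match fetch_chunk().await {
                Ok(Some(chunk)) => yield Ok(chunk), // forward each chunk
                Ok(None) => break,                  // source exhausted
                Err(e) => {
                    yield Err(e);                   // surface the error
                    break;
                }
            }
        }
    }
}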
Looking into it, thanks!
I mapped the stream item using `map_ok` and the error with `map_err`, and returned the resulting stream with `Ok(Box::pin(stream))`, and it works quite nicely!

Just one more question: what should the contents of `value` of `StreamData` be? I'm looking at the codebase but it's still not clear to me. Should I perhaps return the things Ollama returns other than the message content as a `Value`?
EDIT: Below is the `stream` function in its final form right now:
async fn stream(
    &self,
    messages: &[Message],
) -> Result<Pin<Box<dyn Stream<Item = Result<StreamData, LLMError>> + Send>>, LLMError> {
    let request = self.generate_request(messages);
    let result = self.client.send_chat_messages_stream(request).await?;
    // Map each Ollama chunk into StreamData, keeping the raw JSON in `value`
    // and the message text in `content`.
    let stream = result.map(|data| match data {
        Ok(data) => match data.message {
            Some(message) => Ok(StreamData::new(
                serde_json::to_value(message.clone()).unwrap_or_default(),
                message.content,
            )),
            None => Err(LLMError::ContentNotFound(
                "No message in response".to_string(),
            )),
        },
        // Ollama-rs stream errors carry no payload, so map to a generic error.
        Err(_) => Err(OllamaError::from("Stream error".to_string()).into()),
    });
    Ok(Box::pin(stream))
}
`Value` is the raw JSON response from the server, in case the user wants to access other properties from the JSON. Most folks will only care about the content.
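For example, a consumer could read both (untested sketch; assumes `StreamData` exposes public `content` and `value` fields, matching the constructor above):

use futures::{Stream, StreamExt};
use std::pin::Pin;

// Untested consumer sketch: print the text as it streams, while the raw
// JSON stays available in `value` for anything else.
async fn print_stream(
    mut stream: Pin<Box<dyn Stream<Item = Result<StreamData, LLMError>> + Send>>,
) -> Result<(), LLMError> {
    while let Some(item) = stream.next().await {
        let data = item?;
        print!("{}", data.content); // most users only need the text
        let _raw = &data.value;     // full JSON chunk for everything else
    }
    Ok(())
}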
I'm wondering if `Item` should be `Result<Option<StreamData>, LLMError>` instead of `Result<StreamData, LLMError>`. I had mentioned my concerns in https://github.com/Abraxas-365/langchain-rust/issues/140. Otherwise, based on the LLM provider we would need to have a different error. Another option is to have the same error for all LLMs, but since we are ending the call anyway I think `Option` is a lot easier to work with.
EDITED*
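To illustrate (a rough sketch, not code from this PR): with `Option`, a final `Ok(None)` would mark a clean end of stream, so callers would not need to match on provider-specific error variants.

use futures::{Stream, StreamExt};
use std::pin::Pin;

// Rough sketch of consuming the Option-based item discussed above.
async fn consume(
    mut stream: Pin<Box<dyn Stream<Item = Result<Option<StreamData>, LLMError>> + Send>>,
) -> Result<(), LLMError> {
    while let Some(item) = stream.next().await {
        match item? {
            Some(data) => print!("{}", data.content), // normal chunk
            None => break,                            // clean end of stream
        }
    }
    Ok(())
}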
Yeah, I think returning `Option` instead would be better DX than error handling. I guess we keep this PR as is w.r.t. the returned item, and tackle that issue in a separate PR?
I will update the code to return the entire response as `Value` 👍🏻
I'm ok changing to `Option` in a different PR.
Updated, also added a TODO note for that `Option`.
I have no idea why the build is failing btw :o
Don't know why this happens; I have to delete the Actions cache once in a while and it works again.
Merged. Thanks!
Thanks for the merge! Just saw the messages, I was on the road today.
I plan on handling the function call PR when Ollama-rs is updated as well, and we may have to change a few lines in the token count calculation implemented in this PR for Ollama; nothing big though 🙏🏻
This PR integrates Ollama-rs and feature-gates it with the `ollama` feature.

- Re-implements `OllamaEmbedder` using the new client; the `embedding_ollama.rs` example is working.
- Implements the `language_models::llm::LLM` trait for `Ollama`; the `llm_ollama.rs` example is working.
- Allows OpenAI-compatible Ollama usage by implementing `async_openai::config::Config` for `OllamaConfig`; this part is not feature-gated as it does not require the Ollama-rs package.
- Closes #148: initially we had talked about adding an option to auto-pull a model if it does not exist, but with Ollama-rs integrated we can simply leave that to the user. The pull code is a simple one-liner using the Ollama-rs client, and our tools request that the client be created in the outer scope and passed in using `Arc`. So if one needs to pull a model, they can do it with their own logic (e.g. with retries, timeouts, cancellations) before passing the model in; see the sketch after this list.
- Opens up the way for #149, as we will basically have access to function calls when https://github.com/pepperoni21/ollama-rs/pull/51 is merged!
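For illustration, pulling a model before handing the client over could look roughly like this (a sketch only; the model name is arbitrary, and the retry/timeout logic is left to the caller):

use ollama_rs::Ollama;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let ollama = Ollama::default();
    // One-liner pull; wrap with your own retries/timeouts if needed.
    ollama.pull_model("llama3:latest".into(), false).await?;
    // Share the client with the langchain-rust tools via Arc.
    let client = Arc::new(ollama);
    // ...construct Ollama / OllamaEmbedder with `client` here...
    Ok(())
}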