ollama / ollama-python

Ollama Python library
https://ollama.com
MIT License

Add in simple to use iterative chat (chat with history) #63

Closed: connor-makowski closed this issue 4 months ago

connor-makowski commented 4 months ago

For example, a function like:

from ollama import historic_chat

message = {
  'role': 'user',
  'content': 'Tell me a Joke in less than 30 words.',
}
response = historic_chat('mistral', message=message)
print(response['message']['content'])

message = {
  'role': 'user',
  'content': 'Another please.',
}
response = historic_chat('mistral', message=message)
print(response['message']['content'])

Should return two jokes.
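For context, here is a minimal sketch of what historic_chat might do internally, assuming it simply wraps the library's stateless chat call around a module-level history list (the function does not exist in the library; everything here is hypothetical):

from ollama import chat

# Hypothetical sketch: a module-level list carries the conversation
# so that each historic_chat call sees all previous turns.
_history = []

def historic_chat(model, message):
    _history.append(message)
    response = chat(model, messages=_history)
    _history.append(response['message'])
    return response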

connor-makowski commented 4 months ago

@mxyng tagging you to get some initial thoughts on this issue and PR.

mxyng commented 4 months ago

Thanks for the interest in ollama-python. The current intention is for this library to mirror the Ollama API, so it is deliberately minimal. Implementing memory as part of the library is out of scope.

Memory can be implemented easily by manipulating the messages keyword argument. Here's a quick example:

from ollama import chat

messages = [
  {'role': 'user', 'content': 'Tell me a Joke in less than 30 words.'},
]

response = chat('mistral', messages=messages)
message = response['message']
print(message['content'])
# append the assistant's reply so it stays in the history for the next turn
messages.append(message)

messages.append({'role': 'user', 'content': 'Another please.'})
response = chat('mistral', messages=messages)
message = response['message']
print(message['content'])
messages.append(message)

This example will print two jokes. Here's what mistral gave me:

$ python example.py
 Why don't scientists trust atoms? Because they make up everything!
 Why did the scarecrow win an award? Because he was outstanding in his field!

This full example provides a working version of the above with a few extra bells and whistles. It's effectively a trimmed-down version of ollama run (with --speak for fun).
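For readers without the linked example, a hedged sketch of such a loop follows, assuming the model name 'mistral' and omitting the --speak handling. History is a plain list owned by the loop, not the client:

from ollama import chat

messages = []

while True:
    prompt = input('>>> ')
    if prompt in ('', '/bye'):
        break
    messages.append({'role': 'user', 'content': prompt})
    content = ''
    # stream=True yields the reply incrementally, chunk by chunk
    for chunk in chat('mistral', messages=messages, stream=True):
        part = chunk['message']['content']
        print(part, end='', flush=True)
        content += part
    print()
    messages.append({'role': 'assistant', 'content': content})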

connor-makowski commented 4 months ago

I am not sure if you saw the PR submitted to solve this. It's very simple and seems worthwhile to add to the scope: https://github.com/ollama/ollama-python/pull/64

mxyng commented 4 months ago

I did see the PR. The PR changes the synchronous client to subclass a new client with internal state and adds an iterative_chat method that uses that internal state in place of messages.

A stateful client is problematic since there now need to be functions to manage the internal state, such as trimming or resetting it. This adds scope and complexity to the library for features that are user specific.

Additionally, using a single client for two sets of history is impossible since this client maintains a single history list.
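To illustrate the point, two independent conversations only need two separate lists with the stateless chat function (a sketch; the prompts are made up):

from ollama import chat

# Each conversation owns its own history list; the client stays stateless.
jokes = [{'role': 'user', 'content': 'Tell me a joke.'}]
facts = [{'role': 'user', 'content': 'Tell me a fact about otters.'}]

for history in (jokes, facts):
    response = chat('mistral', messages=history)
    history.append(response['message'])  # the reply lands in that history only
    print(response['message']['content'])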

The optional history keyword argument functions roughly the same as the current messages keyword argument by manipulating the internal state. However, it's error-prone since sending the same history twice will add duplicate entries.

Overall, it's a lot of effort and code to implement something the user can implement easily and with more precision.

connor-makowski commented 4 months ago

Fair points. I still see a large need for this. I will probably just end up creating an external library that streamlines this process.
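As a sketch of what such an external wrapper might look like (the class and method names here are hypothetical, not part of ollama-python):

from ollama import chat

class ChatSession:
    # Hypothetical wrapper: keeps history per session, outside the client.
    def __init__(self, model):
        self.model = model
        self.messages = []

    def send(self, content):
        self.messages.append({'role': 'user', 'content': content})
        response = chat(self.model, messages=self.messages)
        self.messages.append(response['message'])
        return response['message']['content']

    def reset(self):
        # trimming/resetting stays the caller's responsibility, per the discussion above
        self.messages.clear()

session = ChatSession('mistral')
print(session.send('Tell me a Joke in less than 30 words.'))
print(session.send('Another please.'))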

mxyng commented 4 months ago

That would be great! This library is intended to be a building block on which others such as yourself can build more featureful projects.