ollama-instructor
ollama-instructor is a lightweight Python library that provides a convenient wrapper around the Client of the renowned Ollama repository, extending it with validation features for obtaining valid JSON responses from a Large Language Model (LLM). Utilizing Pydantic, ollama-instructor allows users to specify models for JSON schemas and data validation, ensuring that responses from LLMs adhere to the defined schema.
Note 1: This library has native support for Ollama's Python client. If you want more flexibility with other providers like Groq, OpenAI, Perplexity and more, have a look at the great instructor library by Jason Liu.
Note 2: This library depends on having Ollama installed and running. For more information, please refer to the official website of Ollama.
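To illustrate the core idea, here is a minimal sketch (the Person model is just an example): a Pydantic model provides both the JSON schema the LLM is instructed with and the validation applied to the LLM's response.

from pydantic import BaseModel, ValidationError

class Person(BaseModel):
    name: str
    age: int

# The JSON schema derived from the model. ollama-instructor uses the schema of
# the given BaseModel to instruct the LLM about the expected response structure.
print(Person.model_json_schema())

# Pydantic's built-in validation is what checks the LLM's answer against the model.
try:
    Person.model_validate_json('{"name": "Jason", "age": "thirty"}')
except ValidationError as error:
    print(error)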
- Partial responses: by setting the allow_partial flag to True, ollama-instructor will try to clean invalid data within the response and set it to None. Data that is not part of the Pydantic model will be deleted from the response (see the sketch after this list).
- Reasoning: by setting format to '' instead of 'json' (default), the LLM can return a string with step-by-step reasoning. The LLM is instructed to return the JSON response within a code block (json ...), which can be extracted by ollama-instructor (see example).
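A minimal sketch of the partial-response behaviour (model choice and prompt are illustrative; the exact cleaned output depends on the LLM):

from ollama_instructor.ollama_instructor_client import OllamaInstructorClient
from pydantic import BaseModel

class Person(BaseModel):
    # Default values are useful with allow_partial, since missing or invalid
    # fields fall back to defaults or None.
    name: str = ''
    age: int = 0

client = OllamaInstructorClient()
response = client.chat_completion(
    model='phi3',
    pydantic_model=Person,
    allow_partial=True,  # clean invalid data instead of failing validation
    messages=[
        {'role': 'user', 'content': 'Jason lives in Berlin.'}  # age is never mentioned
    ]
)
print(response['message']['content'])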
ollama-instructor can help you to get structured and reliable JSON from local LLMs.
ollama-instructor can be your starting point to build agents by yourself. Have full control over agent flows without relying on complex agent frameworks.
Find more here: The concept of ollama-instructor
To install ollama-instructor, run the following command in your terminal:
pip install ollama-instructor
Here are quick examples to get you started with ollama-instructor:
chat completion:
from ollama_instructor.ollama_instructor_client import OllamaInstructorClient
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

client = OllamaInstructorClient(...)
response = client.chat_completion(
    model='phi3',
    pydantic_model=Person,
    messages=[
        {
            'role': 'user',
            'content': 'Jason is 30 years old.'
        }
    ]
)

print(response['message']['content'])
Output:
{"name": "Jason", "age": 30}
asynchronous chat completion:
from pydantic import BaseModel, ConfigDict
from enum import Enum
from typing import List
import rich
import asyncio

from ollama_instructor.ollama_instructor_client import OllamaInstructorAsyncClient

class Gender(Enum):
    MALE = 'male'
    FEMALE = 'female'

class Person(BaseModel):
    '''
    This model defines a person.
    '''
    name: str
    age: int
    gender: Gender
    friends: List[str] = []

    model_config = ConfigDict(
        extra='forbid'
    )

async def main():
    client = OllamaInstructorAsyncClient(...)
    await client.async_init()  # Important: must call this before using the client

    response = await client.chat_completion(
        model='phi3:instruct',
        pydantic_model=Person,
        messages=[
            {
                'role': 'user',
                'content': 'Jason is 25 years old. Jason loves to play soccer with his friends Nick and Gabriel. His favorite food is pizza.'
            }
        ],
    )
    rich.print(response['message']['content'])

if __name__ == "__main__":
    asyncio.run(main())
chat completion with streaming:
(!) Currently broken due to dependency issues with the new version of ollama (!)
from ollama_instructor.ollama_instructor_client import OllamaInstructorClient
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

client = OllamaInstructorClient(...)
response = client.chat_completion_with_stream(
    model='phi3',
    pydantic_model=Person,
    messages=[
        {
            'role': 'user',
            'content': 'Jason is 30 years old.'
        }
    ]
)

for chunk in response:
    print(chunk['message']['content'])
The classes OllamaInstructorClient and OllamaInstructorAsyncClient are the main classes of the ollama-instructor library. They are the wrapper around the Ollama (async) client and contain the following arguments:
- host: the URL of the Ollama server (default: http://localhost:11434). See the documentation of Ollama.
- debug: a bool indicating whether to print debug messages (default: False).

Note: Until version v0.4.2 I was working with icecream for debugging. I switched to the logging module.
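For example (the host shown is just the default; both arguments are optional):

from ollama_instructor.ollama_instructor_client import OllamaInstructorClient

client = OllamaInstructorClient(
    host='http://localhost:11434',  # URL of the running Ollama server
    debug=True                      # print debug messages (via the logging module)
)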
The chat_completion and chat_completion_with_stream methods are the main methods of the library. They are used to generate text completions from a given prompt.

ollama-instructor uses chat_completion and chat_completion_with_stream to expand the chat method of Ollama. For all available arguments of chat, see the Ollama documentation.
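As a sketch, assuming the usual chat arguments of Ollama (e.g. options) are simply passed through chat_completion unchanged:

from ollama_instructor.ollama_instructor_client import OllamaInstructorClient
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

client = OllamaInstructorClient()
response = client.chat_completion(
    model='phi3',
    pydantic_model=Person,
    messages=[{'role': 'user', 'content': 'Jason is 30 years old.'}],
    options={'temperature': 0.1}  # regular Ollama chat argument; assumed pass-through
)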
The following arguments are added to the chat method within chat_completion and chat_completion_with_stream:
- pydantic_model: a class of Pydantic's BaseModel class that is used firstly to instruct the LLM with the JSON schema of the BaseModel and secondly to validate the response of the LLM with the built-in validation of Pydantic.
- retries: the number of retries if the LLM fails to generate a valid response (default: 3). If the LLM fails, the retry provides it with its last response and the resulting ValidationError and instructs it to generate a valid response.
- allow_partial: if set to True, ollama-instructor will modify the BaseModel to allow partial responses. In this case it makes sure to provide a correct instance of the JSON schema, but with default or None values. Therefore, it is useful to provide default values within the BaseModel. With the improvement of this library you will find examples and best practice guides on that topic in the docs folder.
- format: in fact, this is an argument of Ollama already. But since version 0.4.0 of ollama-instructor it can be set to 'json' or ''. By default, ollama-instructor uses the 'json' format. Before version 0.4.0 only 'json' was possible. Within chat_completion (NOT chat_completion_with_stream) you can set format = '' to enable the reasoning capabilities. The default system prompt of ollama-instructor instructs the LLM to respond within a json ... code block, from which the JSON is extracted for validation. When providing your own system prompt and setting format = '', this has to be considered. See an example here.
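A sketch of the reasoning mode with the default system prompt (model choice and the wording of the reasoning are up to the LLM):

from ollama_instructor.ollama_instructor_client import OllamaInstructorClient
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

client = OllamaInstructorClient()
response = client.chat_completion(
    model='phi3',
    pydantic_model=Person,
    format='',  # let the LLM reason step by step instead of answering in plain JSON
    messages=[
        {'role': 'user', 'content': 'Jason is 30 years old.'}
    ]
)
# The default system prompt makes the LLM place its JSON answer inside a
# ```json ... ``` code block, which ollama-instructor extracts for validation.
print(response['message']['content'])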
ollama-instructor is released under the MIT License. See the LICENSE file for more details.
If you need help or want to discuss ollama-instructor, feel free to open an issue or a discussion on GitHub, or just drop me an email (lennartpollvogt@protonmail.com).
I always welcome new ideas of use cases for LLMs and vision models, and would love to cover them in the examples folder. Feel free to discuss them with me via email, issue or discussion section of this repository. 😊