You may read this example:
examples/fill-in-middle/main.py
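For reference, that example goes through ollama.generate. A minimal sketch of that style of call, assuming a client version that exposes the suffix parameter (the model name and code snippets here are placeholders, not taken from the example file):

import ollama

# Fill-in-the-middle via the generate endpoint: the model completes the
# text between prompt and suffix. Model name and snippets are placeholders.
response = ollama.generate(
    model="codellama:7b-code",
    prompt="def add(a, b):\n",
    suffix="\n    return result\n",  # assumes the client supports suffix
    options={"temperature": 0.0, "num_predict": 128},
)
print(response["response"])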
Thanks, I have seen this example, but it uses generate mode, not chat mode.
For example, I use the following for one of my projects:
import ollama
from ollama import Options

completion = ollama.chat(
    # keep_alive=0,  # optionally unload the model right after the call
    model=config.ollamaDefaultModel,
    messages=[
        *ongoingMessages,
        {
            "role": "user",
            "content": prompt,
        },
    ],
    format="json",
    stream=True,
    options=Options(
        temperature=0.0,  # deterministic sampling
        num_ctx=8192,     # context window size in tokens
        num_predict=-1,   # -1 removes the cap on generated tokens
    ),
)
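On the commented-out keep_alive line: it controls how long the model stays loaded after the request. A minimal sketch with a placeholder model and message (0 unloads immediately, a duration string such as "5m" keeps the model warm, -1 keeps it loaded indefinitely):

import ollama

# keep_alive=0 frees the model right after the call; "5m" keeps it in
# memory for five minutes; -1 keeps it loaded until the server stops.
completion = ollama.chat(
    model="llama3",  # placeholder model name
    messages=[{"role": "user", "content": "Hello"}],
    keep_alive="5m",
)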
and "keep_alive" as well, thanks!
Thanks Wong, I will try. Many thanks.
Are there any examples showing how to set the temperature and output token size in chat mode?