uogbuji closed this issue 1 year ago
Their docs are a bit ragged, but I did some initial experimentation and poking around in their code. Here is a sample, working session using my local LLM via the latest openai API.
```python
from openai import OpenAI

# Point the client at the local server; the API key is ignored there
client = OpenAI(api_key='dummy', base_url='http://127.0.0.1:8000/v1/')

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Say this is a test",
        }
    ],
    # Model name is just a placeholder; llama.cpp (or whatever is serving) uses whatever model it has mounted
    model="dummy",
)
print(chat_completion.choices[0].message.content)
```
That prints just the first choice's message text (what you used to have to dig out of the returned JSON structure).
It's great that they finally have an encapsulated session object (openai.OpenAI) rather than the very yucky globals approach they used to follow.
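For contrast, this is roughly what the same call looked like against the pre-1.0 client, with module-level globals for configuration and dict-style indexing into the response. This is a sketch from memory, not code from this repo:

```python
# Pre-1.0 style, for comparison (requires openai<1.0)
import openai

# Configuration lived in module-level globals
openai.api_key = "dummy"
openai.api_base = "http://127.0.0.1:8000/v1/"

response = openai.ChatCompletion.create(
    model="dummy",
    messages=[{"role": "user", "content": "Say this is a test"}],
)
# The response was a JSON-ish object you indexed into by hand
print(response["choices"][0]["message"]["content"])
```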
It turns out there is a migration guide from upstream. I missed it this morning.
The migration guide is a helpful touch, to be fair. Unfortunately it says nothing about the thread lock objects they've buried somewhere, which means that even though they no longer use globals for everything, their main objects can't be pickled for multiprocessing. I've had to create the OpenAI objects just in time, rather than passing them across processes, to preserve our multiprocessing support.
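The workaround looks something like the sketch below (the worker function and local base URL are illustrative, not code from this repo): the client is constructed inside the worker call, so nothing holding a thread lock ever has to be pickled across the process boundary.

```python
from multiprocessing import Pool

from openai import OpenAI

# Assumed local llama.cpp-style server, as in the session above
LOCAL_BASE_URL = "http://127.0.0.1:8000/v1/"

def complete_one(prompt):
    # Create the client just in time, inside the worker process,
    # so the parent never has to pickle an OpenAI object
    client = OpenAI(api_key="dummy", base_url=LOCAL_BASE_URL)
    resp = client.chat.completions.create(
        messages=[{"role": "user", "content": prompt}],
        model="dummy",
    )
    return resp.choices[0].message.content

if __name__ == "__main__":
    prompts = ["Say this is a test", "Say this is another test"]
    with Pool(2) as pool:
        print(pool.map(complete_one, prompts))
```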
OK, that's all the demos updated, and the sweep included fixes for some older bugs as well. 🎉
A massive PR landed in openai which changed the API (really the Python client). They now seem to have labeled it the "V1" API, weirdly. It was released on OpenAI dev day as 1.1.0, so this will break for any new installs.
API docs are updated.