simonw / llm

Access large language models from the command-line
https://llm.datasette.io
Apache License 2.0

Replying to an async conversation does not work #632

Closed by simonw 1 week ago

simonw commented 1 week ago
python -m asyncio
>>> import asyncio
>>> import llm
>>> model = llm.get_async_model()
>>> model
<AsyncChat 'gpt-4o-mini'>
>>> c = model.conversation()
>>> c
AsyncConversation(model=<AsyncChat 'gpt-4o-mini'>, id='01jcpccv23391pc8w1j5nh4nnx', name=None, responses=[])
>>> await c.prompt('two jokes about a duck')
<Response prompt='two jokes about a duck' text='Sure! Here are two duck-themed jokes for you:

1. Why did the duck go to the party?
   Because he heard it was going to be a quackin' good time!

2. What do you call a clever duck?
   A wise quacker!'>
>>> r = _
>>> type(r)
<class 'llm.models.AsyncResponse'>
>>> await c.prompt("walrus")
Traceback (most recent call last):
  File "/Users/simon/.pyenv/versions/3.10.4/lib/python3.10/asyncio/__main__.py", line 58, in runcode
    return future.result()
  File "/Users/simon/.pyenv/versions/3.10.4/lib/python3.10/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/Users/simon/.pyenv/versions/3.10.4/lib/python3.10/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "<console>", line 1, in <module>
  File "/Users/simon/Dropbox/Development/llm/llm/models.py", line 392, in _force
    async for _ in self:
  File "/Users/simon/Dropbox/Development/llm/llm/models.py", line 380, in __anext__
    chunk = await self._generator.__anext__()
  File "/Users/simon/Dropbox/Development/llm/llm/default_plugins/openai_models.py", line 495, in execute
    completion = await client.chat.completions.create(
  File "/Users/simon/.local/share/virtualenvs/llm-p4p8CDpq/lib/python3.10/site-packages/openai/resources/chat/completions.py", line 1295, in create
    return await self._post(
  File "/Users/simon/.local/share/virtualenvs/llm-p4p8CDpq/lib/python3.10/site-packages/openai/_base_client.py", line 1826, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
  File "/Users/simon/.local/share/virtualenvs/llm-p4p8CDpq/lib/python3.10/site-packages/openai/_base_client.py", line 1519, in request
    return await self._request(
  File "/Users/simon/.local/share/virtualenvs/llm-p4p8CDpq/lib/python3.10/site-packages/openai/_base_client.py", line 1550, in _request
    request = self._build_request(options)
  File "/Users/simon/.local/share/virtualenvs/llm-p4p8CDpq/lib/python3.10/site-packages/openai/_base_client.py", line 496, in _build_request
    return self._client.build_request(  # pyright: ignore[reportUnknownMemberType]
  File "/Users/simon/.local/share/virtualenvs/llm-p4p8CDpq/lib/python3.10/site-packages/httpx/_client.py", line 358, in build_request
    return Request(
  File "/Users/simon/.local/share/virtualenvs/llm-p4p8CDpq/lib/python3.10/site-packages/httpx/_models.py", line 342, in __init__
    headers, stream = encode_request(
  File "/Users/simon/.local/share/virtualenvs/llm-p4p8CDpq/lib/python3.10/site-packages/httpx/_content.py", line 214, in encode_request
    return encode_json(json)
  File "/Users/simon/.local/share/virtualenvs/llm-p4p8CDpq/lib/python3.10/site-packages/httpx/_content.py", line 177, in encode_json
    body = json_dumps(json).encode("utf-8")
  File "/Users/simon/.pyenv/versions/3.10.4/lib/python3.10/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/Users/simon/.pyenv/versions/3.10.4/lib/python3.10/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/Users/simon/.pyenv/versions/3.10.4/lib/python3.10/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/Users/simon/.pyenv/versions/3.10.4/lib/python3.10/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type coroutine is not JSON serializable
>>> 
simonw commented 1 week ago

Surprisingly, this new test passes:

https://github.com/simonw/llm/blob/157b29ddebe6d17cb72dd64a612adc047a811646/tests/test_async.py#L20-L30

simonw commented 1 week ago

To try it out, paste this into a python -m asyncio shell:

import llm
model = llm.get_async_model("gpt-4o-mini")
c = model.conversation()
print(await c.prompt('two jokes about a duck'))
print(await c.prompt("walrus"))
simonw commented 1 week ago

I dropped into the debugger and found the problem:

> /Users/simon/Dropbox/Development/llm/llm/default_plugins/openai_models.py(495)execute()
-> completion = await client.chat.completions.create(
(Pdb) list
490                 raise NotImplementedError("Model does not support system prompts")
491             messages = self.build_messages(prompt, conversation)
492             kwargs = self.build_kwargs(prompt, stream)
493             client = self.get_client(async_=True)
494             if stream:
495  ->             completion = await client.chat.completions.create(
496                     model=self.model_name or self.model_id,
497                     messages=messages,
498                     stream=True,
499                     **kwargs,
500                 )
(Pdb) messages
[{'role': 'user', 'content': 'two jokes about a duck'}, {'role': 'assistant', 'content': <coroutine object AsyncResponse.text at 0x309e4dd90>}, {'role': 'user', 'content': 'walrus'}]

It's here:

https://github.com/simonw/llm/blob/f90f29dec9ff10c765c6d997adb11372d5b8dfaa/llm/default_plugins/openai_models.py#L352-L355

https://github.com/simonw/llm/blob/f90f29dec9ff10c765c6d997adb11372d5b8dfaa/llm/default_plugins/openai_models.py#L372-L378

I'm calling response.text() without await there to try and reconstruct the messages array.

So that build_messages() method can't be shared between the sync and async worlds, because it needs to be an async def build_messages() method in async land.

But it has a bunch of logic that I'd rather not duplicate.
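
To make that concrete, here is a minimal standalone sketch of the clash (toy classes, not the real llm ones): a shared synchronous build_messages() that calls prev_response.text() gets a str from the sync class, but a coroutine object from the async one, and json.dumps then fails exactly as in the traceback above.

import json

# Toy stand-ins for llm's Response / AsyncResponse, only to illustrate the clash.
class SyncResponseSketch:
    def text(self) -> str:
        return "A wise quacker!"

class AsyncResponseSketch:
    async def text(self) -> str:  # must be awaited to get the str
        return "A wise quacker!"

def build_messages(prev_response):
    # Shared synchronous helper: fine for the sync class...
    return [{"role": "assistant", "content": prev_response.text()}]

json.dumps(build_messages(SyncResponseSketch()))  # works: content is a str

# ...but for the async class, content is an un-awaited coroutine object
# (Python also warns that the coroutine was never awaited):
messages = build_messages(AsyncResponseSketch())
try:
    json.dumps(messages)
except TypeError as e:
    print(e)  # Object of type coroutine is not JSON serializable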

simonw commented 1 week ago

Could we assume that response has been previously awaited (and hence the text made available) by the time it's passed to that method, and raise an error if it has NOT? That might work.

simonw commented 1 week ago
diff --git a/llm/default_plugins/openai_models.py b/llm/default_plugins/openai_models.py
index 82f737c..3d9816e 100644
--- a/llm/default_plugins/openai_models.py
+++ b/llm/default_plugins/openai_models.py
@@ -375,7 +375,7 @@ class _Shared:
                     messages.append(
                         {"role": "user", "content": prev_response.prompt.prompt}
                     )
-                messages.append({"role": "assistant", "content": prev_response.text()})
+                messages.append({"role": "assistant", "content": prev_response.text_or_raise()})
         if prompt.system and prompt.system != current_system:
             messages.append({"role": "system", "content": prompt.system})
         if not prompt.attachments:
diff --git a/llm/models.py b/llm/models.py
index cb9c7ab..7b61411 100644
--- a/llm/models.py
+++ b/llm/models.py
@@ -393,6 +393,11 @@ class AsyncResponse(_BaseResponse):
                 pass
         return self

+    def text_or_raise(self) -> str:
+        if not self._done:
+            raise ValueError("Response not yet awaited")
+        return "".join(self._chunks)
+
     async def text(self) -> str:
         await self._force()
         return "".join(self._chunks)

That does seem to fix it.
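
For completeness, a rough usage sketch of the patched flow, assuming the diff above is applied, pasted into a python -m asyncio shell like the earlier repro. Once the first response has been awaited, its text is available synchronously via text_or_raise(), so build_messages() can rebuild the conversation without putting a coroutine into the JSON payload.

import llm

model = llm.get_async_model("gpt-4o-mini")
c = model.conversation()

r1 = await c.prompt("two jokes about a duck")
# r1 has been awaited, so its chunks are collected and this returns a str:
print(r1.text_or_raise())

# The follow-up that previously raised
# "Object of type coroutine is not JSON serializable" now works:
print(await c.prompt("walrus"))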