simonw opened this issue 1 year ago
The model can be selected once at the top of the page, so the select box should be separate from the textarea and button.
I mocked up some initial HTML and CSS for this a while ago:
Code here:
The moment you start a new conversation and get assigned a conversation ID, the URL will update to `/-/llm/01h7p0846m5hqbsp063zrne6ec` - hitting that URL directly will load that prior conversation.
In terms of the API... I think I'll set up a Server-Sent Events endpoint for each conversation at `/-/llm/api/01h7p0846m5hqbsp063zrne6ec`, which the page can subscribe to for a stream of LLM tokens.

To send a message, JavaScript will send a POST to that same endpoint. To keep things simple, I'll have the user-submitted text only show up after it has been reflected in the SSE stream.
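Roughly, the routing for those URLs could be wired up with Datasette's `register_routes()` plugin hook. This is just a sketch with placeholder view functions (`llm_index`, `llm_conversation`, `llm_conversation_api` are hypothetical names, not the real implementation):

```python
from datasette import hookimpl
from datasette.utils.asgi import Response


async def llm_index(request):
    # The main page: textarea, button and model select
    return Response.html("<h1>LLM</h1>")


async def llm_conversation(request):
    # Load a prior conversation by its ID from the URL
    conversation_id = request.url_vars["conversation_id"]
    return Response.html(f"<h1>Conversation {conversation_id}</h1>")


async def llm_conversation_api(request):
    # GET: SSE stream of tokens for this conversation
    # POST: submit a new prompt to it
    return Response.text("Not implemented yet", status=501)


@hookimpl
def register_routes():
    # Hypothetical URL layout matching the description above
    return [
        (r"^/-/llm$", llm_index),
        (r"^/-/llm/(?P<conversation_id>[0-9a-z]+)$", llm_conversation),
        (r"^/-/llm/api/(?P<conversation_id>[0-9a-z]+)$", llm_conversation_api),
    ]
```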
I'm going to imitate the ChatGPT trick of setting the name of the conversation automatically after the first few messages, and letting the user edit it if they like.
Maybe the edit option will show some other suggestions? Might be neat.
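Something like this sketch could produce the suggested name - it assumes LLM's Python API (`llm.get_model()` / `model.prompt()`) and an arbitrary model; `suggest_conversation_name` is a hypothetical helper, not part of the plugin yet:

```python
import llm


def suggest_conversation_name(first_messages, model_id="gpt-3.5-turbo"):
    # Ask a model to name the conversation based on its first few messages,
    # the way ChatGPT does - the user can still edit the result
    model = llm.get_model(model_id)
    response = model.prompt(
        "Suggest a short title (five words max) for this conversation:\n\n"
        + "\n".join(first_messages)
    )
    return response.text().strip()
```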
Also added one of my pelican photos as an initial avatar image:
The trickiest part of this implementation is going to be bridging the `for token in generate(...)` mechanism from LLM with the need to serve these things `async` in Datasette.

I'm going to try running the LLM generation in a thread and communicating back up to the `async` code via some kind of queue.
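Here's roughly the shape I have in mind - a minimal sketch (not the prototype linked below) where a worker thread pulls tokens from a blocking generator and pushes them onto an `asyncio.Queue` for the event loop to consume; `generate()` stands in for LLM's blocking token iterator:

```python
import asyncio
import threading


async def stream_tokens(generate, prompt):
    # Bridge a blocking token generator to async code via a queue
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()
    done = object()  # sentinel marking the end of the stream

    def worker():
        # Runs in a thread: hand each token to the event loop thread-safely
        try:
            for token in generate(prompt):
                loop.call_soon_threadsafe(queue.put_nowait, token)
        finally:
            loop.call_soon_threadsafe(queue.put_nowait, done)

    threading.Thread(target=worker, daemon=True).start()

    while True:
        token = await queue.get()
        if token is done:
            break
        yield token
```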
I got a prototype of that working in https://github.com/simonw/llm/commit/14efa506f57043ef6f34f4194dd427017a0d251e
Full code here: https://github.com/simonw/llm/blob/web/llm/web/datasette-plugins/llm_views.py
My previous prototype used WebSockets, but I want to try Server-Sent Events this time.
Format is described here: https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#event_stream_format
Here's a useful tutorial showing how to send them using ASGI: https://rob-blackbourn.github.io/bareASGI/4.2/tutorial/server-sent-events/
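For reference, the wire format is just `data:` lines separated by blank lines. Sent from a bare ASGI app it looks something like this sketch (hard-coded tokens, not the real endpoint):

```python
async def sse_demo(scope, receive, send):
    # Minimal ASGI illustration of the Server-Sent Events format
    assert scope["type"] == "http"
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [
            (b"content-type", b"text/event-stream"),
            (b"cache-control", b"no-cache"),
        ],
    })
    for token in ("Hello", " ", "world"):
        # Each event is a "data: ..." line followed by a blank line
        await send({
            "type": "http.response.body",
            "body": "data: {}\n\n".format(token).encode("utf-8"),
            "more_body": True,
        })
    await send({"type": "http.response.body", "body": b"", "more_body": False})
```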
One more endpoint: `POST /-/llm/start` to start a new conversation with a prompt and a selected model (and maybe an optional system prompt too) - it creates a new conversation with a `ulid` and then redirects you to `/-/llm/ULID`.
For that bit, I'm going to create a record in a special table (separate from the existing `conversations` table) which records the start prompt, model, system prompt AND the `actor_id` of the user who started it. I'll use this for any other extra metadata needed by this feature beyond what's in the existing `conversations` and `responses` tables too.

I'm going to call that `initiated` and it will look like this:
```sql
create table initiated (
  id text primary key,
  model_id text,
  prompt text,
  system text,
  actor_id text,
  datetime_utc text
)
```
I may add an `options` JSON column later if it seems useful.
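Putting the `/-/llm/start` endpoint and the `initiated` table together, the handler could look roughly like this. It's a sketch only: it assumes the logs database is registered as `llm`, and uses a UUID where the real code would use a ULID:

```python
import time
import uuid

from datasette.utils.asgi import Response


async def llm_start(request, datasette):
    # POST /-/llm/start: record the new conversation, then redirect to it
    if request.method != "POST":
        return Response.text("Method not allowed", status=405)
    form = await request.post_vars()
    conversation_id = uuid.uuid4().hex  # stand-in for a ULID
    db = datasette.get_database("llm")  # assumed database name
    await db.execute_write(
        """
        insert into initiated (id, model_id, prompt, system, actor_id, datetime_utc)
        values (?, ?, ?, ?, ?, ?)
        """,
        [
            conversation_id,
            form.get("model"),
            form.get("prompt"),
            form.get("system"),
            (request.actor or {}).get("id"),
            time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        ],
    )
    return Response.redirect(f"/-/llm/{conversation_id}")
```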
It's getting useful now!
The pattern where I use `llm` like this:

```bash
cat caddy-release-notes.txt | llm -s 'What does Caddy do with ETags? Give issue citations'
```
Results in log pages with huge unreadable chunks of text like this:
I think I'm going to say that if there's a `system` prompt AND the initial prompt is longer than a certain number of characters, it gets truncated with a "view all" link.
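The display rule could be as simple as this sketch (the threshold is arbitrary and `display_prompt` is a hypothetical helper):

```python
TRUNCATE_AT = 500  # arbitrary cut-off for the sketch


def display_prompt(prompt, system):
    # Only truncate when a system prompt is present and the user prompt is
    # long; returns the text to show plus whether to render a "view all" link
    if system and len(prompt) > TRUNCATE_AT:
        return prompt[:TRUNCATE_AT] + "…", True
    return prompt, False
```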
I experimented with rendering the prompts as markdown too, but it had some weird effects:
Latest change fixes `<ol>` display:
I changed my mind about Server-Sent Events - I'm going to switch back to WebSockets (as seen in my earlier prototype).
I was thinking through the challenge of how to ensure that JavaScript could POST a new prompt to one endpoint and have the SSE endpoint then start serving tokens for that response, and decided that the mechanism to co-ordinate between those two was a bit too messy.
I already have WebSocket code that works from the prototype, I'm going to try that first.
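For comparison, the single-connection flow that makes this simpler looks roughly like this at the raw ASGI level - prompt in, tokens out on the same socket, so there's nothing to coordinate between two endpoints (hard-coded tokens stand in for the LLM stream):

```python
async def websocket_chat(scope, receive, send):
    # Bare ASGI WebSocket sketch: accept the connection, then answer each
    # incoming prompt with a stream of outgoing token messages
    assert scope["type"] == "websocket"
    message = await receive()
    assert message["type"] == "websocket.connect"
    await send({"type": "websocket.accept"})
    while True:
        message = await receive()
        if message["type"] == "websocket.disconnect":
            break
        prompt = message.get("text") or ""
        for token in ("You", " said: ", prompt):
            await send({"type": "websocket.send", "text": token})
```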
This will live at `/-/llm` - it will start off as a textarea, a button and a select menu for picking the model. It will execute the prompt, stream the response to the browser and log the result to the SQLite `llm` database (actually `logs.db` somewhere).