simonw / llm

Access large language models from the command-line
https://llm.datasette.io
Apache License 2.0

Rich markdown formatting (including streaming) in any mode with `--rich` #571

Open gianlucatruda opened 2 months ago

gianlucatruda commented 2 months ago

Overview

Fixes #12

This builds on the excellent foundation that @juftin laid down in #278 and

  1. fixes a mysterious bug that was failing some tests and
  2. resolves merge conflicts caused by changes since #278 was proposed.

I love llm and use it constantly. My only gripe has been the lack of rich formatting in the terminal. I recently used rich for a project and found it excellent, so I was excited to add this to llm. I found #278 was open but dormant, so I decided to nudge things along.

@simonw thanks for your amazing tools and awesome blog!

Screenshots

[Screenshots: SCR-20240912-rpqh, SCR-20240912-rpxr, SCR-20240912-rqws]

gianlucatruda commented 2 months ago

Here's a demo gif of streaming working with rich output: [llm-with-rich-demo]

gianlucatruda commented 2 months ago

Update: I've added pytest tests to make sure that `--rich` mode works as intended. I've also incorporated the 0.16 release commits. All 185 tests pass.
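
For context, the tests follow the usual pattern for click-based CLIs like llm. Here's a hedged sketch with a hypothetical toy command (not the PR's actual test code):

```python
# Hedged sketch: a toy click command standing in for llm's real CLI,
# exercised with click's CliRunner the way a --rich test might be.
import click
from click.testing import CliRunner

@click.command()
@click.option("--rich", "use_rich", is_flag=True, help="Render rich markdown.")
def prompt(use_rich):
    # A real test would patch the model layer; this just echoes the mode.
    click.echo("rich rendering on" if use_rich else "plain text")

def test_rich_flag():
    runner = CliRunner()
    result = runner.invoke(prompt, ["--rich"])
    assert result.exit_code == 0
    assert "rich rendering on" in result.output
```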

@simonw is there anything else this needs in order to be merged? That would allow you to close #12.

irthomasthomas commented 2 months ago

What is the benefit versus piping to something? I like the idea of keeping the main project as light as possible.

[Screenshot: Screenshot_20240916_180124-1]

dzmitry-kankalovich commented 2 months ago

@irthomasthomas likely the difference is in syntax highlighting of a partial / streamed LLM response.

I've been piping llm's output to glow for the past few months, but the drawback is that you only see the result once streaming completes; until then there is no output at all. It's a subpar UX when you have to wait tens of seconds to see anything.

From @gianlucatruda's examples here, it looks like this particular problem has been solved.
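
Roughly, the difference comes down to when the renderer reads its input. A toy sketch of the two behaviours (not either tool's actual code; `upper()` just stands in for rendering):

```python
import sys

def render_after_eof() -> None:
    """glow-style: needs the whole document before it can lay anything out."""
    text = sys.stdin.read()         # blocks until the pipe closes (EOF),
    sys.stdout.write(text.upper())  # so nothing appears while the LLM streams

def render_incrementally() -> None:
    """highlight-style: transform each line as soon as it arrives."""
    for line in sys.stdin:          # yields line by line while the pipe is open
        sys.stdout.write(line.upper())
        sys.stdout.flush()          # show output immediately, mid-stream
```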

gianlucatruda commented 2 months ago

> I like the idea of keeping the main project as light as possible.

I would normally agree, @irthomasthomas. But as @dzmitry-kankalovich correctly points out, piping breaks streaming, which is a major usability drawback. I think this justifies the choice.

irthomasthomas commented 2 months ago

Are you talking about the ansi codes being injected?

gianlucatruda commented 2 months ago

> Are you talking about the ansi codes being injected?

@irthomasthomas When you pipe the output of llm to another application that renders markdown (which may involve injecting ANSI escape codes), you have to wait for the entire LLM response, which could take several seconds or even minutes. And in chat mode it's not possible at all. So it's not a viable solution.

This PR enables llm to do the rich text rendering itself, in a way that supports response streaming. That means the user sees the output in real time, rendered prettily, as it arrives from the LLM. It also allows this rich text streaming to work in chat mode (as seen in my screenshots).

Overall, this PR adds functionality to llm that is not possible when piping to other tools. It's a massive upgrade to the user experience and something that has been requested by many people for a long time.
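
For anyone curious about the general approach, here's a minimal sketch of streaming markdown rendering with rich's `Live` and `Markdown` (illustrative only, not the PR's exact code): accumulate the response and re-render the whole buffer on each chunk.

```python
from rich.console import Console
from rich.live import Live
from rich.markdown import Markdown

def stream_markdown(chunks):
    """Render an iterable of text chunks as live-updating markdown."""
    console = Console()
    buffer = ""
    with Live(console=console, refresh_per_second=8) as live:
        for chunk in chunks:
            buffer += chunk                # accumulate the partial response
            live.update(Markdown(buffer))  # re-parse and re-render it all

if __name__ == "__main__":
    import time

    def fake_llm_stream():
        # stand-in for tokens arriving from a model
        for part in ["# Streaming demo", "\n\n- first ", "bullet\n", "- second bullet\n"]:
            time.sleep(0.3)
            yield part

    stream_markdown(fake_llm_stream())
```

Because the full buffer is re-parsed on every update, constructs that are temporarily incomplete mid-stream (an unclosed code fence, say) render correctly as soon as their closing tokens arrive.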

irthomasthomas commented 2 months ago

That's not true. In the example I gave, I'm using highlight, and it displays the rendered markdown as it streams in.

dzmitry-kankalovich commented 2 months ago

@irthomasthomas I just checked, and indeed highlight, unlike glow, does process streamed responses.

However, it... does not render markdown?

It just highlights markdown blocks (hence the name, I guess), but it does not render them, at least not like glow or rich do.

It is somewhat better than plain text, but it doesn't feel as convenient as those alternatives.
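
The distinction is easy to demonstrate with rich's own primitives (a quick illustrative snippet, not code from this PR): `Syntax` colorizes the raw markdown source, while `Markdown` actually lays the document out.

```python
from rich.console import Console
from rich.markdown import Markdown
from rich.syntax import Syntax

text = "# Title\n\n- a *bullet* with `code`\n"
console = Console()

# highlight-style: colorize the raw markdown source, asterisks and all
console.print(Syntax(text, "markdown"))

# glow/rich-style: lay the document out (heading, bullet, italics)
console.print(Markdown(text))
```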

gianlucatruda commented 2 months ago

> That's not true. In the example I gave, I'm using highlight, and it displays the rendered markdown as it streams in.

@irthomasthomas

  1. Can you link to the source for installing highlight? If it's cli-highlight, I'm unable to replicate the streaming you claim.
  2. Your screenshots of piping to highlight show syntax highlighting, not the markdown rendering that #12 asks for and this PR provides.
  3. Can you provide an example demonstrating that streaming works with piping, in both normal and chat modes?

gianlucatruda commented 2 months ago

@simonw let me know if you have any feedback on this PR. Happy to make any changes necessary.