Steve235lab opened 1 month ago
This has annoyed me for a long time, since the very first time I used OI, and this could be a not-perfect but working solution. Just pull it and give it a try, and you'll know what I'm talking about.
This is so cool, it's been an issue for a while. Thanks @Steve235lab
Hi @Steve235lab, this is fantastic. I am annoyed by the original behavior as well! But I want to float two other solutions.
Streaming is an important UX component of many modern AI systems, and I think we can fix the issue in two other ways:
`--plain` — a flag that just removes Rich. It would merely `print(chunk, end="")` the chunks as plain text, more like Ollama. It would also work if someone wanted to pipe OI's output into something else. This should fix all problems, unless there's something deeper about the rate of streaming that's bad for SSH! What do you think?
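A minimal sketch of what a `--plain` streaming loop could look like (the function name and chunk source are illustrative, not Open Interpreter's actual API): each chunk is written straight to stdout with no Rich rendering or cursor redraws, so output pipes cleanly and behaves well over SSH.

```python
# Hypothetical --plain mode: print chunks as raw text instead of
# re-rendering a Rich panel on every chunk.
def stream_plain(chunks):
    """Write each chunk to stdout as-is, with no terminal redraws."""
    for chunk in chunks:
        print(chunk, end="", flush=True)  # flush so piped readers see output immediately
    print()  # final newline after the stream ends
```

Because this is just plain text on stdout, `interpreter --plain | tee log.txt` style piping would work without any Rich control sequences in the output.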
This one is great; I will implement it later.
Describe the changes you have made:
Add a new terminal option which lets users configure whether responses are rendered while chunks are being received (classic, default behavior) or rendered once after all chunks have been received (new behavior).
Performing a one-time render after all chunks are received prevents duplicate lines in the terminal, and especially when using OI over SSH it reduces bandwidth usage and flickering.
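The "render once" behavior described above can be sketched roughly like this (names are illustrative, not the PR's actual implementation): chunks are accumulated in a buffer, and the complete response is handed to a single render call after streaming finishes, instead of redrawing the terminal on every chunk.

```python
# Sketch of one-time rendering: buffer all chunks, render once at the end.
def collect_and_render(chunks, render=print):
    """Accumulate chunks silently, then make exactly one render call."""
    buffer = []
    for chunk in chunks:
        buffer.append(chunk)       # no intermediate terminal updates
    full_text = "".join(buffer)
    render(full_text)              # single render: no flicker, minimal SSH traffic
    return full_text
```

In the real codebase `render` would presumably be the existing Rich rendering path; the trade-off is that the user sees nothing until the whole response has arrived.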
Reference any relevant issues (e.g. "Fixes #000"):
Temporarily fixes #1127
Pre-Submission Checklist (optional but appreciated):
docs/CONTRIBUTING.md
docs/ROADMAP.md
OS Tests (optional but appreciated):