jasonjmcghee / plock

From anywhere you can type, query and stream the output of an LLM or any other script
MIT License

streaming is not working #29

Open Mia-Zhang-Forever opened 2 months ago

Mia-Zhang-Forever commented 2 months ago

I have set up a command like this:

    {
      "command": [
        "bash",
        "/Users/Mia/Desktop/src/test/greet.sh"
      ]
    }

the greet.sh is:

#!/bin/bash
/opt/homebrew/bin/python3 /Users/Mia/Desktop/src/test/greet.py

the greet.py is:

import random
import time

for i in range(5):
    # Generate a random line
    random_line = ''.join(random.choice('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!@#$%^&*()') for _ in range(random.randint(5, 20)))

    print(f"Random line {i + 1}: {random_line}")
    time.sleep(2)

When I invoke this shortcut (Ctrl+Cmd+G), it waits the full 10 seconds before returning any result. How can I make it respond as soon as each line is printed?

jasonjmcghee commented 2 months ago

I believe the problem is with the python aspect of this.

I think one of two things would fix this:

Add "PYTHONUNBUFFERED": "1" to environment in settings.json

Or

Add flush=True as an argument to print
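For illustration, a minimal sketch of the second suggestion applied to a script like greet.py (the function name and delay are placeholders, not from the project): flush=True pushes each line through the pipe immediately instead of waiting for Python's block buffer to fill.

```python
import time

def stream_greetings(n=5, delay=0.0):
    # flush=True defeats Python's block buffering when stdout is a
    # pipe, so each line reaches the reader as soon as it is printed.
    for i in range(n):
        print(f"Random line {i + 1}", flush=True)
        time.sleep(delay)
```

Setting PYTHONUNBUFFERED=1 in the environment has the same effect without editing the script.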

Mia-Zhang-Forever commented 2 months ago

> I believe the problem is with the python aspect of this.
>
> I think one of two things would fix this:
>
> Add "PYTHONUNBUFFERED": "1" to environment in settings.json
>
> Or
>
> Add flush=True as an argument to print

I actually tried disabling the IO buffer before creating this issue. I watched the video you posted and even attempted to use the gpt.sh script, but it doesn't work for me, so I started to debug a bit.

I added some logs: there are only two read events in generate, and we still need to wait 6 seconds for the rest of the content.

Mia-Zhang-Forever commented 2 months ago

First of all, thank you for this incredible project. If it's not currently a high priority for you, I'm willing to delve deeper into the issue myself. It would be immensely helpful if you could provide me with some guidance or pointers.

jasonjmcghee commented 2 months ago

I think one of the key issues is that it's filling a buffer rather than greedily outputting whenever it sees a newline, which could help here
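As a rough Python sketch of that idea (illustration only, not the project's Rust code): reading the child's stdout line by line forwards output as soon as each newline arrives, rather than waiting for a fixed-size buffer to fill.

```python
import subprocess
import sys

def stream_lines(cmd):
    # Spawn the child and handle each complete line as soon as its
    # newline arrives, instead of blocking until a buffer fills.
    lines = []
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            lines.append(line.rstrip("\n"))
    return lines
```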

jasonjmcghee commented 2 months ago

This is where it's yielded to stdout

https://github.com/jasonjmcghee/plock/blob/210e3e274a18c0653446df454fd6901beeac82ca/src-tauri/src/generator.rs#L105

Any help would be much appreciated!

Mia-Zhang-Forever commented 2 months ago

This is actually blocked by the error handling: the stderr read blocks before the stdout read can proceed. We should definitely use tokio::select! to watch both streams.

                    err_buffer.clear();
                    let mut err_buf = [0; 1024]; // Temporary buffer for each read
                    if let Ok(size) = std_err_reader.read(&mut err_buf).await {
                        err_buffer.extend_from_slice(&err_buf[..size]);
                        yield String::from_utf8_lossy(&err_buffer).to_string();
                    } else {
                        should_break = true;
                    }

The above code snippet will block the current thread. When the script finishes, it might emit an empty event, which will trigger the loop again. I've changed the code like this:

                    tokio::select! {
                        result = reader.read(&mut temp_buf) => {
                            match result {
                                Ok(0) => { should_break = true }, // EOF reached
                                Ok(size) => {
                                    buffer.extend_from_slice(&temp_buf[..size]);
                                    yield String::from_utf8_lossy(&buffer).to_string();
                                },
                                Err(e) => {
                                    eprintln!("Error reading from stdout: {}", e);
                                    break;
                                }
                            }
                        },

                        result = std_err_reader.read(&mut err_buf) => {
                            match result {
                                Ok(0) => { should_break = true }, // EOF reached
                                Ok(size) => {
                                    err_buffer.extend_from_slice(&err_buf[..size]);
                                    yield String::from_utf8_lossy(&err_buffer).to_string();
                                }
                                Err(e) => {
                                    eprintln!("Error reading from stderr: {}", e);
                                    break;
                                }
                            }
                        },
                    }
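A hedged Python/asyncio analog of the select! approach above (illustration only, with placeholder names): multiplex stdout and stderr with independent readers so a pending read on one stream never stalls the other, and stop each reader at EOF.

```python
import asyncio
import sys

async def stream_process(cmd):
    # Read stdout and stderr concurrently; a pending read on one
    # stream never blocks the other (the role tokio::select! plays
    # in the Rust snippet above).
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    chunks = []

    async def pump(reader, tag):
        while True:
            data = await reader.read(1024)
            if not data:  # EOF: stop this pump, mirroring Ok(0) above
                break
            chunks.append((tag, data.decode()))

    await asyncio.gather(pump(proc.stdout, "out"), pump(proc.stderr, "err"))
    await proc.wait()
    return chunks
```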

Another modification: delta_buffer must reach a minimum length before it is flushed as output:

                        let delta_output = {
                            if delta_buffer.len() > 4 {
                                let s = delta_buffer.clone().join("");
                                delta_buffer.clear();
                                s
                            } else {
                                "".to_string()
                            }
                        };
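That threshold can be sketched in Python like so (assumption: mirroring the Rust snippet, the buffer is only joined and cleared once it holds more than four parts; the function name is a placeholder):

```python
def drain_delta(delta_buffer, min_parts=5):
    # Only flush once the buffer holds at least `min_parts` chunks;
    # smaller accumulations return an empty string and are kept.
    if len(delta_buffer) >= min_parts:
        s = "".join(delta_buffer)
        delta_buffer.clear()
        return s
    return ""
```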
jasonjmcghee commented 1 month ago

If / when you get things to a state you believe is working properly, please do open a PR. I will gladly review and test.

I believe some bugs were introduced in 0.13 (as you point out with the error handling). Streaming was originally built for LLM streaming, which appeared to work properly in 0.12. But I definitely agree that it should behave the way the terminal does for newlines as well.

As for the delta output: I believe that was to play well with input simulation. I regret not adding more documentation.