tiny-pilot / tinypilot

Use your Raspberry Pi as a browser-based KVM.
https://tinypilotkvm.com
MIT License
2.99k stars 250 forks source link

Proof of concept: Process pasted keystrokes asynchronously in background thread #1618

Closed jdeanwallace closed 11 months ago

jdeanwallace commented 12 months ago

Related https://github.com/tiny-pilot/tinypilot/issues/1026

This (dirty) PR demonstrates the following:

  1. Processing a large amount of keystrokes in a green thread: https://github.com/tiny-pilot/tinypilot/blob/f6b71015b1642855db946218a56fd1308916343b/app/socket_api.py#L59
  2. Maintaining the order of keystrokes as they are written: https://github.com/tiny-pilot/tinypilot/blob/f6b71015b1642855db946218a56fd1308916343b/app/socket_api.py#L51-L58
  3. Limiting the execution time of each keystroke written by using processes: https://github.com/tiny-pilot/tinypilot/blob/f6b71015b1642855db946218a56fd1308916343b/app/hid/write.py#L87

This PR was trigger by the following comment:

It seems like even if the work is happening in start_background_task, it still locks up the main Flask thread.

It's possible that we could solve this if we use our home-rolled replacement for eventlet.Timeout using multiprocess.Process (like we already do in hid/write.py).

Demo:

https://github.com/tiny-pilot/tinypilot/assets/6730025/c7be4c57-a1f8-4a54-9f35-110c238eb9cc

The demo video shows the following:

  1. With our ping_interval set to 1 second, the PING/PONG messages do still drift slightly beyond 1 second. This might be due to the overhead of starting and stopping processes.
  2. In the terminal, we can see that the execution time of each process is constantly being limited to 0.1s even when the process is performing heavy processing: https://github.com/tiny-pilot/tinypilot/blob/f6b71015b1642855db946218a56fd1308916343b/app/hid/write.py#L65-L68

Review on CodeApprove

jdeanwallace commented 12 months ago
Automated comment from CodeApprove ➜

⏳ @mtlynch please review this Pull Request

mtlynch commented 12 months ago

Cool, this seems promising!

Can we refactor our timeout logic so that it has the same context handler semantics as eventlet.Timeout except ours actually works? We can also take shortcuts to assume the timeout is always going to be some hardcoded value if that makes things easier.

I think we're also realizing from this experience that in a paste, we want to bail on the first failure. If you're pasting 1000 characters, you don't want to wonder if some random characters in the middle got dropped and then TinyPilot happily continued along.

jdeanwallace commented 12 months ago

@mtlynch

Can we refactor our timeout logic so that it has the same context handler semantics as eventlet.Timeout except ours actually works? We can also take shortcuts to assume the timeout is always going to be some hardcoded value if that makes things easier.

Kind of, but I wasn't able to implement a context handler (i.e., with timeout:) because we need a handle on the function to limit its execution time. The way eventlet.Timeout seems to work is by interrupting the coroutine event loop and not the function itself, so I don't think we can use a context handler.

I was able to use a decorator to mark function definitions with a static timeout: https://github.com/tiny-pilot/tinypilot/blob/2fa2169644374f10502150233fd189d4bd25d298/app/hid/write.py#L101-L107


I think we're also realizing from this experience that in a paste, we want to bail on the first failure. If you're pasting 1000 characters, you don't want to wonder if some random characters in the middle got dropped and then TinyPilot happily continued along.

Okay interesting. On master, we are skipping/ignoring unsupported characters like or . Do we also want to bail on these or only on disk write errors?

mtlynch commented 12 months ago

Can we refactor our timeout logic so that it has the same context handler semantics as eventlet.Timeout except ours actually works? We can also take shortcuts to assume the timeout is always going to be some hardcoded value if that makes things easier.

Kind of, but I wasn't able to implement a context handler (i.e., with timeout:) because we need a handle on the function to limit its execution time.

Oh, darn.

I was able to use a decorator to mark function definitions with a static timeout:

I think decorator isn't really worth the trouble. It's more unusual in our codebase, but I don't think it really makes the code clearer. The timeout function you have looks good, but I'd like to move it to a separate file so the timeout logic and the HID logic are cleanly separated.

On master, we are skipping/ignoring unsupported characters like “ or ”. Do we also want to bail on these or only on disk write errors?

Ah, I didn't realize this. We should bail on these too instead of silently skipping them.

jdeanwallace commented 12 months ago

@mtlynch

I think decorator isn't really worth the trouble. It's more unusual in our codebase, but I don't think it really makes the code clearer. The timeout function you have looks good, but I'd like to move it to a separate file so the timeout logic and the HID logic are cleanly separated.

Okay cool. Questions:

  1. Should I continue with this PoC or are we happy with the tradeoffs of this approach?
  2. Do we still want to move our keystroke parsing server-side as part of https://github.com/tiny-pilot/tinypilot/issues/1026?

I don't think we actually need to use any coroutine/eventlet code here and we could probably go straight to normal threads for processing 🤔 https://github.com/tiny-pilot/tinypilot/blob/2fa2169644374f10502150233fd189d4bd25d298/app/socket_api.py#L59

mtlynch commented 12 months ago

Should I continue with this PoC or are we happy with the tradeoffs of this approach?

I think we should move forward with the timeout refactoring (as its own PR). It seems like we need it regardless, and I think it simplifies the work of fixing the paste bug.

Do we still want to move our keystroke parsing server-side as part of #1026?

It's either that or try to queue client-side, right? I think server-side is the better option at this point.