jtdaugherty / vty

A high-level ncurses alternative written in Haskell
BSD 3-Clause "New" or "Revised" License
319 stars 57 forks source link

perf: Process chunks of bracketed pastes a linear number of times. #236

Closed iphydf closed 2 years ago

iphydf commented 2 years ago

Fixes #231.

Overall, vty is now less than 2x slower than stty -echo; wc -c.

The current code is O(n^2) because it processes the entire input each time a new chunk arrives. This gets very expensive for large inputs.

This commit changes the code to process each chunk exactly twice: once to see if it contains the end of the bracketed paste, and another time to parse the whole paste and return it as an event.

The code can now process more than 1MB/sec. At this point, my tests are probably more network bound, so these timings might be off, and instead be testing my network throughput. Good enough for me anyway :).

image

Timing:

iphydf commented 2 years ago

I finished my own review. This is ready for review.

I was a bit hesitant on renaming dropInvalid, because it does more now than just dropping invalid sequences (it moves chunks from unprocessed to classifier state), but I kept the name. If you have a better proposal, I can rename it.

iphydf commented 2 years ago

In fact, my timings are likely very far off: my Haskell process is only using 6% CPU for the duration of that paste. Now it's network and terminal throughput bound.

Indeed: the terminal process uses 100% CPU, and the ssh client 20% of another core.

jtdaugherty commented 2 years ago

This looks okay to me. I also tested it in moderate anger with matterhorn and didn't run into any issues. As for dropInvalid, I would like to come up with one that better describes what it is taking care of, but I don't have any ideas right now. I'm going to merge this. Thanks again for your investigations, detailed findings, and patches!

jtdaugherty commented 2 years ago

This change is now released in 5.35.