Made drawing a couple of times faster

jtdaugherty / vty

A high-level ncurses alternative written in Haskell

BSD 3-Clause "New" or "Revised" License

319 stars 57 forks source link

Made drawing a couple of times faster #244

Closed ilyakooo0 closed 1 year ago

ilyakooo0 commented 2 years ago

Stopped copying vectors around.

I don't have exact benchmarks, but after this PR clipText takes up half of the time spent on drawing, and clipText itself is not about three times faster.

jtdaugherty commented 2 years ago

Thanks for your patch!

I'm curious how you know that this is faster if you don't have benchmarks.

ilyakooo0 commented 2 years ago

I had a specific application that I was trying to optimise. I have flamegraphs of the execution of the application.

They are kind of meaningless on their own.

I did a couple of iterations:

Find a function that takes up a large amount of the flamegraph
Optimise that part
Look at whether the optimisation significantly reduced the portion of the flamegraph that the function occupies

ilyakooo0 commented 2 years ago

Plus, I just observed the subjective performance of using the app.

(I am constructing a large Image and scrolling it across the screen. 95% of time is spent drawing the Image. It went from ~1FPS to ~15FPS)

ilyakooo0 commented 2 years ago

On second though, the flamegraphs might be useful if you look at something that I didn't optimise, since everything is scaled proportionately.

(These first two images are interactive SVGs, so you can download them and explore in more detail)

Before

pre

After

prof-2

If we look at the writeAttrSet call, which I didn't touch, then we can see how many times bigger it is in the "after" compared to the "before". (and the blaze.<> operator, though I am not sure where it is coming from)

Before

Screenshot 2022-05-03 at 18 40 00

After

Screenshot 2022-05-03 at 18 38 48

jtdaugherty commented 2 years ago

Thanks! I haven't seen one of these graphs before, so I'm not sure how to read it. Is the larger writeAttrSet an indication that something else took up less time? What is the X axis? (Also, the first two don't seem to be interactive for me.)

ilyakooo0 commented 2 years ago

the larger writeAttrSet an indication that something else took up less time? What is the X axis?

The x axis is roughly the total execution time. And the horizontal bars show what portion of the execution time each function takes up. You can read more info in the "Summary" section here: https://www.brendangregg.com/flamegraphs.html

So, yes, if an unchanged function takes up more proportional time, it means that the other functions take up less time.

Also, the first two don't seem to be interactive for me.

It doesn't work in-browser for me either. When I download it and open in Gapplin, I can interact with it. I am not familiar with SVG implementation standards, so I don't really know where else you can open it for it to be interactive (:

jtdaugherty commented 2 years ago

This set of patches had fallen off of my radar so I started doing a bit of testing to refresh my memory. I tried this PR with tart and the behavior I see when trying to run the tool is:

tart: Data.Vector.Mutable: uninitialised element. If you are trying to compact a vector, use the 'Data.Vector.force' function to remove uninitialised elements from the underlying array.
CallStack (from HasCallStack):
  error, called at src/Data/Vector/Mutable.hs:215:17 in vctr-0.13.0.0-e19965ca:Data.Vector.Mutable

Interestingly, I do not see that exception on startup with matterhorn.

jtdaugherty commented 2 years ago

It's possible that exception is actually due to tart's own use of mutable vectors. I'll investigate and see whether that's the case.

jtdaugherty commented 2 years ago

Oh, I'm wrong - tart uses mutable arrays, not vectors.

ilyakooo0 commented 2 years ago

It is very much possible that I missed some corner case.

It is now evident that merging this would require a comprehensive test suite.

Unfortunately, I no longer have time to invest into this PR ;(

Maybe we could pull out some obviously correct optimisations and merge only those?

In particular, I would suggest merging the small clipText optimisation.