Closed: andydotxyz closed this 1 month ago
The code looks good, just one inline comment. Also, is there any chance that you could add some benchmarks to more easily verify the performance improvements? I mean, yes, I do feel that it is faster, but with performance improvements you really need solid benchmarks to base it on. Not only to make it easier to see that things got faster in this case, but also to see improvements in upcoming changes and to make sure that we don't regress going forward.
I don't know how to write benchmarks that go through the OpenGL texture cache usage... Won't we need a window to appear and run various redraw scenarios before closing again and reporting back?
I don't necessarily see a problem with opening a window and reporting back. We only run benchmarks locally (CI fluctuates too much because it runs in containers) and it would be very helpful for all the reasons stated above.
I'm more or less of the opinion that optimisations without benchmarks aren't justified. From what I've seen of the upstream Go compiler development, they basically won't accept alleged performance improvements without a benchmark (aside from trivial changes that are as much code cleanup and refactoring as anything else).
Taking the time to write some proper, reliable benchmarking infrastructure really should be step one in my opinion, and it will pay off in the long run. The system monitor just isn't reliable enough.
To be clear: I'm not necessarily saying that I will block this due to missing a benchmark but I think one should be added if possible.
As a next step, we really should take the time to implement better benchmarking infrastructure for this area if it isn't possible to do at the moment and then change our review policy to require benchmarks for any significant changes.
I agree that benchmarking should be improved to make this sort of thing easier, but what is the point of benchmarks that are off by default? If we want them to verify that the speed improvements stay in place, then they should at least run on some sort of nightly CI job, or they may never be run again.
In this instance I had hoped that, being so clearly 2x faster but in an area where benchmarking has not been established yet, we might slip by and move on to the next step. I can look into how the CPU usage during rendering can be captured, but given we draw at a specific frame rate this is not the sort of "tight loop" that makes for easy benchmarking. We would have to compare system load and usage under some sort of simulated runtime, I would think...?
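One possible way to capture CPU usage over a simulated run, if a tight-loop benchmark doesn't fit, would be a CPU profile recorded while ticking a frame loop for a fixed duration. A minimal sketch of that idea (the `renderFrame` function is a stand-in, not real toolkit code):

```go
package main

import (
	"log"
	"os"
	"runtime/pprof"
	"time"
)

// renderFrame is a placeholder for whatever per-frame work we want to measure.
func renderFrame() {
	time.Sleep(time.Millisecond) // pretend to draw something
}

func main() {
	f, err := os.Create("render.pprof")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Profile a fixed-length simulated run at roughly 60 frames per second.
	if err := pprof.StartCPUProfile(f); err != nil {
		log.Fatal(err)
	}
	defer pprof.StopCPUProfile()

	ticker := time.NewTicker(time.Second / 60)
	defer ticker.Stop()
	deadline := time.After(10 * time.Second)
	for {
		select {
		case <-ticker.C:
			renderFrame()
		case <-deadline:
			return
		}
	}
}
```

The resulting profiles from the old and new code could then be compared with `go tool pprof`, which sidesteps the frame-rate cap since we compare time spent per frame rather than frames per second.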
I agree that benchmarking should be improved to make this sort of thing easier, but what is the point of benchmarks that are off by default?
Manual benchmarks are most useful when you are working on improving performance in an area like this. They can be used, for example, when opening a PR like this to compare your changes against the previous state. When you have implemented a further improvement, you can run the benchmarks again to verify that it is indeed faster and see by how much. The people reviewing can also run them locally if they want and see whether they get the same gains. With benchmark data about allocations you can also quite easily tweak the code until you get fewer allocations. It basically makes the process of improving code performance faster as some of the measuring is done automatically for you.
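As a concrete illustration of that workflow, a standard Go benchmark with allocation reporting looks roughly like this (the benchmark name and the `fakeLookup` function are hypothetical placeholders, not the actual Fyne cache code):

```go
package cache_test

import (
	"fmt"
	"testing"
)

// fakeLookup stands in for whatever cache operation is being measured;
// it is a placeholder, not the real cache implementation.
func fakeLookup(key string) int {
	return len(key)
}

// BenchmarkTextCacheLookup shows the shape of the benchmarks being
// discussed: run it before and after a change and compare the results.
func BenchmarkTextCacheLookup(b *testing.B) {
	keys := make([]string, 100)
	for i := range keys {
		keys[i] = fmt.Sprintf("item-%d", i)
	}

	b.ReportAllocs() // include allocations per op in the output
	b.ResetTimer()   // exclude the setup above from the timing

	for i := 0; i < b.N; i++ {
		fakeLookup(keys[i%len(keys)])
	}
}
```

Running `go test -bench=TextCacheLookup -benchmem -count=10` on the old and new code and feeding both outputs to `benchstat` gives a statistically grounded before/after comparison, including allocations per operation.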
However, I do agree that it would be useful to have some sort of automatic benchmarking online but we might need a better CI infrastructure set up if we want consistency. What we had previously was causing headaches as it failed randomly because of being too slow on CI. The best option is probably to have some sort of listing that tracks performance per commit over time so you can see general trends. I think the Go compiler and some other projects have something similar.
In this instance I had hoped that, being so clearly 2x faster but in an area where benchmarking has not been established yet, we might slip by and move on to the next step.
While perhaps not super related to this given the large shown performance improvement, my biggest problem with missing benchmarks is that it can be very hard to look at the code and have it be obvious that it actually improves performance. We may be blinded by the fact that we think the changes should improve performance, when what we actually feel is a placebo effect. In this case, I really don't understand the code well enough to see why it improves performance, but that might partly just be because I haven't worked with the cache code that much.
Going by system-monitor data (which fluctuates a lot) and by feel, there certainly is a big improvement here, and I don't deny that. I just feel like we have historically been neglecting proper benchmarking throughout the toolkit for a long time now. I strongly believe that this and many other performance improvements that are needed in the toolkit would have been much easier to implement with better tooling.
Thanks, agreed. I will try to sit down and understand how to improve benchmarking soon.
It's such a big challenge that I am working on this in stages.
First: Move text cache index to the text details not object reference
Using the Terminal app as a benchmark (TextGrid gets the biggest boost with this change) when running "top" fullscreen:
The next improvements will build on this, so other text widgets will see bigger gains. A sketch of the general idea follows below.
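For illustration only, here is a minimal sketch of keying a cache by the text's details rather than by object reference; the types and fields are hypothetical, not the actual Fyne cache structures:

```go
package main

import "fmt"

// textKey captures the details that determine a rendered texture, so two
// objects drawing identical text can share one cache entry. The fields are
// illustrative only.
type textKey struct {
	content  string
	sizePx   int
	boldFace bool
}

// texture is a stand-in for a cached GPU texture handle.
type texture uint32

var textCache = map[textKey]texture{}

// cachedTexture returns the shared texture for the given text details,
// creating it on a miss. Keying by the canvas object pointer instead would
// duplicate identical renders across objects.
func cachedTexture(k textKey) texture {
	if tex, ok := textCache[k]; ok {
		return tex
	}
	tex := texture(len(textCache) + 1) // placeholder for a real texture upload
	textCache[k] = tex
	return tex
}

func main() {
	a := cachedTexture(textKey{"top", 14, false})
	b := cachedTexture(textKey{"top", 14, false})
	fmt.Println(a == b) // true: identical text details hit the same entry
}
```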
Checklist: