Closed: andydotxyz closed this 1 month ago
The code looks good, just one inline comment. Also, is there any chance that you could add some benchmarks to more easily verify the performance improvements? I mean, yes, I do feel that it is faster, but with performance improvements you really need solid benchmarks to base it on. Not only to make it easier to see that things got faster in this case, but also to see improvements in upcoming changes and to make sure that we don't regress going forward.
I don't know how to write benchmarks that go through the OpenGL texture cache usage... Won't we need a window to appear and run various redraw scenarios before closing again and reporting back?
I don't necessarily see a problem with opening a window and reporting back. We only run benchmarks locally (CI fluctuates too much because it runs in containers) and it would be very helpful for all the reasons stated above.
I'm more or less of the opinion that optimisations without benchmarks aren't justified. From what I've seen of the upstream Go compiler development, they basically won't accept alleged performance improvements without a benchmark (aside from trivial changes that are as much code cleanup and refactoring as anything else).
Taking the time to write some proper, reliable benchmarking infrastructure really should be step one in my opinion, and it will pay off in the long run. The system monitor just isn't reliable enough.
To be clear: I'm not necessarily saying that I will block this due to missing a benchmark but I think one should be added if possible.
As a next step, we really should take the time to implement better benchmarking infrastructure for this area if it isn't possible to do at the moment and then change our review policy to require benchmarks for any significant changes.
I agree that benchmarking should be improved to make this sort of thing easier, but what is the point of benchmarks that are off by default? If we want them to verify that the speed improvements stay in place, then they should at least run on some sort of nightly CI job, or they may never be run again.
In this instance I had hoped that, being so clearly 2x faster but in an area where benchmarking has not been established yet, we might slip by and move on to the next step. I can look into how the CPU usage during rendering can be captured, but given we draw at a specific frame rate this is not the sort of "tight loop" that makes for easy benchmarking. We would have to compare system load and usage under some sort of simulated runtime, I would think...?
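One possible way to capture CPU usage over a simulated run, if a tight-loop benchmark doesn't fit, would be a CPU profile recorded while ticking a frame loop for a fixed duration. A minimal sketch of that idea (the `renderFrame` function is a stand-in, not real toolkit code):

```go
package main

import (
	"log"
	"os"
	"runtime/pprof"
	"time"
)

// renderFrame is a placeholder for whatever per-frame work we want to measure.
func renderFrame() {
	time.Sleep(time.Millisecond) // pretend to draw something
}

func main() {
	f, err := os.Create("render.pprof")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Profile a fixed-length simulated run at roughly 60 frames per second.
	if err := pprof.StartCPUProfile(f); err != nil {
		log.Fatal(err)
	}
	defer pprof.StopCPUProfile()

	ticker := time.NewTicker(time.Second / 60)
	defer ticker.Stop()
	deadline := time.After(10 * time.Second)
	for {
		select {
		case <-ticker.C:
			renderFrame()
		case <-deadline:
			return
		}
	}
}
```

The resulting profiles from the old and new code could then be compared with `go tool pprof`, which sidesteps the frame-rate cap since we compare time spent per frame rather than frames per second.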
I agree that benchmarking should be improved to make this sort of thing easier, but what is the point of benchmarks that are off by default?
Manual benchmarks are most useful when you are working on improving performance in an area like this. They can be used, for example, when opening a PR like this to compare your changes against the previous state. When you have implemented a further improvement, you can run the benchmarks again to verify that it is indeed faster and see by how much. The people reviewing can also run them locally if they want and see whether they get the same gains. With benchmark data about allocations you can also quite easily tweak the code until you get fewer allocations. It basically makes the process of improving code performance faster as some of the measuring is done automatically for you.
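As a concrete illustration of that workflow, a standard Go benchmark with allocation reporting looks roughly like this (the benchmark name and the `fakeLookup` function are hypothetical placeholders, not the actual Fyne cache code):

```go
package cache_test

import (
	"fmt"
	"testing"
)

// fakeLookup stands in for whatever cache operation is being measured;
// it is a placeholder, not the real cache implementation.
func fakeLookup(key string) int {
	return len(key)
}

// BenchmarkTextCacheLookup shows the shape of the benchmarks being
// discussed: run it before and after a change and compare the results.
func BenchmarkTextCacheLookup(b *testing.B) {
	keys := make([]string, 100)
	for i := range keys {
		keys[i] = fmt.Sprintf("item-%d", i)
	}

	b.ReportAllocs() // include allocations per op in the output
	b.ResetTimer()   // exclude the setup above from the timing

	for i := 0; i < b.N; i++ {
		fakeLookup(keys[i%len(keys)])
	}
}
```

Running `go test -bench=TextCacheLookup -benchmem -count=10` on the old and new code and feeding both outputs to `benchstat` gives a statistically grounded before/after comparison, including allocations per operation.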
However, I do agree that it would be useful to have some sort of automatic benchmarking online but we might need a better CI infrastructure set up if we want consistency. What we had previously was causing headaches as it failed randomly because of being too slow on CI. The best option is probably to have some sort of listing that tracks performance per commit over time so you can see general trends. I think the Go compiler and some other projects have something similar.
In this instance I had hoped that, being so clearly 2x faster but in an area where benchmarking has not been established yet, we might slip by and move on to the next step.
While perhaps not super related to this given the large shown performance improvement, my biggest problem with missing benchmarks is that it can be very hard to look at the code and have it be obvious that it actually improves performance. We may be blinded by the fact that we think the changes should improve performance, when what we actually feel is a placebo effect. In this case, I really don't understand the code well enough to see why it improves performance, but that might partly just be because I haven't worked with the cache code that much.
Going by system-monitor data (which fluctuates a lot) and by feel, there certainly is a big improvement here, and I don't deny that. I just feel like we have historically been neglecting proper benchmarking throughout the toolkit for a long time now. I strongly believe that this and many other performance improvements that are needed in the toolkit would have been much easier to implement with better tooling.
Thanks, agreed. I will try to sit down and understand how to improve benchmarking soon.
It's such a big challenge that I am working on this in stages.
First: Move text cache index to the text details not object reference
Using the Terminal app as a benchmark (TextGrid gets the biggest boost with this change) when running "top" fullscreen:
The next improvements will build on this, so other text widgets will see bigger gains. A sketch of the general idea follows below.
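For illustration only, here is a minimal sketch of keying a cache by the text's details rather than by object reference; the types and fields are hypothetical, not the actual Fyne cache structures:

```go
package main

import "fmt"

// textKey captures the details that determine a rendered texture, so two
// objects drawing identical text can share one cache entry. The fields are
// illustrative only.
type textKey struct {
	content  string
	sizePx   int
	boldFace bool
}

// texture is a stand-in for a cached GPU texture handle.
type texture uint32

var textCache = map[textKey]texture{}

// cachedTexture returns the shared texture for the given text details,
// creating it on a miss. Keying by the canvas object pointer instead would
// duplicate identical renders across objects.
func cachedTexture(k textKey) texture {
	if tex, ok := textCache[k]; ok {
		return tex
	}
	tex := texture(len(textCache) + 1) // placeholder for a real texture upload
	textCache[k] = tex
	return tex
}

func main() {
	a := cachedTexture(textKey{"top", 14, false})
	b := cachedTexture(textKey{"top", 14, false})
	fmt.Println(a == b) // true: identical text details hit the same entry
}
```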
Checklist: