Open FD- opened 2 years ago
Wow, thanks so much for the perf analysis @FD- ! We run the same Pixlet runtime in our backend to render all apps for Tidbyt, so we certainly care about performance 😅. To date, we've thrown CPU at the problem but we're quickly running up against the limits of that approach. For number one, let me fork the repo to the Tidbyt organization and we can make the changes there. For number two I'd need to spend a bit more time poking around but it certainly sounds logical.
I forked the repo here: https://github.com/tidbyt/gg. More then happy to review a PR or pitch in with your suggested changes.
Wow, I had no idea you were running this code on your backend! IMHO, your current implementation could be optimised quite a bit (easily 2-3x performance across applets) with only a few days of effort.
I might have a look at implementing improvement 1. I also thought a little more about improvement 2, and here are a few points:
The widget interface needs changes along these lines:
// PaintBounds Returns the bounds of the area that will actually be drawn to when Paint() is called
PaintBounds(bounds image.Rectangle, frameIdx int) image.Rectangle
Paint(dc *gg.Context, bounds image.Rectangle, frameIdx int)
PaintBounds is needed so we can calculate the dimensions of a widget before drawing it. In fact, the current implementation does a full render just to estimate sizes, which is very inefficient AFAICT.
We can push() the context before calling a child's Paint() function and pop() afterwards.
Every Paint() implementation should be responsible for centering its drawing area within the bounds passed as argument. Currently, this is done by the respective parent.
Is clipping really needed when painting children? Surely adds some cost as well.
It's worth noting that improvement 2 will render improvement 1 mostly ineffective (because we're no longer drawing the starlark image into an empty context). Still, improvement 2 should offer a speedup for all applets, while improvement 1 only helps for applets that draw large images.
Alright, I implemented improvement 1 in https://github.com/tidbyt/gg/pull/1.
Looking at profiles from production, I know for a fact that improvement 2 would be pretty huge.
I have an implementation for improvement 2. Huge speedup indeed. Will iron out the last bugs and open a PR later today or tomorrow.
Alright, improvement 2 done and tested. See PR #355. I’m seeing 2-3x speedup on average.
Maybe send me a Tidbyt when your server savings exceed the unit’s price :D
I'm running pixlet on a Raspberry Pi Zero as a makeshift self-hosted server for private apps. Execution of
pixlet render
currently takes ages on this admittedly slow hardware, so I started to investigate a bit. As part of this investigation, I discovered 2 possible improvements for render performance.Improvement 1 (speedup for applets using the Image widget)
I started my investigations on the reddit_r_place community app. As-is, pixlet on the Raspberry Pi Zero takes about 37 seconds to render the applet:
I used
runtime/pprof
for obtaining detailed runtime statistics. Immediately, it was apparent that something was strange with all these image.(*NRGBA).Opaque calls. In total, the CPU spent 20 seconds running that function!Following the pixlet source code, I could see that the relevant portion in pixlet was in
transformation.go
, where the image specified by the starlark applet'sImage
element is passed to the gg library's DrawImage function.https://github.com/tidbyt/pixlet/blob/ecae45a3e2a4e2c84ec3aeefa3cefa7b9623f8f2/render/animation/transformation.go#L238
This function in turn calls
DrawImageAnchored
, which usesgolang.org/x/image/draw
for the actual drawing:Note that
draw.Over
is passed as the Op, which pixlet doesn't even need here.draw.Src
would suffice, since we are drawing into an empty image anyway. This now gets interesting when we have a look at the implementation of the called function insidegolang.org/x/image/draw
:We've found the source of these expensive
image.(*NRGBA).Opaque
calls, and they are not even needed by pixlet. If we had a way to specify the Op, we would actually preferdraw.Src
here. For testing purposes, I changed go's local copy ofgithub.com/fogleman/gg
to always usedraw.Src
, and ran the initial test again:Quite some improvement, I would say! Now, the problem is: I don't know how this performance issue could sensibly be fixed. The cleanest way would probably be having the
github.com/fogleman/gg
library export a function that allows specifying the Op parameter, but it seems like that library has been abandoned (no new commits for a year).Improvement 2 (speedup for deeply nested applet UIs)
Currently, the
Paint
method of every widget creates a new image, which it returns to the caller, who will then just render that image into another image, ... until the root draws it into an image that has a background. I haven't thought that through (or implemented it), but shouldn't it be possible to pass the render context to everyPaint
method and have all draw operations on the same context? Given the bad performance of thegg.context.DrawImage
function, IMHO it should be called as rarely as possible.With improvement 1 from above implemented (the ugly way described above), the two
gg.context.DrawImage
calls per frame are clearly visible (1.3 seconds), and this is for an applet that is not nested at all.I don't know whether performance is something you are concerned about for this implementation, but I still wanted to leave my notes here for others to find as well.
Other than that: I have to say the applet development experience for the Tidbyt platform is great (even if deployment of private applets is a pain ATM)!