### Rasterization is two-stage
The backing store does not have the pixel coverage values themselves. If you had access to it you'd still need to do similar "accumulation logic" to what `for_each_pixel` does.
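For context, that accumulation is roughly a running sum over signed area deltas, clamped to coverage. A minimal sketch, assuming a flat `f32` delta buffer like the (non-public) backing store:

```rust
// Sketch of the accumulation step, assuming the backing store is a flat
// buffer of signed per-cell area deltas (this buffer is not public API).
fn accumulate(deltas: &[f32]) -> Vec<f32> {
    let mut running = 0.0_f32;
    deltas
        .iter()
        .map(|d| {
            running += d;
            // coverage is the absolute running sum, clamped to 1.0
            running.abs().min(1.0)
        })
        .collect()
}
```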
Accumulation is generally fast for small glyphs, but can be the slowest part for larger glyphs. In the benchmarks:

- `rasterize_foo` - total rasterize time
- `rasterize_outline_foo` - draw time (no accumulation)
- `accumulate_foo` - accumulation time (no draw)
```
# ttf_w 9x8px
rasterize_ttf_w                  time:   [290.76 ns 290.93 ns 291.14 ns]
rasterize_outline_ttf_w          time:   [239.18 ns 239.27 ns 239.36 ns]
accumulate_ttf_w                 time:   [53.192 ns 53.224 ns 53.257 ns]

# ttf_biohazard
rasterize_ttf_biohazard          time:   [69.463 µs 69.471 µs 69.479 µs]
rasterize_outline_ttf_biohazard  time:   [14.493 µs 14.499 µs 14.505 µs]
accumulate_ttf_biohazard         time:   [55.726 µs 55.732 µs 55.738 µs]
```
### Accumulation optimisation
I'd love for the accumulate stage to be faster. The current interface seems hard to optimise with SIMD, for example.
It may be possible to have a specialized `coverage_u8()` fn that goes straight to a `Vec<u8>` of coverage and makes use of runtime-detected SIMD?
I also wanted to experiment with offering iterator interfaces and seeing whether they could make the API more ergonomic without a performance impact.
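As a sketch of the idea, here is what a `coverage_u8`-style helper could look like built on the existing closure API; the name and the quantisation are assumptions, and an internal version could instead accumulate and quantise in one runtime-detected SIMD pass without a per-pixel callback:

```rust
use ab_glyph_rasterizer::Rasterizer;

// Hypothetical `coverage_u8`-style helper, sketched on top of the public
// closure API rather than the internal buffer.
fn coverage_u8(rasterizer: &Rasterizer, width: usize, height: usize) -> Vec<u8> {
    let mut out = vec![0u8; width * height];
    rasterizer.for_each_pixel(|index, coverage| {
        // quantise 0.0..=1.0 coverage to 0..=255
        out[index] = (coverage * 255.0 + 0.5) as u8;
    });
    out
}
```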
Oh, wow. I'd been assuming the rasterizer was essentially a canvas, but it's really not. So far I've been rendering words by using a single rasterizer and adding all the (positioned) curves from each of the glyphs in the word to it, thinking that would be faster. But the opposite is true; perhaps I would be better off rasterising each glyph in turn and compositing it onto the image canvas at the appropriate position.
> rasterising each glyph in turn and compositing it onto the image canvas at the appropriate position.

Which won't work because when the canvas is quite small, glyph positions can be fractional.
The image example uses `ab_glyph` to draw a sentence onto an image canvas. Each glyph uses a `Rasterizer` as big as the glyph's `px_bounds`, and the coverage values are then written/blended into the image.
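Roughly like this, using ab_glyph's high-level API (which drives a glyph-sized `Rasterizer` internally); the `GrayImage` target and the max-blend here are assumptions, not the example's exact code:

```rust
use ab_glyph::{Font, Glyph};
use image::{GrayImage, Luma};

// Draw one positioned glyph into a grayscale canvas, blending coverage.
fn draw_glyph(font: &impl Font, glyph: Glyph, img: &mut GrayImage) {
    if let Some(outlined) = font.outline_glyph(glyph) {
        // px_bounds is the glyph-sized pixel box at the glyph's position,
        // so fractional positioning is already baked into the outline.
        let bounds = outlined.px_bounds();
        outlined.draw(|x, y, coverage| {
            let px = bounds.min.x as u32 + x;
            let py = bounds.min.y as u32 + y;
            if px < img.width() && py < img.height() {
                let old = img.get_pixel(px, py)[0];
                let new = (coverage * 255.0) as u8;
                img.put_pixel(px, py, Luma([old.max(new)]));
            }
        });
    }
}
```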
> The image example uses `ab_glyph` to draw a sentence onto an image canvas.
Thanks for that. I had seen it before, and started out with that code, but I am trying to integrate `harfbuzz_rs` for layout of complex text and things were going wrong with the scaling, so I gave up and used `ab_glyph_rasterizer` directly.
Now I've come back to look at it again, sat down and worked out how HB and AB scale differently (height versus ascent), and finally I have something that works and is impressively fast (340k words per second).
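One way to reconcile the two conventions, sketched under my own assumptions rather than the actual fix: HarfBuzz positions are in font units, while ab_glyph's `PxScale` is relative to the font's unscaled height (ascent minus descent), so pick the `PxScale` that yields the same pixels per font unit.

```rust
use ab_glyph::{Font, PxScale};

// Compute an ab_glyph PxScale matching shaping at `px_per_em` pixels per em.
// Assumes the HarfBuzz font scale is set to units-per-em.
fn matching_px_scale(font: &impl Font, px_per_em: f32) -> PxScale {
    let units_per_em = font.units_per_em().expect("font has no units_per_em");
    // height_unscaled() is ascent - descent in font units
    let height = font.height_unscaled();
    PxScale::from(px_per_em * height / units_per_em)
}
```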
So I'm done with this bug, unless of course you want to make accumulation faster anyway...
> So I'm done with this bug, unless of course you want to make accumulation faster anyway...
🙂 I absolutely want to make it faster. But I can track ideas to do that with separate issues. I'm glad you've sorted out your issue!
I've got some code which rasterises some curves using `ab_glyph_rasterizer` and then converts the result into a Rust `image`. On profiling the code, I found that 50% of the time was spent copying data from the rasterizer into the image object, since I only have access to the coverage data through `for_each_pixel` or `for_each_pixel_2d`. The fastest I could do was a pixel-at-a-time copy, but that per-pixel loop is a killer. Making the array available to the user would allow it to be blatted much more quickly into an image.