alexheretic / ab-glyph

Rust API for loading, scaling, positioning and rasterizing OpenType font glyphs
Apache License 2.0

Direct access to backing store #83

Closed simoncozens closed 1 year ago

simoncozens commented 1 year ago

I've got some code which rasterises some curves using ab_glyph_rasterizer and then converts it to a Rust image. On profiling the code, I found that 50% of the time was spent copying data from the rasterizer into the image object, since I only have access to the image data through for_each_pixel or for_each_pixel_2d. The fastest I could do was this:

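        // dims is (width, height) in pixels.
        let dims = rasterizer.dimensions();
        let mut store = Vec::with_capacity(dims.0 * dims.1);
        rasterizer.for_each_pixel(|_, alpha| {
            // Convert 0.0..=1.0 coverage to an 8-bit grey value.
            let amount = (alpha * 255.0) as u8;
            store.push(amount);
        });
        // from_raw takes width and height separately and returns an Option.
        let image = GrayImage::from_raw(dims.0 as u32, dims.1 as u32, store);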

But that pixel-at-a-time loop is a killer. Making the array available to the user would make it able to be blatted much more quickly into an image.

alexheretic commented 1 year ago

Rasterization has two stages:

  1. processing/"drawing" the outlines.
  2. accumulating results into pixel coverage values.

The backing store does not have the pixel coverage values themselves. If you had access to it you'd still need to do similar "accumulation logic" to what for_each_pixel does.
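
To illustrate why raw access alone would not help, here is a minimal sketch of an accumulation step. It assumes the backing store holds per-pixel signed area deltas in row-major order and that coverage is the clamped absolute value of a running sum. The `accumulate` function and the sample buffer are hypothetical, for illustration only, and are not the crate's actual internals.

```rust
/// Hypothetical accumulation pass: turns a buffer of signed area deltas
/// into per-pixel coverage values in the range 0.0..=1.0.
fn accumulate(deltas: &[f32]) -> Vec<f32> {
    let mut acc = 0.0_f32;
    deltas
        .iter()
        .map(|d| {
            // Each pixel's coverage depends on every delta before it.
            acc += d;
            acc.abs().min(1.0)
        })
        .collect()
}

fn main() {
    // Made-up deltas: an edge entering over two pixels, then leaving.
    let deltas = [0.0_f32, 0.5, 0.5, 0.0, -1.0, 0.0];
    let coverage = accumulate(&deltas);
    println!("{:?}", coverage); // prints [0.0, 0.5, 1.0, 1.0, 0.0, 0.0]
}
```

Under these assumptions the raw buffer (`0.5, 0.5, 0.0, -1.0`) looks nothing like the final coverage, so copying it straight into an image would not give a usable result.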

### Accumulation benches

Accumulation is generally fast for small glyphs, but can be the slowest part for larger glyphs. In the benchmarks, for example:

biohazard 294x269px

        rasterize_ttf_biohazard          time: [69.463 µs 69.471 µs 69.479 µs]
        rasterize_outline_ttf_biohazard  time: [14.493 µs 14.499 µs 14.505 µs]
        accumulate_ttf_biohazard         time: [55.726 µs 55.732 µs 55.738 µs]

### Accumulation optimisation
I'd love for the accumulate stage to be faster. The current interface seems hard to optimise with SIMD, for example.

It may be possible to have a specialized `coverage_u8()` fn that goes straight to `Vec<u8>` coverage and makes use of runtime detected SIMD?

I also wanted to experiment with offering iterator interfaces and seeing if they could make the API more ergonomic without performance impact.

simoncozens commented 1 year ago

Oh, wow. I'd been assuming the rasterizer was essentially a canvas, but it's really not. So far I've been rendering words by using a single rasterizer and adding all the (positioned) curves from each of the glyphs in the word to it, thinking that would be faster. But the opposite is true. Perhaps I would be better off rasterising each glyph in turn and compositing it onto the image canvas at the appropriate position.

simoncozens commented 1 year ago

> rasterising each glyph in turn and compositing it onto the image canvas at the appropriate position.

Which won't work because when the canvas is quite small, glyph positions can be fractional.

alexheretic commented 1 year ago

The image example uses ab_glyph to draw a sentence onto an image canvas.

Each glyph uses a Rasterizer as big as the glyph's px_bounds; the coverage values are then written/blended into the image.
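
The per-glyph approach can be sketched without any font library. This is a minimal sketch, assuming each glyph has already been rasterized into its own row-major `u8` coverage buffer and that its top-left pixel position on the canvas is known. The `Canvas` and `GlyphCoverage` types and the `blend` method are hypothetical names for illustration, not ab_glyph API, and the additive blend is just one possible choice.

```rust
/// Hypothetical grayscale canvas, row-major, one byte per pixel.
struct Canvas {
    width: usize,
    height: usize,
    pixels: Vec<u8>,
}

/// Hypothetical per-glyph coverage buffer plus its top-left canvas position.
struct GlyphCoverage {
    x: i32,
    y: i32,
    width: usize,
    height: usize,
    coverage: Vec<u8>,
}

impl Canvas {
    fn new(width: usize, height: usize) -> Self {
        Canvas { width, height, pixels: vec![0; width * height] }
    }

    /// Adds the glyph's coverage into the canvas, saturating at 255 and
    /// skipping any pixels that fall outside the canvas.
    fn blend(&mut self, glyph: &GlyphCoverage) {
        for gy in 0..glyph.height {
            for gx in 0..glyph.width {
                let cx = glyph.x + gx as i32;
                let cy = glyph.y + gy as i32;
                if cx < 0 || cy < 0 || cx as usize >= self.width || cy as usize >= self.height {
                    continue;
                }
                let src = glyph.coverage[gy * glyph.width + gx];
                let dst = &mut self.pixels[cy as usize * self.width + cx as usize];
                *dst = dst.saturating_add(src);
            }
        }
    }
}

fn main() {
    let mut canvas = Canvas::new(4, 2);
    // A made-up 2x2 glyph placed one pixel in from the left edge.
    let glyph = GlyphCoverage { x: 1, y: 0, width: 2, height: 2, coverage: vec![255, 128, 64, 0] };
    canvas.blend(&glyph);
    println!("{:?}", canvas.pixels);
}
```

Because each glyph only touches a buffer the size of its own bounds, the accumulation cost scales with the glyph's area rather than with the area of the whole word.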

simoncozens commented 1 year ago

> The image example uses ab_glyph to draw a sentence onto an image canvas.

Thanks for that. I had seen it before, and started out with that code, but I am trying to integrate harfbuzz_rs for layout of complex text and things were going wrong with the scaling, so I gave up and used ab_glyph_rasterizer directly.

Now I've come back to look at it again, sat down and worked out how HB and AB are scaling differently (height versus ascent), and finally I have something that works and is impressively fast (340k words per second).

So I'm done with this bug, unless of course you want to make accumulation faster anyway...

alexheretic commented 1 year ago

> So I'm done with this bug, unless of course you want to make accumulation faster anyway...

🙂 I absolutely want to make it faster. But I can track ideas to do that with separate issues. I'm glad you've sorted out your issue!