dfrg / swash

Font introspection, complex text shaping and glyph rendering.
Apache License 2.0

How is it zero transient heap allocation? #10

Closed by kirawi 3 years ago

kirawi commented 3 years ago

This is largely because I am unfamiliar with shaping, but I'm also not that skilled in programming yet either. So, I was wondering how you achieved zero transient heap allocations. At least to me, it seems like it would cause a stack overflow if you had to shape a large amount of text. You mention a cache, but I'm still pretty confused. How does the cache work without transiently allocating on the heap?

dfrg commented 3 years ago

Apologies for the delay. I'm back on this project full time now.

Shaping does require heap allocations that are generally linear with respect to the size of the source text. By zero transient allocations, I mean that all scratch space (buffers, caches, temporary memory) is owned and retained by the ShapeContext and reused for subsequent shaping operations. The context does allocate, but does not free any memory until it is dropped. This is essentially a trade-off: your program will retain more allocated memory, but the cost of malloc and free will be amortized over many calls to the shaper and should very quickly approach zero. Avoiding calls to the global heap allocator is important in multithreaded scenarios and/or when heap fragmentation is an issue.
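The retain-and-reuse idea can be sketched in a few lines of Rust. This is only an illustration of the general pattern, not swash's implementation: `ScratchContext`, its `glyphs` buffer, and the fake char-to-glyph mapping are all hypothetical names invented for the example.

```rust
/// Hypothetical stand-in for a shaping context (NOT swash's actual API).
/// It owns its scratch buffer and never frees it until the context is dropped.
pub struct ScratchContext {
    glyphs: Vec<u32>,
}

impl ScratchContext {
    pub fn new() -> Self {
        Self { glyphs: Vec::new() }
    }

    /// Pretend to "shape" text by mapping each char to a fake glyph id.
    /// `clear` resets the length to zero but keeps the allocated capacity,
    /// so later calls that fit in the existing capacity do not touch the allocator.
    pub fn shape(&mut self, text: &str) -> &[u32] {
        self.glyphs.clear();
        self.glyphs.extend(text.chars().map(|c| c as u32));
        &self.glyphs
    }

    /// Currently retained capacity of the scratch buffer, in elements.
    pub fn capacity(&self) -> usize {
        self.glyphs.capacity()
    }
}
```

In this sketch the first call to `shape` grows the buffer, and any later call on text of the same or smaller size reuses that memory. The buffer only grows again when a larger input arrives, which is why the allocation cost amortizes over many calls.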

The lifetime of ShapeContext is, of course, up to the user. If you care more about resident memory than latency/heap fragmentation or if your text layout is infrequent, then it is reasonable to create a new context at the beginning of each layout pass and drop it at the end. In the case of something like a game engine, however, where you have high frequency layout/rendering, best practice is to trade some resident memory for determinism and reduced latency/fragmentation, so you'd want to keep a context for the lifetime of the program.
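To make the two lifetime choices concrete, here is a rough sketch that counts how often a scratch buffer has to grow under each strategy. Everything in it is hypothetical (`LayoutScratch`, `grow_count`, `per_pass`, `long_lived`); it models the trade-off in general terms and does not use swash's API.

```rust
/// Hypothetical context owning reusable scratch memory (NOT swash's API).
pub struct LayoutScratch {
    buf: Vec<u32>,
    /// Number of times the buffer had to grow; a rough proxy for allocator calls.
    pub grow_count: usize,
}

impl LayoutScratch {
    pub fn new() -> Self {
        Self { buf: Vec::new(), grow_count: 0 }
    }

    /// Fake layout pass: stores one u32 per char and returns the count.
    pub fn layout(&mut self, text: &str) -> usize {
        self.buf.clear(); // keeps capacity
        let needed = text.chars().count();
        if needed > self.buf.capacity() {
            self.buf.reserve(needed); // buf is empty here, so capacity >= needed afterwards
            self.grow_count += 1;
        }
        self.buf.extend(text.chars().map(|c| c as u32));
        self.buf.len()
    }
}

/// Strategy A: a fresh context per layout pass (lower resident memory).
pub fn per_pass(frames: &[&str]) -> usize {
    let mut grows = 0;
    for text in frames {
        let mut ctx = LayoutScratch::new();
        ctx.layout(text);
        grows += ctx.grow_count;
        // ctx is dropped here, returning its memory to the allocator
    }
    grows
}

/// Strategy B: one context kept across all passes (fewer allocator calls).
pub fn long_lived(frames: &[&str]) -> usize {
    let mut ctx = LayoutScratch::new();
    for text in frames {
        ctx.layout(text);
    }
    ctx.grow_count
}
```

With three frames of equal length, `per_pass` grows a buffer once per frame while `long_lived` grows it once in total, at the price of holding that memory between frames.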

I admit that the architecture and API of the crate may seem a bit unusual, but the library was designed from the ground up to accommodate high performance, multithreaded scenarios. Much of the design was driven by analyzing the unfortunate contortions necessary in Gecko, Blink and Servo to incorporate HarfBuzz and FreeType, both of which were created before multithreading became common practice.

The internal caching mechanisms in the shaper are somewhat complex, but I'd be happy to elaborate on them if that would be helpful.

kirawi commented 3 years ago

No worries, that was all I wanted to know :P