Possible performance improvements

ymarco commented 1 year ago

Hi. As you would already know, great efforts are currently being made to improve Emacs's editing experience of org-mode files: specifically these files can contain latex fragments, and @tecosaur and @karthink created an entire system for rendering these fragments in-editor. The previews that are generated make the document much more readable. The previews work by compiling the the math in the document to a dvi, converting the dvi to images with dvipng or dvisvgm, and embedding these images in place of the original math text in Emacs.

We want the whole process to be as quick as possible, to have the smallest delay after a user finishes typing math for the preview to be displayed. We're already caching the preamble for example, to accelerate the dvi generation.

So now, the next thing we're trying to accelerate is the image generation. dvipng is faster than dvisvgm in this regard, but we prefer svgs since we can scale them better when the user zooms in, or change their color if the math is e.g part of a heading that's colored differently in the buffer. So, I (@ymarco) tried to see if we could optimize dvisvgm.

I profiled using kcachegrind:

kcachegrind-uncache

Long story short, if we reuse glyph nodes that we've already calculated there's a x1.3 performance improvement to be had, depending on the input dvi:

| amount of math fragments | normal dvisvgm run time | run time with glyph cache added (this patch) |
|                        1 | 130ms                   | 131ms                                        |
|                      113 | 3.2s                    | 2.4s                                         |

So, any thoughts? I'm pretty bad with C++ and I tried to change as little as possible in the codebase; the cache I've implemented is pretty dumb, but I think there's potential here.

mgieseki commented 1 year ago

Thank you for taking the time to profile the runtime and for providing the PR. Caching the extracted glyph paths can indeed speed up the execution. Actually, dvisvgm already utilizes a font cache but, at the moment, it's only used for glyphs created by vectorizing Metafont's bitmap output. The glyph data is stored in files so that it can be used across several calls of dvisvgm.

I guess, the same caching mechanism can also be used to speed up the processing of vector fonts as it would reduce the number of many FreeType function calls that can take up a significant amount of execution time. So, rather than adding another caching method, I'd first like to check whether the existing code can be extended to vector fonts and if it reduces the runtime sufficiently.

Some further tweaks can also be applied to the formatting output which is usually quite slow when using the STL ostream operators. There are some libraries, like {fmt}, that look promising and might help to speed up outputting the SVG code. If I find some time in the next few weeks, I'll have a closer look into this as well.

ymarco commented 1 year ago

Thank you for taking the time to profile the runtime and for providing the PR.

My pleasure, optimizing code to run faster is one of the things I love the most in programming.

I guess, the same caching mechanism can also be used to speed up the processing of vector fonts as it would reduce the number of many FreeType function calls that can take up a significant amount of execution time. So, rather than adding another caching method, I'd first like to check whether the existing code can be extended to vector fonts and if it reduces the runtime sufficiently.

Sure. I had no idea this other cache existed, hiding under my nose.

Some further tweaks can also be applied to the formatting output which is usually quite slow when using the STL ostream operators. There are some libraries, like {fmt}, that look promising and might help to speed up outputting the SVG code. If I find some time in the next few weeks, I'll have a closer look into this as well.

I never thought about that, might help a little.

The next thing I thought of doing was trying to get down the 131ms conversion time of one fragment, since it's a common case when the user is just finished typing a new fragment. Profiling there shows that a lot of time is spent reading the map file (in my case /var/lib/texmf/fonts/map/dvips/updmap/ps2pk.map which has 36k lines):

I'll try to get a something working next weekend or the one after that, hopefully

mgieseki commented 1 year ago

The next thing I thought of doing was trying to get down the 131ms conversion time of one fragment, since it's a common case when the user is just finished typing a new fragment. Profiling there shows that a lot of time is spent reading the map file (in my case /var/lib/texmf/fonts/map/dvips/updmap/ps2pk.map which has 36k lines).

I'll try to get a something working next weekend or the one after that, hopefully

Yes, that's definitely an area with room for improvement. I suppose, one way to speed up reading the mapping information could be to write the collected FontMap::Entry data of a .map file to a binary file stored in dvisvgm's cache directory. If the time stamps of these files are newer than those of the corresponding .map files, the data can be read in from the binary files much faster because there's no need to parse the .map files again and again.

ymarco commented 1 year ago

Yeah, I thought about this, but I don't know if parsing it from binary would be much better than parsing the .map, which already looks pretty straightforward. We'll see, I guess

mgieseki commented 1 year ago

I guess, it will be much faster if the map line data is stored as fixed data blocks so that the information can be loaded directly without lots of further parsing. In this case there's no need to detect the variant of the map file format (dvips or dvipdfm) and it's not necessary to extract the arbitrarily ordered and mostly optional parameters from a text file, i.e. a lot of checks, text and number parsing could be avoided. I can't estimate how much faster this would be, though. So I'm looking forward to your suggestions.

ymarco commented 1 year ago

Would be also cool to maybe read the entire map file (or cache) file to memory and mess with C++17 string_views to avoid allocating all of those std::strings. Though we wouldn't be able to close the file or release the memory then, and I don't know what's the consensus about using C++17 stuff.

mgieseki commented 1 year ago

Unfortunately, C++17 is not an option for now because some older systems still officially supported by TeX Live don't provide recent compilers. Switching to C++17 would require removing dvisvgm from TeX Live which I'd like to avoid.

mgieseki / dvisvgm

Possible performance improvements #238