CSS: cache parse results

bbshelper commented 1 year ago

Stylesheets included in each DocFragment are parsed individually, even if they point to the same file. Depending on the number of CSS files, their size, and number of DocFragments, CSS parsing accounts for several percentage of a full LVDocView::Render() time, or even more in extreme cases. By introducing a parsing cache, the time spent in LVImportStylesheetParser::Parse is now negligible.

Closes #491.

This change is

NiLuJe commented 1 year ago

Looks simple enough ;).

poire-z commented 1 year ago

I've been testing some other optimisations to the initial opening/loading of a document. On a big EPUB with 3000 DocFragements, on my emulator. Loading times (rendering stays at 5s):

17.1 seconds before any optimisation
14.4 seconds when not using (in epubfmt.cpp) 3000 times doc->nodeFromXPath(lString32("package/manifest/item[") << fmt::decimal(i) << "]") but just iterating the children of "package/manifest/"
13.9 seconds by additionally enhancing LVContainer to use a LVHashTable(name, idx) instead of iterating all the indexes and comparing strings
11.5 seconds by additionally including this PR so, it's nice :)

poire-z commented 1 year ago

I intend to make more performance patches, maybe we can discuss then on a case by case basis.

Sure, feel free to discuss your thoughts, intutions and experiments about where we could do better. You seem confortable with performance analysis tools, so you may see stuff where I never looked at :) But my feeling everytime I wonder if adding a if (check) could add some performance penalty is that what I do in lvrend/lvtext is peanuts vs. what's happening in FreeType/Harfbuzz when rendering a glyph or shaping text :) (Would you be able to quantify that ? The % of time spent in these libraries vs. the time spend in crengine own code?)

So, it feels the easiest stuff we could do is by using caches (like you did here) - and crengine already does a lot of that (caching glyphs, caching formatted paragraphs...).

BTW, skimming through LVFormatter, there doesn't seem to be much low hanging fruit :(

Indeed. I remember I added some light_formatting toggle to skip some stuff that we don't really need when doing a full rendering. https://github.com/koreader/crengine/blob/4de8f9d030ab001a6a3e5da654f6af7a20ad12d2/crengine/src/lvtextfm.cpp#L3002-L3016 I think there is still some bits that could be skipped - but I haven't yet gone at thinking about it all... and I have no idea how much cpu time it would save, and if its worth the time and risk. (I don't expect you to go at that :) it feels there are many cases to think about - and so, to become familiar with beforehand - like floats, inlineboxes, bidi/RTL... and be sure they won't be impacted...).

Btw, I have 2 optimisations I just have to finish and cleanup - so no need to work on them if you spotted them :):

some optimisation to LVContainer to not have to iterate all items and compare their names - but use a hash<name, index>
limit the uses of the slow doc->nodeFromXPath() in epubfmt.cpp , and iterate children instead of calling this many times

I'll merge this PR later today - and may be I'll wait a few days to bump it into koreader frontend, hoping I can finish my small optmisations to bring them in with the bump.

bbshelper commented 1 year ago

But my feeling everytime I wonder if adding a if (check) could add some performance penalty is that what I do in lvrend/lvtext is peanuts vs. what's happening in FreeType/Harfbuzz when rendering a glyph or shaping text :) (Would you be able to quantify that ? The % of time spent in these libraries vs. the time spend in crengine own code?)

Unless the branching instruction is added to a hot loop, I don't expect a measurable performance impact to the whole render process. Profiling is a tricky task. By far I don't see many freetype/harfbuzz functions on top of the list, but I won't jump to any conclusion until I gain more knowledge about the code. We don't actually render (besides measuring) all the glyphs upon opening a new epub, do we?

I think there is still some bits that could be skipped - but I haven't yet gone at thinking about it all... and I have no idea how much cpu time it would save, and if its worth the time and risk. (I don't expect you to go at that :) it feels there are many cases to think about - and so, to become familiar with beforehand - like floats, inlineboxes, bidi/RTL... and be sure they won't be impacted...).

That's very much non-trivial. I'm wondering if we have anything like an automated test?

Btw, I have 2 optimisations I just have to finish and cleanup - so no need to work on them if you spotted them :):

Thanks for the heads-up. I definitely need more time before any PR :)

Frenzie commented 1 year ago

That's very much non-trivial. I'm wondering if we have anything like an automated test?

We do to some limited extent, but perhaps too far removed from here to be very useful.

For example, see things like https://github.com/koreader/koreader/blob/0f52499d3748c63a7174515c17e2b8f2fa205c77/spec/unit/readerrolling_spec.lua

See stuff like https://www.w3.org/Style/CSS/Test/CSS2.1/20100127/ for some of the type of tests I'd ideally like to run, but I never got around to doing anything like that yet.

poire-z commented 1 year ago

We don't actually render (besides measuring) all the glyphs upon opening a new epub, do we?

No, we just measure glyphs and cache their width/advance/lsb/rsb. But when using HarfBuzz (Kerning "best" in KOreader's bottom menu), we do give all the text nodes's text to Harfbuzz to shape all of them. At the time we added it, "best" took 25-50% more time rendering than "fast" (only FreeType and glyph advance cached). I don't really know how Harfbuzz works, but it must do a lot of loops and font tables lookups when shaping a text (while we must do 3 to 5 iterations along its characters). Even currently, the difference between "fast" and "best" should be noticable.

That's very much non-trivial. I'm wondering if we have anything like an automated test?

There is a unit test in frontend that expects to see a certain number of pages for a book. So, when we really mess things up, it will notice it :) But it more often failed when we did fix things or implement page-break correctly, and the correct new rendering did change the page number. So, we had to fix the test more often than the code. But there's so many situations in the EPUB wild that it would be hard to consider them all - or even isolate and test individual bits. Anyway, not interested spending time on that. I have hundreds of test-this-or-that.html files that I have created each time I had to fix or implement something - and later add bits in them to test other things, or reuse some to validate other things (because too lazy to create a new one). After some years, what I initially wrote "this should be red" is no more red because I ended up using it to test another condition and left it in a state where the condition is false and the text should really not be red... :) But anyway, often, I "feel" this change may have some impact on another thing, and I remember I had some test files for that another thing (and often, finding it among these hundreds files is hard), that I go test with (which I actually did for this PR, found some test-headstyles.html/epub.) Rationalizing all that would be a tough and long work, some not interesting work doable only by highly paid (because not fun) Q&A teams that I would not want to be part of :) And I'm not even sure it would help much.

poire-z commented 1 year ago

Related to https://github.com/koreader/crengine/pull/493#discussion_r1005380953 where we talked about clearing the stylesheet cache:

We currently clear the stylesheet cache when we are done rerendering. But we forgot to clean it after the initial document loading (when no document cache) - it's a different codepath as we apply styles as we build the DOM, and nowhere after that we do _styleSheetCache.clear() - so these cached stylesheets are used on the next rerendering (full or partial).

Noticed on a book with body { font-family: Foobar, sans-serif } - and going to uncheck Font-faimly fonts > [x] sans-serif: MySansSerifFont, it had no effect - as indeed, the way we support that is by doing substitution of "sans-serif" with the font name at CSS parsing time - and reusing the cache, this had no effect. I guess this will solve it (found no real other place/existing branch where this could go :/):

--- a/crengine/src/lvtinydom.cpp
+++ b/crengine/src/lvtinydom.cpp
@@ -5461,6 +5461,10 @@ bool ldomDocument::render( LVRendPageList * pages, LVDocViewCallback * callback,
 //        CRLog::trace("reusing existing format data...");
 //    }

+    if ( !_rendered ) {
+        // We have loaded the document and applied styles: drop this cache
+        _styleSheetCache.clear();
+    }
     bool was_just_rendered_from_cache = _just_rendered_from_cache; // cleared by checkRenderContext()
     if ( !checkRenderContext() ) {
         // Remember these, in case we later do partial rerenderings

Frenzie commented 1 year ago

It does mostly make sense there. :-)

koreader / crengine

CSS: cache parse results #493