kiwix / kiwix-js

Fully portable & lightweight ZIM reader in Javascript
https://www.kiwix.org/
GNU General Public License v3.0
300 stars 124 forks source link

Serious performance regression on rendering the index page of stackoverflow.com_eng_all_2017-05.zim #381

Closed mossroy closed 6 years ago

mossroy commented 6 years ago

The index page (that is automatically loaded when you choose this ZIM file) is very heavy. In version 2.2, it already took some time to load it. But now, applying the extra CSS (that were not loaded in 2.2) takes almost 10 minutes on my computer, with the browser eating a whole CPU all this time.

It's not a bug, but the fact that we are now able to apply all CSS styles has this consequence. And it makes this ZIM file unusable.

I did not have the time to investigate more on that. It would be worth testing with the desktop version

Jaifroid commented 6 years ago

I'll see if I can test a bit. Are you sure it's CSS and not SVGs causing the issue? The latter bring every browser to its knees with our current decompression engine. Anyway, I'll take a look.

mossroy commented 6 years ago

I don't think so but not 100% sure. It might be the same on a smaller zim like askubuntu, but I did not check.

Jaifroid commented 6 years ago

Hmm, I can't really test that ZIM because it is 52GB and my home connection would be overwhelmed, or if I torrent it it'll take a few days. I accidentally downloaded engineering.stackexchange.com_en_all_2018-03.zim, which has no problem. Then I downloaded es.stackoverflow.com_es_all_2018-03.zim, because it is a stackoverflow type, but it also has no problem. I'll try askubuntu.

Jaifroid commented 6 years ago

I can't reproduce this with askubuntu.com_en_all_2017-10.zim on Edge or Firefox, using the master branch. Both browsers load it very fast.

If you have a moment, put a break at the very beginning of the displayArticleInForm function (inside it), and inspect the html in the htmlArticle variable. How many stylesheets are in the header of the html? If they are no more than five or six, it can't be the number of CSS that is causing the issue. Search for any .svg files with ZIM URLs, because they could definitely explain the problem.

Another explanation might be one malformed stylesheet or one that is stored without the terminating byte, so the decompressor gets into a loop until it reaches the end of the ZIM file... You could find the culprit by breaking inside loadCSSJQuery, on line 965 of current master (second line below). Also add a break on the third line below, as this will catch both before decompression and after, so you should have an idea of which ones decompress quickly.

return selectedArchive.readUtf8File(dirEntry,
         function (fileDirEntry, content) {
             var fullUrl = fileDirEntry.namespace + "/" + fileDirEntry.url;
mossroy commented 6 years ago

With stackoverflow.com_eng_all_2017-05.zim, it takes around 9 minutes on Chromium, but "only" around 1 minute on Firefox. In the dev tools of both browsers, they say all this time is spent in "Recalculate style". In ServiceWorker mode under Chromium, it takes the exact same time. As it's the welcome page with no link from another page to it, it's the same displayArticleInForm function that loads the HTML. But the CSS are loaded by the ServiceWorker. With askubuntu.com_en_all_2017-06.zim, the "Recalculate style" only takes 2 seconds.

I tried with desktop kiwix : the same welcome page of stackoverflow only takes around 15 seconds to be displayed (on the same machine).

There are 3 CSS in this page : -/static/bootstrap/css/bootstrap.min.css, -/static/bootstrap/css/bootstrap-theme.min.css and -/static/main.css . And no SVG. Based on the dev tools of both browsers, the time is not spent in decompression, but in applying the CSS stylesheets. main.css is the last one read before the slowdown, so I suspect it comes from it. Here is the file (I had to rename it to .txt to have github accept it for upload here) : main.css.txt

mossroy commented 6 years ago

Before the slowdown, all items are above each other (in a vertical list). After the slowdown, they are in rectangles with borders, which seem to come from the main.css style below :

body.tags > div.container ul > li {
    display: inline-block;
    padding: 15px;
    margin: 5px;
    border: 1px solid hsl(208, 56%, 46%);
}

If I modify this CSS class in dev tools, it's quite fast to update, so I don't think it comes from this specific class.

mossroy commented 6 years ago

I'll continue to investigate but it might take some time because I don't always have the relevant test context with such a big ZIM file.

Jaifroid commented 6 years ago

Strange. Askubuntu also contains the same three CSS files. If the old layout was very complicated, I guess it's possible main.css might depend on the rules in the two bootstrap files being in place, so that if they are added in after main, they cause multiple re-draws?

It's odd that this issue has surfaced with v2.3. If it's a layout issue, I can't think of anything we've changed that would affect speed of layout. We're effectively using the same insertion method as before (jQuery's .html() function resolves to elem.innerHTML = value; on systems that support .innerHTML, which yours clearly does). We're using the same method for inserting stylesheets in jQuery mode (and that wouldn't affect Service Worker anyway). ServiceWorker has always loaded CSS as binary, and that hasn't changed. The code to open collapsed blocks shouldn't match against anything in Stackexchange ZIMs.

I would suggest making a branch from commit https://github.com/kiwix/kiwix-js/commit/6bf87b26d32491d1e9e310fec8a6dac1eff10905, seeing if the problem is there, and if not, move up through the commits, until you find at what specific point the issue occurs for the first time. Or move backwards if the problem is in that commit already.

mossroy commented 6 years ago

I think that this CSS was not injected at all in version 2.2 (I'll check). I think the fix introduced by https://github.com/kiwix/kiwix-js/pull/354 allowed this CSS to be injected, and somehow has this consequence. I'll also check by testing the code before and after this commit.

If this is confirmed, I don't understand why the same (relatively simple) CSS and the same HTML is so slow on a recent browser, where it is way faster on the very old embedded gecko engine of desktop kiwix.

Jaifroid commented 6 years ago

OK. Might be worth checking if we're moving something that should be in the <html …> tag into the <body …> tag. Unlikely, but we know it's a small potential hole in our current code. However, if the document.write() method produced the same results, then this is not the issue.

Two last resorts come to mind:

  1. Prevent injection of the CSS into the document until we have the content of all the CSS files, and inject them all at once --> aim is to minimize screen redraws. This would work in JQuery mode, but I guess we could also not reply to the Service Worker's postmessage until all the files are available. It wouldn't really affect user experience for "normal" ZIM archives, because as you've probably noticed, the CSS get injected quite close to each other anyway.

  2. Ultimate (and simpler) last resort, if the error has no solution --> add an option in the interface to prevent extraction/injection of any CSS, with the option stored in a cookie... Not great, but better than locking up browsers for 10 minutes... Note that we'll probably have to have an option to turn off JavaScript for old/slow browsers, once (if) we enable it, so this wouldn't be completely out of line.

mossroy commented 6 years ago

I confirm that this issue starts in commit id 0faae342eeb9856b023e7dcc7b952114adfa7311. On master, commenting the line that adds classes on the body tag does not solve the issue (but that was a good idea, and gave me ideas to investigate below).

In fact, in 2.2, the main.css stylesheet was correctly injected, but some styles were not applied because the body tag does not have the attribute class="tags". This was due to the way we were injecting the html (which contains the string <body class="tags">) : we were putting everything inside the already-existing body tag. Now that we correctly inject the html, the styles are properly applied. In this same 2.2 version, I manage to reproduce the issue if I manually add the class="tags" attribute using the dev tools of Chrome (after the index is displayed) : it takes almost 10 minutes to be applied.

mossroy commented 6 years ago

I've made some progress. I think the behavior we have is due to the fact that the CSS styles are applied after the HTML is parsed, and the DOM is generated. I did a quick-and-dirty test in ServiceWorker mode in https://github.com/kiwix/kiwix-js/tree/tests-to-fix-CSS-slowness (based on master) : in function readArticle, instead of reading the article content, and calling the displayArticleInForm (like we need to do in jQuery mode), I've simply set the iframe src to the article URL (letting the ServiceWorker do the job). It solves the issue in ServiceWorker mode : it takes only 7 seconds on Chromium, and 5 seconds on Firefox.

I suppose the browsers are smart enough to generate the DOM and apply the CSS styles at the same time. But, in jQuery mode, we first make the browser generate the DOM, then make it apply the CSS styles.

Jaifroid commented 6 years ago

Ah, that makes sense now. I think it's a lot better to let Service Worker run the whole process in Service Worker mode -- it's always been confusing that we start up Service Worker mode in that hybrid way, since we don't need to do any processing on the DOM. It sounds like a clear candidate for a PR.

Regarding JQuery mode, it seems we have three possibilities:

  1. Preload stylesheets and inject them into the raw HTML before we inject the HTML into the DOM --> this would impact on the user experience for the first page load, until the CSS is in the cache. Once in the cache, there would be no impact on user experience.

  2. Load a basic version of the DOM with no CSS, wait till the CSS is available, and inject it all at once --> this may or may not work: it should minimize page re-writes, but a lot may depend on the order in which the browser encounters the CSS.

  3. A hybrid of both systems mentioned above: load a basic representation of the DOM with no CSS for the user to look at, inject CSS into the htmlArticle variable instead of into the DOM, and when ready, re-inject htmlArticle into the iframe (effectively a kind of page reload). This in fact would be a basic version of what React does with its virtual DOM (React diffs the DOM and its virtual copy, and then displays only the difference, to minimize page re-draws). It may be possible to inject just the <head>…</head> and avoid having to re-attach event listeners. This would need some experimentation.

I quite like the idea of option 3, but we should probably try option 2 first (option 2 is a step on the way to option 3 anyway). I have some code, based on master, for a persistent cache using indexedDB and/or localStorage that could complement such approaches (assets are cached per-ZIM).

mossroy commented 6 years ago

Yes, I agree for the ServiceWorker mode : I opened #382 for that.

Thanks for your proposals for the jQuery mode. These are good ideas.

  1. Do you mean inlining the CSS inside the HTML string, before injecting it in the iframe? It has good chances to work, and might not be very difficult to implement. It's considered a bad practice to inline CSS styles, at least for performance reasons, but, in our case, it should improve performance. We'd need to check it does not reduce performance significantly in other cases
  2. Do you mean for example injecting only the head tag (with an empty body), then the CSS styles referenced, then the body tag? There's a chance it might work, but only if the CSS files are declared in the head (or any other portion of the HTML we might decide to inject first). I suppose it's a bit more difficult to implement than option 1
  3. I'm not sure I understood exactly

I would suggest to try the easiest option first, at least to check it really solves the issue.

NB : let's hope ServiceWorkers will be soon widely available, because the jQuery mode forces us to do more and more dirty things like that.

Jaifroid commented 6 years ago

Regarding 1, I mean what we currently do (replacing all <link...> tags with <style>…</style> blocks), but in the raw HTML string before we inject into the DOM. It would be preferable to replace the link URLs with blob: URLs to avoid the inline issue you mention. This indeed works in most browsers (I do it in Kiwix JS Windows), but the CSP of FFOS prevents it working for stylesheets (unlike for images).

The main issue with this method is that the user could be stuck looking at a blank screen, or the spinner, longer than currently (while the CSS decompresses). That's why I suggested 2 and 3, which are just variations of 1, but showing the unstyled HTML first (more or less as we currently do), and then two different methods of injecting the CSS in one go at later points.

As you say, it would be easiest to try 1 first, and then experiment.

The trouble with Service Worker is that I have yet to see it work from the file.// protocol in any browser.

Jaifroid commented 6 years ago

OK, I've tested preloading stylesheets with this ZIM, and it makes a big difference on Firefox, Chromium and Edge. We get to a "useable" page relatively quickly on both (I say "useable" with quotation marks because most of the tags don't work). EDIT: This is because the page had stalled during rendering. The tags are useable.

Jaifroid commented 6 years ago

Apologies, I didn't realize I was looking at an incomplete render of the home page of this ZIM, hence my comment about gazillions of useless tags. They're not useless. I've deleted that comment.

mossroy commented 6 years ago

Cool, that's good news! Could you create a PR?

I'll try to fix #382 on my side (for ServiceWorker mode). Regarding Service Workers, it's true that they don't work in file:// protocol (most browsers consider it as unsecure). It's annoying but should only affect development (you can run any local webserver. Some IDEs provide one for testing). Except if UWA apps use the file:// protocol?

mossroy commented 6 years ago

Fixed by #387