GitbookIO / gitbook

The open source frontend for GitBook doc sites
https://www.gitbook.com
GNU General Public License v3.0
27.28k stars 3.89k forks source link

Generating slowly when the file number is large #1497

Closed flyaway1217 closed 8 months ago

flyaway1217 commented 8 years ago

ENV: CLI version: 2.3.0 GitBook version: 3.2.0 sys: MacOS 10.11.6

I have 221 pages in my directory. But gitbook will take two minutes to generate all these 221 pages. It's too slow for me. Is there anything we can do to speed up the generation?

SamyPesse commented 8 years ago

yes, I have planed to investigate performances in the coming weeks. I'm finishing some work on GitBook.com and our editor before.

SamyPesse commented 8 years ago

If you're using the CLI, can you run the build command with --timing argument and paste me the output (not everything but the timing data).

flyaway1217 commented 8 years ago

@SamyPesse the output: info: 7 plugins are installed info: 6 explicitly listed info: loading plugin "highlight"... OK info: loading plugin "search"... OK info: loading plugin "lunr"... OK info: loading plugin "sharing"... OK info: loading plugin "fontsettings"... OK info: loading plugin "theme-default"... OK info: found 221 pages info: found 55 asset files info: >> generation finished with success in 149.9s ! debug: 0.0% of time spent in "call.hook.init" (1 times) : debug: > Total: 0ms | Average: 0ms debug: > Min: 0ms | Max: 0ms debug: --------------------------- debug: 0.0% of time spent in "call.hook.finish:before" (1 times) : debug: > Total: 1ms | Average: 1ms debug: > Min: 1ms | Max: 1ms debug: --------------------------- debug: 0.0% of time spent in "call.hook.config" (1 times) : debug: > Total: 2ms | Average: 2ms debug: > Min: 2ms | Max: 2ms debug: --------------------------- debug: 0.0% of time spent in "parse.listAssets" (1 times) : debug: > Total: 39ms | Average: 39ms debug: > Min: 39ms | Max: 39ms debug: --------------------------- debug: 0.0% of time spent in "call.hook.finish" (1 times) : debug: > Total: 45ms | Average: 45ms debug: > Min: 45ms | Max: 45ms debug: --------------------------- debug: 0.0% of time spent in "plugins.findForBook" (1 times) : debug: > Total: 66ms | Average: 66ms debug: > Min: 66ms | Max: 66ms debug: --------------------------- debug: 0.1% of time spent in "plugin.load" (6 times) : debug: > Total: 175ms | Average: 29ms debug: > Min: 1ms | Max: 124ms debug: --------------------------- debug: 0.1% of time spent in "call.hook.page:before" (221 times) : debug: > Total: 201ms | Average: 1ms debug: > Min: 0ms | Max: 5ms debug: --------------------------- debug: 0.2% of time spent in "parse.listPages" (1 times) : debug: > Total: 238ms | Average: 238ms debug: > Min: 238ms | Max: 238ms debug: --------------------------- debug: 0.3% of time spent in "parse.book" (1 times) : debug: > Total: 383ms | Average: 383ms debug: > Min: 383ms | Max: 383ms debug: --------------------------- debug: 0.3% of time spent in "call.hook.page" (221 times) : debug: > Total: 411ms | Average: 2ms debug: > Min: 0ms | Max: 12ms debug: --------------------------- debug: 5.2% of time spent in "page.generate" (221 times) : debug: > Total: 7.83s | Average: 35ms debug: > Min: 13ms | Max: 148ms debug: --------------------------- debug: 86.7% of time spent in "template.render" (442 times) : debug: > Total: 131.61s | Average: 298ms debug: > Min: 0ms | Max: 1.36s debug: --------------------------- debug: 10.78s spent in non-mesured sections

lwchkg commented 8 years ago

Here's mine using an E-350. Most of them are slower with a notable exception of template.render. Looks like that part has a O(n^2) complexity in speed.

D:\Documents\SpiderOak Hive\gitbook\vbnet_intro>gitbook build --timing
info: 11 plugins are installed
info: 8 explicitly listed
info: loading plugin "sunlight-highlighter"... OK
info: loading plugin "include-codeblock"... OK
info: loading plugin "styles-less"... (node:4376) fs: re-evaluating native module sources is not supported. If you are u
sing the graceful-fs module, please update it to a more recent version.
OK
info: loading plugin "katex"... OK
info: loading plugin "search"... OK
info: loading plugin "lunr"... OK
info: loading plugin "sharing"... OK
info: loading plugin "theme-default"... OK
info: found 21 pages
info: found 42 asset files
info: compile less file:  styles/website.less
warn: "options" property is deprecated, use config.get(key) instead
warn: "options.output" property is deprecated, use "output.root()" instead
info: compile less file:  styles/pdf.less
info: compile less file:  styles/epub.less
info: compile less file:  styles/mobi.less
info: compile less file:  styles/ebook.less
warn: "this.generator" property is deprecated, use "this.output.name" instead
warn: "navigation" property is deprecated
warn: "book" property is deprecated, use "this" directly instead
info: >> generation finished with success in 161.5s !
debug: 0.0% of time spent in "call.hook.finish:before" (1 times) :
debug:     > Total: 3ms | Average: 3ms
debug:     > Min: 3ms | Max: 3ms
debug: ---------------------------
debug: 0.0% of time spent in "call.hook.config" (1 times) :
debug:     > Total: 10ms | Average: 10ms
debug:     > Min: 10ms | Max: 10ms
debug: ---------------------------
debug: 0.0% of time spent in "call.hook.finish" (1 times) :
debug:     > Total: 59ms | Average: 59ms
debug:     > Min: 59ms | Max: 59ms
debug: ---------------------------
debug: 0.0% of time spent in "parse.listAssets" (1 times) :
debug:     > Total: 85ms | Average: 85ms
debug:     > Min: 85ms | Max: 85ms
debug: ---------------------------
debug: 0.1% of time spent in "parse.listPages" (1 times) :
debug:     > Total: 153ms | Average: 153ms
debug:     > Min: 153ms | Max: 153ms
debug: ---------------------------
debug: 0.1% of time spent in "call.hook.page:before" (21 times) :
debug:     > Total: 193ms | Average: 9ms
debug:     > Min: 2ms | Max: 60ms
debug: ---------------------------
debug: 0.3% of time spent in "plugins.findForBook" (1 times) :
debug:     > Total: 448ms | Average: 448ms
debug:     > Min: 448ms | Max: 448ms
debug: ---------------------------
debug: 0.3% of time spent in "call.hook.page" (21 times) :
debug:     > Total: 495ms | Average: 24ms
debug:     > Min: 7ms | Max: 86ms
debug: ---------------------------
debug: 0.8% of time spent in "call.hook.init" (1 times) :
debug:     > Total: 1.29s | Average: 1.29s
debug:     > Min: 1.29s | Max: 1.29s
debug: ---------------------------
debug: 0.8% of time spent in "parse.book" (1 times) :
debug:     > Total: 1.40s | Average: 1.40s
debug:     > Min: 1.40s | Max: 1.40s
debug: ---------------------------
debug: 2.8% of time spent in "page.generate" (21 times) :
debug:     > Total: 4.83s | Average: 230ms
debug:     > Min: 80ms | Max: 720ms
debug: ---------------------------
debug: 3.8% of time spent in "plugin.load" (8 times) :
debug:     > Total: 6.47s | Average: 808ms
debug:     > Min: 13ms | Max: 5.00s
debug: ---------------------------
debug: 5.4% of time spent in "template.render" (42 times) :
debug:     > Total: 9.33s | Average: 222ms
debug:     > Min: 4ms | Max: 583ms
debug: ---------------------------
debug: 146.94s spent in non-mesured sections
joshrtay commented 8 years ago

How about support for incremental rebuilds when in development?

lwchkg commented 8 years ago

I don't think incremental build will help. The solution is to identify the O(n^2) part and make most of it O(n) or O(n log(n)).

Anyway devs: wonder if it will be easier to add/hire some extra developers to look into the issue instead of doing that yourself.

nagim commented 8 years ago

Incremental built would be helpful, but not related to this issue indeed. @lwchkg, do you have deep folder and / or ToC structure? I experience similar issue and that's the only thing I can think of.

AlexKasaku commented 8 years ago

171 pages, quite a slow build too:

debug: 0.0% of time spent in "call.hook.config" (1 times) :
debug:     > Total: 3ms | Average: 3ms
debug:     > Min: 3ms | Max: 3ms
debug: ---------------------------
debug: 0.0% of time spent in "call.hook.finish:before" (1 times) :
debug:     > Total: 4ms | Average: 4ms
debug:     > Min: 4ms | Max: 4ms
debug: ---------------------------
debug: 0.0% of time spent in "parse.listAssets" (1 times) :
debug:     > Total: 63ms | Average: 63ms
debug:     > Min: 63ms | Max: 63ms
debug: ---------------------------
debug: 0.0% of time spent in "call.hook.finish" (1 times) :
debug:     > Total: 70ms | Average: 70ms
debug:     > Min: 70ms | Max: 70ms
debug: ---------------------------
debug: 0.0% of time spent in "plugins.findForBook" (1 times) :
debug:     > Total: 71ms | Average: 71ms
debug:     > Min: 71ms | Max: 71ms
debug: ---------------------------
debug: 0.0% of time spent in "call.hook.init" (1 times) :
debug:     > Total: 110ms | Average: 110ms
debug:     > Min: 110ms | Max: 110ms
debug: ---------------------------
debug: 0.0% of time spent in "parse.listPages" (1 times) :
debug:     > Total: 223ms | Average: 223ms
debug:     > Min: 223ms | Max: 223ms
debug: ---------------------------
debug: 0.0% of time spent in "parse.book" (1 times) :
debug:     > Total: 280ms | Average: 280ms
debug:     > Min: 280ms | Max: 280ms
debug: ---------------------------
debug: 0.1% of time spent in "plugin.load" (8 times) :
debug:     > Total: 355ms | Average: 44ms
debug:     > Min: 2ms | Max: 173ms
debug: ---------------------------
debug: 0.1% of time spent in "call.hook.page:before" (171 times) :
debug:     > Total: 401ms | Average: 2ms
debug:     > Min: 1ms | Max: 29ms
debug: ---------------------------
debug: 0.3% of time spent in "call.hook.page" (171 times) :
debug:     > Total: 2.09s | Average: 12ms
debug:     > Min: 2ms | Max: 303ms
debug: ---------------------------
debug: 1.0% of time spent in "page.generate" (171 times) :
debug:     > Total: 6.26s | Average: 37ms
debug:     > Min: 16ms | Max: 427ms
debug: ---------------------------
debug: 9.1% of time spent in "template.render" (342 times) :
debug:     > Total: 58.85s | Average: 172ms
debug:     > Min: 0ms | Max: 1.18s
debug: ---------------------------
debug: 574.95s spent in non-mesured sections

Most of the time is non-measured.

AlexKasaku commented 8 years ago

Ok, so this might be helpful.

In the gitbook that we are generating (and my last comment provided timings for), the Glossary is particularly large at 123 entries. I just removed most of these, leaving 4, and the build is much faster:

debug: 0.0% of time spent in "call.hook.config" (1 times) :
debug:     > Total: 1ms | Average: 1ms
debug:     > Min: 1ms | Max: 1ms
debug: ---------------------------
debug: 0.0% of time spent in "call.hook.finish:before" (1 times) :
debug:     > Total: 2ms | Average: 2ms
debug:     > Min: 2ms | Max: 2ms
debug: ---------------------------
debug: 0.0% of time spent in "parse.listAssets" (1 times) :
debug:     > Total: 30ms | Average: 30ms
debug:     > Min: 30ms | Max: 30ms
debug: ---------------------------
debug: 0.1% of time spent in "call.hook.init" (1 times) :
debug:     > Total: 50ms | Average: 50ms
debug:     > Min: 50ms | Max: 50ms
debug: ---------------------------
debug: 0.1% of time spent in "call.hook.finish" (1 times) :
debug:     > Total: 56ms | Average: 56ms
debug:     > Min: 56ms | Max: 56ms
debug: ---------------------------
debug: 0.1% of time spent in "plugins.findForBook" (1 times) :
debug:     > Total: 63ms | Average: 63ms
debug:     > Min: 63ms | Max: 63ms
debug: ---------------------------
debug: 0.2% of time spent in "parse.listPages" (1 times) :
debug:     > Total: 171ms | Average: 171ms
debug:     > Min: 171ms | Max: 171ms
debug: ---------------------------
debug: 0.2% of time spent in "parse.book" (1 times) :
debug:     > Total: 187ms | Average: 187ms
debug:     > Min: 187ms | Max: 187ms
debug: ---------------------------
debug: 0.3% of time spent in "plugin.load" (8 times) :
debug:     > Total: 246ms | Average: 31ms
debug:     > Min: 1ms | Max: 118ms
debug: ---------------------------
debug: 0.4% of time spent in "call.hook.page:before" (171 times) :
debug:     > Total: 322ms | Average: 2ms
debug:     > Min: 1ms | Max: 7ms
debug: ---------------------------
debug: 1.9% of time spent in "call.hook.page" (171 times) :
debug:     > Total: 1.55s | Average: 9ms
debug:     > Min: 2ms | Max: 53ms
debug: ---------------------------
debug: 5.9% of time spent in "page.generate" (171 times) :
debug:     > Total: 4.70s | Average: 27ms
debug:     > Min: 14ms | Max: 72ms
debug: ---------------------------
debug: 63.2% of time spent in "template.render" (342 times) :
debug:     > Total: 50.64s | Average: 148ms
debug:     > Min: 0ms | Max: 551ms
debug: ---------------------------
debug: 22.11s spent in non-mesured sections

So it looks like the glossary highlighting could be an avenue for you to investigate when it comes to performance.

nagim commented 8 years ago

This might be it. I have more than 400 glossary items and for me, it takes more than 4 hours to generate, compared to the ~20 minutes with v2.

nagim commented 8 years ago

I am amazed how much impact it has on performance. My big repo generated in less than 2 minutes (!) without the glossary, instead of the usual 12-15 with v2.6.7.

Really need that optional case sensitivity for the glossary items. Now it's insensitive and it results one of my glossary item to be highlighted ~3000 times instead of 40. It's not only slowing the builds, but incorrect, too.

Template rendering is the next big performance issue in v3.x.x. The same build that took less than 2 minutes in v2, took ~50 minutes in v3 because of the template rendering.

debug: 0.0% of time spent in "call.hook.finish:before" (1 times) :
debug:     > Total: 1ms | Average: 1ms
debug:     > Min: 1ms | Max: 1ms
debug: ---------------------------
debug: 0.0% of time spent in "call.hook.config" (1 times) :
debug:     > Total: 1ms | Average: 1ms
debug:     > Min: 1ms | Max: 1ms
debug: ---------------------------
debug: 0.0% of time spent in "call.hook.init" (1 times) :
debug:     > Total: 4ms | Average: 4ms
debug:     > Min: 4ms | Max: 4ms
debug: ---------------------------
debug: 0.0% of time spent in "call.hook.finish" (1 times) :
debug:     > Total: 378ms | Average: 378ms
debug:     > Min: 378ms | Max: 378ms
debug: ---------------------------
debug: 0.0% of time spent in "plugins.findForBook" (1 times) :
debug:     > Total: 476ms | Average: 476ms
debug:     > Min: 476ms | Max: 476ms
debug: ---------------------------
debug: 0.0% of time spent in "call.hook.page:before" (468 times) :
debug:     > Total: 753ms | Average: 2ms
debug:     > Min: 0ms | Max: 17ms
debug: ---------------------------
debug: 0.0% of time spent in "parse.listAssets" (1 times) :
debug:     > Total: 1.42s | Average: 1.42s
debug:     > Min: 1.42s | Max: 1.42s
debug: ---------------------------
debug: 0.0% of time spent in "plugin.load" (15 times) :
debug:     > Total: 1.43s | Average: 95ms
debug:     > Min: 4ms | Max: 703ms
debug: ---------------------------
debug: 0.1% of time spent in "parse.listPages" (1 times) :
debug:     > Total: 2.24s | Average: 2.24s
debug:     > Min: 2.24s | Max: 2.24s
debug: ---------------------------
debug: 0.1% of time spent in "parse.book" (1 times) :
debug:     > Total: 2.75s | Average: 2.75s
debug:     > Min: 2.75s | Max: 2.75s
debug: ---------------------------
debug: 0.5% of time spent in "call.hook.page" (468 times) :
debug:     > Total: 13.56s | Average: 29ms
debug:     > Min: 0ms | Max: 438ms
debug: ---------------------------
debug: 2.5% of time spent in "page.generate" (468 times) :
debug:     > Total: 73.17s | Average: 156ms
debug:     > Min: 57ms | Max: 1.38s
debug: ---------------------------
debug: 93.1% of time spent in "template.render" (936 times) :
debug:     > Total: 2767.20s | Average: 2.96s
debug:     > Min: 0ms | Max: 11.70s
debug: ---------------------------
debug: 108.20s spent in non-mesured sections
lwchkg commented 8 years ago

This might be it. I have more than 400 glossary items and for me, it takes more than 4 hours to generate, compared to the ~20 minutes with v2.

Looks like @SamyPesse needs to implement a decent string matching algorithm with Glossary. With something like Aho-Corasick algorithm matching 400 terms is not much more expensive than matching one term.

nagim commented 8 years ago

@lwchkg I think it's more closely related to the total number of hits in the texts (and applying classes, stlyes, etc.) rather than the sole number of entries.

As I mentioned, one of my items has ~3.000 hits alone (because of case insensitivity). The >400 entries of mine results in total >15.000 hits in the texts.

kogakure commented 7 years ago

I came across this issue, because my Gitbook rendering got slow in the last few days. Deactivating the Glossary brought the rendering time down from 41 seconds to 3 seconds. I have ~100 glossary terms, but a lot of repeated terms per article. In my book I repeat some words over and over (sometimes 50+). It looks awkward seeing every mention of this word underlined, and it’s unnecessary. If somebody reads a page it should be clear after the first mention, what the word means.

My suggestion is to only highlight one word per page per entry. This could speed up the build process a lot.

istommao commented 7 years ago

CLI version: 2.3.0 GitBook version: 3.2.2

info: >> generation finished with success in 17.1s !

debug: 0.0% of time spent in "call.hook.finish:before" (1 times) : debug: > Total: 0ms | Average: 0ms debug: > Min: 0ms | Max: 0ms debug: --------------------------- debug: 0.0% of time spent in "call.hook.init" (1 times) : debug: > Total: 0ms | Average: 0ms debug: > Min: 0ms | Max: 0ms debug: --------------------------- debug: 0.0% of time spent in "call.hook.config" (1 times) : debug: > Total: 3ms | Average: 3ms debug: > Min: 3ms | Max: 3ms debug: --------------------------- debug: 0.1% of time spent in "parse.listAssets" (1 times) : debug: > Total: 19ms | Average: 19ms debug: > Min: 19ms | Max: 19ms debug: --------------------------- debug: 0.1% of time spent in "call.hook.finish" (1 times) : debug: > Total: 22ms | Average: 22ms debug: > Min: 22ms | Max: 22ms debug: --------------------------- debug: 0.3% of time spent in "call.hook.page:before" (97 times) : debug: > Total: 50ms | Average: 1ms debug: > Min: 0ms | Max: 3ms debug: --------------------------- debug: 0.4% of time spent in "plugins.findForBook" (1 times) : debug: > Total: 70ms | Average: 70ms debug: > Min: 70ms | Max: 70ms debug: --------------------------- debug: 0.7% of time spent in "parse.listPages" (1 times) : debug: > Total: 131ms | Average: 131ms debug: > Min: 131ms | Max: 131ms debug: --------------------------- debug: 1.1% of time spent in "plugin.load" (6 times) : debug: > Total: 200ms | Average: 33ms debug: > Min: 2ms | Max: 138ms debug: --------------------------- debug: 1.2% of time spent in "parse.book" (1 times) : debug: > Total: 216ms | Average: 216ms debug: > Min: 216ms | Max: 216ms debug: --------------------------- debug: 1.5% of time spent in "call.hook.page" (97 times) : debug: > Total: 288ms | Average: 3ms debug: > Min: 1ms | Max: 9ms debug: --------------------------- debug: 7.1% of time spent in "page.generate" (97 times) : debug: > Total: 1.33s | Average: 14ms debug: > Min: 7ms | Max: 59ms debug: --------------------------- debug: 68.4% of time spent in "template.render" (194 times) : debug: > Total: 12.82s | Average: 66ms debug: > Min: 0ms | Max: 305ms debug: --------------------------- debug: 3.58s spent in non-mesured sections

erlend-sh commented 7 years ago

It's a long shot but, any chance you could use Rust (with Neon) to address performance bottlenecks? The mdBook project by @azerupi could serve as a great point of reference.

leithlin commented 7 years ago

i`m writing a book which has 900 pages, i nearly takes me about 1 hour to generate it .Is there anyone can tell me how to make it faster?

baijifeilong commented 6 years ago

I have 3000 pages to build. 5 minutes per page. So I need about 10 days to build it ......

Lessica commented 6 years ago

Building a GitBook costs more than building an operating system...

Lessica commented 6 years ago

I do not know why it cannot process pages in linear complexity...

syntholly commented 6 years ago

Has there been an update on this issue elsewhere? ~300 pages takes 10 minute an edit. It's killing me..

harttle commented 6 years ago

Since most of the build time is consumed by templating, can we provide an option like disableTemplate: true.

I know the SUMMARY.md, maybe plugins as well, depends on templating. But for most of my projects, all the markdown files wanted is a markdown engine, not nunjucks.

lijiaocn commented 5 years ago

terrible