GitbookIO / gitbook-convert

CLI to convert an existing document to a GitBook.

"process out of memory" when trying to convert a large document #15

Open istepheny opened 8 years ago

istepheny commented 8 years ago

Hello there, I was trying to convert an XML file but got a "process out of memory" error. The file is about 40 MB (1,000,000+ lines) and is well-formed DocBook version 5. The process crashed at the "Writing docbook.css for set(index)" stage.

Environment:
- OS: Fedora 23 64-bit
- RAM: 8 GB
- node: v4.4.4
- npm: 3.9.0

$ gitbook-convert -d manual.xml

Creating export folder...
Creating assets folder...
Creating summary file...
Done.
Converting Docbook to HTML...
Writing docbook.css for set(index)

<--- Last few GCs --->
  135481 ms: Scavenge 1396.9 (1454.7) -> 1396.9 (1454.7) MB, 0.5 / 0 ms (+ 2.0 ms in 1 steps since last GC) [allocation failure] [incremental marking delaying mark-sweep].
  136728 ms: Mark-sweep 1396.9 (1454.7) -> 1396.9 (1454.7) MB, 1246.7 / 0 ms (+ 3.0 ms in 2 steps since start of marking, biggest step 2.0 ms) [last resort gc].
  138164 ms: Mark-sweep 1396.9 (1454.7) -> 1396.9 (1454.7) MB, 1436.1 / 0 ms [last resort gc].

<--- JS stacktrace --->
==== JS stack trace =========================================
Security context: 0xb3cbc7b4629 <JS Object>
    2: renderTag [/usr/lib/node_modules/gitbook-convert/node_modules/cheerio/node_modules/dom-serializer/index.js:~127] [pc=0x2c0a7b2f84a4] (this=0x2e6add8098d9 <JS Global Object>,elem=0x12680426519 <an Object with map 0xa610b953f11>,opts=0x22f6e6b7ff99 <an Object with map 0xa610b952a19>)
    3: /* anonymous */ [/usr/lib/node_modules/gitbook-convert/node_modules/cheerio/node_modules/dom-serializ...

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory

I was wondering whether the file is so big that the workload exceeds the memory limit of the V8 engine, so I tried running node with the flags "--max_old_space_size=6000 --max_executable_size=6000" to raise the limit. It still didn't work and finally crashed at the "Parsing chapters..." stage.

$ node --max_old_space_size=6000 --max_executable_size=6000 /lib/node_modules/gitbook-convert/bin/gitbook-convert.js -d manual.xml

Creating export folder...
Creating assets folder...
Creating summary file...
Done.
Converting Docbook to HTML...

Writing docbook.css for set(index)

Extracting footnotes...
Parsing chapters...

<--- Last few GCs --->
  976222 ms: Scavenge 5996.5 (6080.2) -> 5996.5 (6080.2) MB, 1.3 / 0 ms (+ 1.6 ms in 1 steps since last GC) [allocation failure] [incremental marking delaying mark-sweep].
  978783 ms: Mark-sweep 5996.5 (6080.2) -> 5996.5 (6080.2) MB, 2560.4 / 0 ms (+ 2.5 ms in 2 steps since start of marking, biggest step 1.6 ms) [last resort gc].
  981338 ms: Mark-sweep 5996.5 (6080.2) -> 5996.5 (6080.2) MB, 2554.9 / 0 ms [last resort gc].

<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0x1093594b4629 <JS Object>
    2: renderTag(aka renderTag) [/usr/lib/node_modules/gitbook-convert/node_modules/cheerio/node_modules/dom-serializer/index.js:~127] [pc=0x3433dc863764] (this=0x1093594041b9 <undefined>,elem=0x27d6ae1be571 <an Object with map 0x2520b0243451>,opts=0x3312b08436b9 <an Object with map 0x2520b02452e9>)
    3: /* anonymous */ [/usr/lib/node_modules/gitbook-convert/node_modules/cheerio/node_modules/...
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory
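
For what it's worth, the flag does seem to take effect: the GC log above shows the heap growing to roughly 6 GB before the crash. A quick sanity check of the effective V8 heap limit for a given flag value can be done with a one-liner like the following (a minimal sketch using Node's built-in v8 module, which reports heap_size_limit in bytes; this check is not part of gitbook-convert):

$ # prints the approximate heap limit V8 will allow for this process
$ node --max_old_space_size=6000 -e "console.log(require('v8').getHeapStatistics().heap_size_limit / 1024 / 1024 + ' MB')"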

To confirm there is no format error in my XML document, I picked out a single chapter from the document, and that piece was converted successfully:

$ gitbook-convert -d test.xml 
Creating export folder...
Creating assets folder...
Creating summary file...
Done.
Converting Docbook to HTML...
Writing docbook.css for set(index)

Extracting footnotes...
Parsing chapters...
Processing chapters...
Converting chapters to markdown...
Writing summary...
Writing file: /opt/export/README.md
Writing file: /opt/export/manual/README.md
Writing file: /opt/export/manual/preface/README.md
Writing file: /opt/export/manual/preface/authors_and_contributors/README.md
Writing file: /opt/export/manual/preface/authors_and_contributors/authors_and_editors.md
Writing file: /opt/export/getting_started/README.md
Writing file: /opt/export/getting_started/1_introduction/README.md
...
...
Done.

It seems I just don't have enough RAM to convert the whole document. Is there any way to work around this without buying more RAM? :joy: