GitbookIO / gitbook

The open source frontend for GitBook doc sites
https://www.gitbook.com
GNU General Public License v3.0

Change PDF generation engine to either wkhtmltopdf or phantom.js #1470

Closed lwchkg closed 7 months ago

lwchkg commented 8 years ago

The current PDF generation engine, calibre, has too many limitations, e.g.

Proposed solutions

I've also heard of a new product called WeasyPrint, but it does not support @font-face yet, which is a turn-off.

Please discuss. Thanks!

lwchkg commented 8 years ago

Since I wanted this badly, I've coded a PDF generator with wkhtmltopdf myself. Now it has a table of contents and some nice header/footer.

Code and sample: vbnet_intro.zip Generated PDF: book_wk.pdf Comparison - PDF by GitBook + Calibre: book.pdf

Feel free to discuss and use the code if you're interested.

GeorgBraunHM commented 8 years ago

@lwchkg Thank you so much for providing vbnet_intro.zip. I would like to give it a try. I downloaded and extracted the zip, then ran npm install and gitbook install. Next, I ran node gen_pdf_wk.js, but this resulted in the following error in line 153:

    Promise.all([...assetMap].map(([src, dest]) => {
                                   ^

SyntaxError: Unexpected token [
    at exports.runInThisContext (vm.js:53:16)
    at Module._compile (module.js:387:25)
    at Object.Module._extensions..js (module.js:422:10)
    at Module.load (module.js:357:32)
    at Function.Module._load (module.js:314:12)
    at Function.Module.runMain (module.js:447:10)
    at startup (node.js:148:18)
    at node.js:405:3

If I run genpdf.bat, I get the same error. What would be the right way to run your PDF conversion?

I am on Windows 8.1 with node v5.12.0 and npm v3.8.6.

Many thanks and best regards, Georg

lwchkg commented 8 years ago

@GeorgBraunHM Oh. I made the program with node 6. Maybe you can upgrade to node 6 and try again.
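
For readers who cannot upgrade right away: the failing line combines array spread with destructuring in the arrow-function parameters, and the caret in the stack trace points at the destructuring bracket. Below is a minimal sketch of the same kind of iteration written without destructuring. It assumes `assetMap` is a Map of source path to destination path; `assetPairs`, `processAssets` and `handler` are hypothetical names for illustration and are not taken from gen_pdf_wk.js.

```javascript
// Turn a Map of src -> dest into an array of { src, dest } objects
// without using parameter destructuring.
function assetPairs(assetMap) {
  return Array.from(assetMap).map(function (pair) {
    return { src: pair[0], dest: pair[1] };
  });
}

// Call handler(src, dest) for every asset and wait for all of them.
// The handler is assumed to return a Promise.
function processAssets(assetMap, handler) {
  return Promise.all(assetPairs(assetMap).map(function (asset) {
    return handler(asset.src, asset.dest);
  }));
}
```

Upgrading to Node 6 remains the simpler route, since the rest of the script may rely on other ES6 features as well.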

lwchkg commented 8 years ago

Here's the updated sample and the generated PDF. Code and sample book: vbnet_intro.zip. Generated PDF: book_wk.pdf. The installation instructions are:

npm install -g svgexport
npm install
gitbook install

Then run genpdf (Windows only) or node gen_pdf_wk.js. The most important change in the script is to allow more time for JavaScript code to run, which is needed to get the header to show properly. If the page headers do not show up, add the argument --javascript-delay 5000 (the unit is ms).
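
To illustrate how such a delay ends up on the wkhtmltopdf command line, here is a rough sketch of building the argument list and launching the tool from Node. `buildWkArgs` and `runWkhtmltopdf` are hypothetical helpers written for this example, not the code in gen_pdf_wk.js, and the sketch assumes wkhtmltopdf is available on PATH.

```javascript
const childProcess = require('child_process');

// Build the argument list for wkhtmltopdf.
// delayMs is passed to --javascript-delay (milliseconds) as a page option.
function buildWkArgs(inputHtml, outputPdf, delayMs) {
  const args = [inputHtml];
  if (delayMs > 0) {
    args.push('--javascript-delay', String(delayMs));
  }
  args.push(outputPdf);
  return args;
}

// Launch wkhtmltopdf (assumed to be on PATH) and resolve with its exit code.
function runWkhtmltopdf(inputHtml, outputPdf, delayMs) {
  return new Promise((resolve, reject) => {
    const child = childProcess.spawn(
      'wkhtmltopdf',
      buildWkArgs(inputHtml, outputPdf, delayMs),
      { stdio: 'inherit' }
    );
    child.on('error', reject); // e.g. the executable could not be started
    child.on('close', resolve);
  });
}
```

A call such as `runWkhtmltopdf('book.html', 'book.pdf', 5000)` would then correspond to the 5000 ms delay mentioned above; the file names here are placeholders.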

As you may notice the SVGs in the above PDF are defective, and a few icons are missing. Personally I run with a modded GitBook myself to render the svgs (we don't need svg->png conversion) and added a real cover page. Here is the final product. :-) vb2015 part 1.pdf

GeorgBraunHM commented 8 years ago

@lwchkg Thanks a lot for the update. I upgraded to nodejs v6.5.0 (64-bit version for Windows) and ran genpdf on your latest zip (from Sept. 3). I got a nice-looking PDF including a TOC with page numbers. Taking a closer look, some chapters include a header, others don't. Therefore I ran genpdf --javascript-delay 5000, and the headers are now included for all chapters. The PDF really looks exactly like yours on https://github.com/GitbookIO/gitbook/files/453486/book_wk.pdf. Awesome!

I have a few more questions, if you don't mind:

  1. I currently use wkhtmltopdf 0.12.3.1 (with patched qt), 32-bit. It seems that your pdf is created with wkhtmltopdf version 0.12.3.2. Are you using the 32-bit or 64-bit edition?
  2. My pdf has the same svg flaws as yours. How did you patch gitbook to get to your pdf at https://github.com/GitbookIO/gitbook/files/453488/vb2015.part.1.pdf?
  3. How did you add the title page on https://github.com/GitbookIO/gitbook/files/453488/vb2015.part.1.pdf?

Many thanks and best regards, Georg

jonahfang commented 8 years ago

No pdf file generated:

root@c8d99232f630:/gitbook# node gen_pdf_wk.js %*
Running GitBook:
info: 11 plugins are installed
info: 8 explicitly listed
info: loading plugin "sunlight-highlighter"... OK
info: loading plugin "include-codeblock"... OK
info: loading plugin "styles-less"... OK
info: loading plugin "katex"... OK
info: loading plugin "search"... OK
info: loading plugin "lunr"... OK
info: loading plugin "sharing"... OK
info: loading plugin "theme-default"... OK
info: found 21 pages
info: found 23 asset files
info: compile less file:  styles/website.less
warn: "options" property is deprecated, use config.get(key) instead
warn: "options.output" property is deprecated, use "output.root()" instead
info: compile less file:  styles/pdf.less
info: compile less file:  styles/epub.less
info: compile less file:  styles/mobi.less
info: compile less file:  styles/ebook.less
warn: "this.generator" property is deprecated, use "this.output.name" instead
warn: "navigation" property is deprecated
warn: "book" property is deprecated, use "this" directly instead
info: >> generation finished with success in 34.1s !
Processing LESS asset: src/styles/wk_headerfooter.less => _ebook/wk_headerfooter.less
Processing LESS asset: src/styles/wk_toc.less => _ebook/wk_toc.less
Processing XSL asset: src/styles/wk_toc.xsl => _ebook/wk_toc.xsl
Copying asset: src/cover.html => _ebook/cover.html
Copying asset: src/styles/wk_header.html => _ebook/wk_header.html
Launching wkhtmltopdf:

I use the latest code (from Sept. 3), but run gitbook from a Docker container:

docker run \
 --rm \
 -it \
 -v $PWD:/gitbook \
 fangzx/gitbook:2.0 /bin/bash

then:

node gen_pdf_wk.js %*

My Dockerfile looks like this:

FROM node:6

RUN apt-get update && \
    apt-get install -y unzip  && \
    npm install gitbook-cli -g && \
    npm install svgexport -g && \
    apt-get clean && \
    rm -rf /var/cache/apt/* /var/lib/apt/lists/*

RUN apt-get update && apt-get install -y fonts-arphic-gbsn00lp

# install gitbook versions
RUN gitbook fetch 3.2.0

ENV BOOKDIR /gitbook

VOLUME $BOOKDIR

EXPOSE 4000

WORKDIR $BOOKDIR

CMD ["gitbook", "--help"]

#EOP

lwchkg commented 8 years ago

@GeorgBraunHM

  1. 64-bit edition of wkhtmltopdf 0.12.3.2 for Windows. (I tried Linux also, but PT Mono rendered poorly there. It appears the web font from Paratype works poorly, while web fonts prepared by SquirrelFonts work well.)
  2. Assuming you're using GitBook 3.2.0, the file you want to change is C:\Users\[your user name]\.gitbook\versions\3.2.0\lib\output\ebook\ Change the content of the function onPage(output, page) to return WebsiteGenerator.onPage(output, page);
  3. The cover page? Use Illustrator (or any program) to draw a cover, save as PDF. Then use a PDF software (e.g. http://angusj.com/pdftkb/ ) to join the PDFs together.

@jonahfang It appears that you've forgotten to install wkhtmltopdf. It should exist in your PATH. (Just a note: different people want to install different versions of wkhtmltopdf, because none of them is really stable.)
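
A quick way to confirm whether an executable is reachable is to walk the PATH directories yourself. The following is only a sketch under stated assumptions: `findOnPath` is a hypothetical helper, not part of the generator script, and the extension list is something the caller has to choose for the platform.

```javascript
const fs = require('fs');
const path = require('path');

// Return the full path of `name` if it is found in one of the PATH
// directories, otherwise null. `exists` is injectable so the lookup
// can be exercised without touching the disk.
function findOnPath(name, envPath, extensions, exists) {
  const check = exists || fs.existsSync;
  const dirs = (envPath || '').split(path.delimiter).filter(Boolean);
  for (const dir of dirs) {
    for (const ext of extensions) {
      const candidate = path.join(dir, name + ext);
      if (check(candidate)) {
        return candidate;
      }
    }
  }
  return null;
}

// Example call: '' covers Unix-like systems, '.exe' covers Windows.
// const found = findOnPath('wkhtmltopdf', process.env.PATH, ['', '.exe']);
```

If the call returns null, wkhtmltopdf is either not installed or its directory is missing from PATH.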

jonahfang commented 8 years ago

@lwchkg , thank you very much, it works.

GeorgBraunHM commented 8 years ago

@lwchkg thanks for your answers.

I will give the return WebsiteGenerator.onPage(output, page); a try somewhat later. For many (local) books, I am still on GitBook 2.6.7. Starting with 3.x, I cannot view the books via file:///... any longer. gitbook serve works, but I am providing my students with the HTML book via a simple file server, so I have to stick with file:///....

For GitBook 2.6.7, there is a flag this.convertImages = true; within file C:\Users\UserName\.gitbook\versions\2.6.7\lib\generators\ebook.js. Maybe, setting this to false might help (I didn't try it yet).

In the meantime, I have applied your genpdf.bat to a book which is not sitting in a root folder (like src in your example). If I remove the line "root": "src", within book.json, the script fails. I have fixed this by changing line 218 in file gen_pdf_wk.js from config.root = rawConfig.root; to config.root = rawConfig.root ? rawConfig.root : "./";
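
The fallback described above, shown as a standalone sketch. `resolveRoot` is a hypothetical helper introduced here for illustration; in gen_pdf_wk.js the assignment is a single inline statement, and `rawConfig` is assumed to be the parsed content of book.json.

```javascript
// Fall back to the current directory when book.json has no "root" entry.
function resolveRoot(rawConfig) {
  return rawConfig.root ? rawConfig.root : './';
}

// Parse the text of a book.json file and return the effective root folder.
function rootFromBookJson(jsonText) {
  return resolveRoot(JSON.parse(jsonText));
}
```

With this, a book.json containing `"root": "src"` yields `src`, and one without a `root` key yields `./`.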

xuv commented 8 years ago

For GitBook 2.6.7, there is a flag this.convertImages = true; within file C:\Users\UserName\.gitbook\versions\2.6.7\lib\generators\ebook.js. Maybe, setting this to false might help (I didn't try it yet).

Tried with the option and indeed it works as intended. SVGs are crisp and part of the PDF (still tested with ebook-convert from Calibre).

jonathanpberger commented 7 years ago

@lwchkg thanks so much for doing this! I found an npm-installable version here: https://github.com/lwchkg/gitbook-pdfgen

oxFilla commented 7 years ago

I'm getting this error when using gitbook-pdfgen:

$ gitbook-pdfgen --help

/usr/local/lib/node_modules/gitbook-pdfgen/gen_pdf_wk.js:4
const childProcess = require('child_process');
^^^^^
SyntaxError: Use of const in strict mode.
    at Module._compile (module.js:439:25)
    at Object.Module._extensions..js (module.js:474:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Function.Module.runMain (module.js:497:10)
    at startup (node.js:119:16)
    at node.js:902:3

lwchkg commented 7 years ago

@oxid-filla You're likely running a very old version of node.js. Please update to a recent version. The error message indicates that your node.js installation doesn't support ES6.
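
One way to turn such a SyntaxError into a readable message is a small version guard. This is a sketch only: the helper names are hypothetical, and the minimum of 6 is an assumption based on the script having been written for Node 6. The guard is deliberately written in ES5 so that very old Node versions can still parse it.

```javascript
// Extract the major version number from a string such as "v6.5.0".
function nodeMajor(versionString) {
  return parseInt(String(versionString).replace(/^v/, '').split('.')[0], 10);
}

// True when the given Node version is at least minimumMajor.
function supportsScript(versionString, minimumMajor) {
  return nodeMajor(versionString) >= minimumMajor;
}

// Assumption: Node 6 is taken as the minimum supported version.
if (!supportsScript(process.version, 6)) {
  console.error('Node ' + process.version + ' is too old; please upgrade to Node 6 or later.');
}
```

Because syntax errors are raised when a file is parsed, a guard like this only helps if it lives in a separate ES5 entry file that is loaded before the ES6 script.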

oxFilla commented 7 years ago

OK, now it runs, but there are two new problems. In my project, and also with your sample code, I get the header with page number only on the first page after the TOC and never again. After putting your styles and book.json in my project, I only get a TOC when I remove this line from book.json:

"tocXsl": "styles/wk_toc.xsl",

I'm using an ordinary SUMMARY.md.

MuyNooB commented 6 years ago

Thanks for your zip. The zip works great, but I have a question: does the generator only support adoc? I tested my md book and got blank pages except for the summary. How can I change the code to support md books? (Sorry for my bad English, I hope you can read it, haha.)

lwchkg commented 6 years ago

@MuyNooB Are you referring to me? Anyway the generator does only recognize ".html" in the output, so whether your content is ".adoc" or ".md" it shouldn't really matter. If you don't mind, you can send me the book so I can try to reproduce the error.
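
To make the point concrete, here is a rough sketch of what "only recognizing .html in the output" can look like. `htmlPages` is a hypothetical function written for this explanation and is not the plugin's actual code.

```javascript
const path = require('path');

// Keep only the rendered .html pages from a list of output file names.
// The source format (.adoc or .md) never appears here, because GitBook
// has already converted every page to HTML by this stage.
function htmlPages(fileNames) {
  return fileNames.filter((name) => path.extname(name).toLowerCase() === '.html');
}
```

For example, given `['index.html', 'styles/pdf.css', 'ch1/intro.html']`, only the two `.html` entries are kept.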

BTW, this is the place for the official gitbook repository. If you're talking about my plugin, it's better to post an issue on https://github.com/lwchkg/gitbook-pdfgen/issues instead.

OVGav74 commented 6 years ago

@lwchkg For those of us using gitbook.com, and with no coding background, can what you've done be published as a GitBook plugin so we can use it too?

bbinet commented 5 years ago

FYI WeasyPrint now supports font-face and table of contents with page numbers, but I've not tried to use it yet.