facebook / docusaurus

Easy to maintain open source documentation websites.
https://docusaurus.io
MIT License

PDF Download #969

Open cooltrooper opened 6 years ago

cooltrooper commented 6 years ago

🚀 Feature

The ability for users to download an offline version of the document.

This would be ideal as a PDF with title page and table of contents.

Have you read the Contributing Guidelines on issues?

Yes

Motivation

Interested in using this as a documentation solution for client products, but some clients require a PDF for print.

Pitch

A download button could be added to the top of the side nav; clicking it would download the file.

haraldur12 commented 6 years ago

Could we use something like react-pdf ? I might be able to take a look at it.

cooltrooper commented 6 years ago

Sorry, I may have explained this poorly. I believe react-pdf renders PDFs on the web page. What I'm thinking of is a way to export your documents to PDF as an additional format.

Like sphinx has PDF generation via LaTeX

haraldur12 commented 6 years ago

We could also achieve this with the html2canvas module, I think. There are a few ways to do it, but I am not sure which one would be optimal. If there is interest in it, I could take a look.

Zwitty commented 5 years ago

It might be possible to generate PDFs during the build using something like pandoc and then serve them with the document.

cooltrooper commented 5 years ago

Concatenate the markdown files, then convert them to PDF with pandoc?

tusharf5 commented 5 years ago

We could use something like markdown-pdf to create pdf documents for markdown files during the build process and serve them. The URL for each pdf would be the same as the document one but with an extension ".pdf". For translated documents too.

We could also make this feature optional and turned off by default, controlled by some siteConfig key. I can look into it if it's still a requested feature.
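A rough sketch of that URL scheme, assuming a hypothetical `pdfExport` siteConfig key (it is not an existing Docusaurus option) and a hypothetical helper name `pdfUrlFor`:

```javascript
// Hypothetical helper: map a document URL to its PDF counterpart.
// `pdfExport` is an assumed siteConfig key, not an existing Docusaurus option.
function pdfUrlFor(docUrl, siteConfig = {}) {
  if (!siteConfig.pdfExport) {
    return null; // feature is off unless explicitly enabled
  }
  // Drop a trailing slash or ".html" so both URL styles map to the same PDF.
  const base = docUrl.replace(/\/$/, '').replace(/\.html$/, '');
  return `${base}.pdf`;
}

console.log(pdfUrlFor('/docs/en/installation', { pdfExport: true })); // → /docs/en/installation.pdf
console.log(pdfUrlFor('/docs/ko/installation.html', { pdfExport: true })); // → /docs/ko/installation.pdf
console.log(pdfUrlFor('/docs/en/installation')); // → null
```

Translated documents need no special handling here, since the language is already part of the document URL.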

sourabhxyz commented 5 years ago

Since I wanted this feature, I implemented it using Travis. See 1, 2, 3. Of course, you would need to change it according to your needs. As might be evident from one of these files, I am pushing the generated PDF files here.

BenHadman commented 4 years ago

Could you please go into more detail on how I could implement this? Sorry, I'm new to React and web dev in general.

sourabhxyz commented 4 years ago

Apologies for the late reply. If you want to understand my approach, start by learning about Travis and how to convert markdown to PDF using pandoc. Essentially, whenever I make a commit to my docusaurus site, my Travis script converts the markdown files listed in sidebars.json to PDF, merges them, and pushes the result to my other repository.

reflectively commented 4 years ago

Would love to see this feature!

dheerajmpai commented 4 years ago

Actually React-pdf can do the job. We need to work on it

braco commented 4 years ago

Lack of this feature, or single-page output, is why my company has to move away from Docusaurus.

yangshun commented 4 years ago

@braco thanks for the feedback. What will you be moving to?

kohheepeace commented 4 years ago

@yangshun Hi, I implemented a Node script to generate a PDF file from the docs.

This is the generated PDF: https://drive.google.com/file/d/19P3qSwLLUHYigrxH3QXIMXmRpTFi4pKB/view

Asking

Approach

I imitated the mdx-deck approach.

With this approach, the author needs to generate the PDF locally and serve it manually.

  1. Start the docusaurus project at localhost
  2. Run the node script to generate the PDF
  3. Serve the generated PDF file from the static folder
  4. Add a link to the PDF file on the client side so that readers can access it.

*Discussion about this approach: https://github.com/jxnblk/mdx-deck/issues/141

Demo

  1. Run the docusaurus oss site at http://localhost:3000/
  2. Make hoge.js at the root of the oss project
    • NOTE!: this code currently only works with the oss docusaurus project
    • please install "puppeteer", "hummus", "memory-streams"

      hoge.js

      
      const puppeteer = require('puppeteer');
      const { PDFRStreamForBuffer, createWriterToModify, PDFStreamForResponse } = require('hummus');
      const { WritableStream } = require('memory-streams');
      const fs = require('fs');

      // Merge a list of PDF buffers into a single PDF buffer, in order.
      const mergePdfBlobs = (pdfBlobs) => {
        const outStream = new WritableStream();
        const [firstPdfRStream, ...restPdfRStreams] = pdfBlobs.map(pdfBlob => new PDFRStreamForBuffer(pdfBlob));
        const pdfWriter = createWriterToModify(firstPdfRStream, new PDFStreamForResponse(outStream));

        restPdfRStreams.forEach(pdfRStream => pdfWriter.appendPDFPagesFromPDF(pdfRStream));

        pdfWriter.end();
        outStream.end();

        return outStream.toBuffer();
      };

      let generatedPdfBlobs = [];

      (async () => {
        const browser = await puppeteer.launch();
        let page = await browser.newPage();
        let nextPageUrl = 'http://localhost:3000/docs/introduction';

        while (nextPageUrl) {
          await page.goto(nextPageUrl, {waitUntil: 'networkidle2'});

          // Read the "next" link before the page content is replaced below.
          // The last page has no such link, which ends the loop.
          try {
            nextPageUrl = await page.$eval('.pagination-nav__item--next > a', (element) => {
              return element.href;
            });
          } catch (e) {
            nextPageUrl = null;
          }

          // Keep only the article element so that navigation is left out of the PDF.
          let html = await page.$eval('article', (element) => {
            return element.outerHTML;
          });

          await page.setContent(html);
          await page.addStyleTag({url: 'http://localhost:3000/styles.css'});
          await page.addScriptTag({url: 'http://localhost:3000/styles.js'});
          const pdfBlob = await page.pdf({path: "", format: 'A4', printBackground: true, margin : {top: 20, right: 15, left: 15, bottom: 20}});

          generatedPdfBlobs.push(pdfBlob);
        }
        await browser.close();

        const mergedPdfBlob = mergePdfBlobs(generatedPdfBlobs);
        fs.writeFileSync('hoge-final.pdf', mergedPdfBlob);
      })();

  3. Run `hoge.js`. This creates `hoge-final.pdf` in the project root.

```bash
node hoge.js
```

Thanks!

maxarndt commented 4 years ago

@KohheePeace could you explain what oss stands for in this context? I'm new to docusaurus and looking for a way to export docs as PDF, so I tried your demo code snippet. Unfortunately, the structure of my project, initialized with npx docusaurus-init, does not match your assumptions regarding stylesheet and JS files etc.

Thank you for sharing and working on this project!

kohheepeace commented 4 years ago

Hi @maxarndt, sorry for my poor explanation. "oss site" stands for https://github.com/facebook/docusaurus.

If you just made a site by running npx @docusaurus/init@next init my-website classic, you need to change nextPageUrl in hoge.js from http://localhost:3000/docs/introduction to http://localhost:3000/docs/doc1.

You need to specify the initial URL to start scraping from. Puppeteer then paginates automatically and generates the PDF. I hope it works in your environment!
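The pagination idea can be sketched on its own, separate from Puppeteer. In this sketch `collectDocUrls` and `getNextUrl` are hypothetical names. With Puppeteer, `getNextUrl` could wrap `page.goto(url)` followed by `page.$eval('.pagination-nav__item--next > a', a => a.href)`. The link map below is fake data for illustration only.

```javascript
// Follow "next" links from a starting URL and return the visited URLs in order.
// `getNextUrl` is a hypothetical async callback that resolves to the next URL
// or throws when the current page has no "next" link.
async function collectDocUrls(startUrl, getNextUrl, maxPages = 1000) {
  const visited = [];
  const seen = new Set();
  let url = startUrl;
  // Stop on a missing link, on a cycle, or when the safety limit is reached.
  while (url && !seen.has(url) && visited.length < maxPages) {
    seen.add(url);
    visited.push(url);
    try {
      url = await getNextUrl(url);
    } catch (e) {
      url = null; // no "next" link on the last page
    }
  }
  return visited;
}

// Fake link map standing in for a real site.
const fakeLinks = { '/docs/doc1': '/docs/doc2', '/docs/doc2': '/docs/doc3' };
collectDocUrls('/docs/doc1', async (url) => {
  if (!fakeLinks[url]) throw new Error('no next link');
  return fakeLinks[url];
}).then((urls) => console.log(urls.join('\n')));
```

Tracking visited URLs keeps the loop from running forever if a "next" link ever points back to an earlier page.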

kohheepeace commented 4 years ago

@maxarndt I made an npm package for generating the PDF: https://github.com/KohheePeace/docusaurus-pdf

jonsanjuan commented 4 years ago

Hi Kohhee. I was trying to install your npm package and it failed during the install process with the output below. I am also rather new to Docusaurus and pdf translations, so thanks in advance:

make: *** [Release/obj.target/hummus/src/PDFDictionaryDriver.o] Error 1
gyp ERR! build error
gyp ERR! stack Error: make failed with exit code: 2
gyp ERR! stack at ChildProcess.onExit (/usr/local/lib/node_modules/npm/node_modules/node-gyp/lib/build.js:194:23)
gyp ERR! stack at ChildProcess.emit (events.js:321:20)
gyp ERR! stack at Process.ChildProcess._handle.onexit (internal/child_process.js:275:12)
gyp ERR! System Darwin 19.2.0
gyp ERR! command "/usr/local/Cellar/node/13.8.0/bin/node" "/usr/local/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js" "build" "--fallback-to-build" "--module=/usr/local/lib/node_modules/hummus/binding/hummus.node" "--module_name=hummus" "--module_path=/usr/local/lib/node_modules/hummus/binding" "--napi_version=5" "--node_abi_napi=napi"
gyp ERR! cwd /usr/local/lib/node_modules/hummus
gyp ERR! node -v v13.8.0
gyp ERR! node-gyp -v v5.1.0
gyp ERR! not ok
node-pre-gyp ERR! build error
node-pre-gyp ERR! stack Error: Failed to execute '/usr/local/Cellar/node/13.8.0/bin/node /usr/local/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js build --fallback-to-build --module=/usr/local/lib/node_modules/hummus/binding/hummus.node --module_name=hummus --module_path=/usr/local/lib/node_modules/hummus/binding --napi_version=5 --node_abi_napi=napi' (1)
node-pre-gyp ERR! stack at ChildProcess. (/usr/local/lib/node_modules/hummus/node_modules/node-pre-gyp/lib/util/compile.js:83:29)
node-pre-gyp ERR! stack at ChildProcess.emit (events.js:321:20)
node-pre-gyp ERR! stack at maybeClose (internal/child_process.js:1026:16)
node-pre-gyp ERR! stack at Process.ChildProcess._handle.onexit (internal/child_process.js:286:5)
node-pre-gyp ERR! System Darwin 19.2.0
node-pre-gyp ERR! command "/usr/local/Cellar/node/13.8.0/bin/node" "/usr/local/lib/node_modules/hummus/node_modules/.bin/node-pre-gyp" "install" "--fallback-to-build"
node-pre-gyp ERR! cwd /usr/local/lib/node_modules/hummus
node-pre-gyp ERR! node -v v13.8.0
node-pre-gyp ERR! node-pre-gyp -v v0.10.3
node-pre-gyp ERR! not ok
Failed to execute '/usr/local/Cellar/node/13.8.0/bin/node /usr/local/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js build --fallback-to-build --module=/usr/local/lib/node_modules/hummus/binding/hummus.node --module_name=hummus --module_path=/usr/local/lib/node_modules/hummus/binding --napi_version=5 --node_abi_napi=napi' (1)
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! hummus@1.0.108 install: node-pre-gyp install --fallback-to-build $EXTRA_NODE_PRE_GYP_FLAGS
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the hummus@1.0.108 install script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

kohheepeace commented 4 years ago

@jonsanjuan Thanks for your report! I think it is better to discuss this in my repo. Could you post an issue there?

jonsanjuan commented 4 years ago

will do

ritu-sehrawat commented 4 years ago

Guys, I am not able to convert static pages created through Docusaurus to PDF. Can you please help?

cmdcolin commented 4 years ago

I really like @KohheePeace's solution, especially since it can load MDX components too, in a way that pure pandoc+md cannot. I managed to get it running very quickly, which is great.

I think pandoc, if that is your preferred solution, might also be a good option, but it is a bit harder to set up.

If you simply run something like

pandoc *.md -o output.pdf

the result looks reasonably good and has a classic "academic paper" style LaTeX font, but it lacks ordering. I wanted a sensible order that follows the "sidebars.json" ordering from docusaurus.

I made this small command/bash script

# make_pdf.sh
for i in $(node read_sidebar.js); do
  # Trim off the front matter header of each markdown doc. tail -n +5 starts
  # printing at line 5, so it drops the first 4 lines; adjust the number to
  # the length of your header. Otherwise pandoc gets confused and takes the
  # title from the last element in the list.
  tail -n +5 "$i"
done | pandoc title.md - --toc -o output.pdf

Note the title.md in the pandoc command at the bottom. It is passed separately because I don't want its header stripped off.

So, in addition to the regular docusaurus docs, there is a special title.md with the following contents.

The title.md file:

---
title: Your Product
authors: Optional author list
abstract: Subtext for your product
---

Optionally more content that goes before the TOC  here

read_sidebar.js looks like

// read_sidebar.js
const fs = require('fs')

const sidebar = JSON.parse(fs.readFileSync('../sidebars.json'))

// Walk the sidebar tree depth-first and collect each doc id as a .md file name.
function readTree(tree, ret = []) {
  for (const elt in tree) {
    if (typeof tree[elt] === 'object') {
      // Category objects and doc arrays are only recursed into, never pushed
      readTree(tree[elt], ret)
    } else if (tree[elt]) {
      ret.push(tree[elt] + '.md')
    }
  }
  return ret
}
console.log(readTree(sidebar).join('\n'))

This just reads sidebars.json and outputs the list of files to read, in the order in which they are specified there.

If you have substructure in your sidebar like different sections I recommend making something like this

sidebars.json (if you have a sidebars.js, convert it to JSON and point require.resolve at 'sidebars.json' instead of the .js file):

{
  "someSidebar": {
    "User guide": [
      "user_guide",
      "user_navigation"
    ],
    "Configuration guide": [
      "config_guide",
      "config_stuff"
    ],
    "Developer resources": [
      "developer_guide",
      "developer_code_organization"
    ]
  }
}

In each section, the first file (user_guide, config_guide) has a markdown top-level # header, and the rest of the files use ## headers.

This produces a nice table of contents where each section that corresponds to a section of the sidebars.json also gets a section in the pdf table of contents

Example from our docs https://jbrowse.org/jb2/jbrowse2.pdf (from https://jbrowse.org/jb2/docs/)

Josh-Cena commented 2 years ago

Hi @kohheepeace very cool tool! Do you have plans to turn that into a Docusaurus plugin? I believe you just need to hook into postBuild to read the HTML files. Also, you can choose to wrap the doc footer component and inject a "download PDF" button. (Or leave that to the user) If that is done, we will probably close this as resolved by community.
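A minimal sketch of what such a plugin could look like, assuming the Docusaurus v2 plugin shape in which `postBuild` receives the build output directory as `outDir`. The plugin name and the `findHtmlFiles` helper are hypothetical, and the actual PDF rendering is left as a placeholder.

```javascript
const fs = require('fs');
const path = require('path');

// Recursively list the .html files found under a build output directory.
function findHtmlFiles(dir) {
  const found = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      found.push(...findHtmlFiles(full));
    } else if (entry.name.endsWith('.html')) {
      found.push(full);
    }
  }
  return found.sort();
}

// Hypothetical plugin: the name and the rendering step are placeholders.
function pdfPlugin(context, options) {
  return {
    name: 'docusaurus-plugin-pdf-sketch',
    async postBuild({ outDir }) {
      const pages = findHtmlFiles(outDir);
      // A real plugin would render these pages to PDF here, e.g. with Puppeteer.
      console.log(`Found ${pages.length} HTML pages to render`);
    },
  };
}

module.exports = pdfPlugin;
```

The "download PDF" button would be a separate concern, for example a wrapped doc footer component linking to the generated file.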

kohheepeace commented 2 years ago

@Josh-Cena thanks for your proposal. I'm currently busy with another project, so I can't implement this any time soon. docusaurus-pdf and mr-pdf are MIT licensed, so please feel free to fork them or create a completely separate project 👍

ar-to commented 2 years ago

Here is an interesting solution built into the CI process. I did not try it, but it depends on Prince, which costs almost $4k to license!

I also found https://www.npmjs.com/package/docusaurus-plugin-papersaurus?activeTab=readme which I did test. It looked promising until no PDFs were generated, as I described in my issue to that project here.

PDF generation is pretty important to have in a documentation project like this. I'm a bit surprised it's not among the first features. This is my first time testing docusaurus, and one of my requirements is generating a PDF, so I guess I'll have to try a method outside of docusaurus. If I can get something working, I'll try to create a plugin.

slorber commented 1 year ago

Someone recently submitted this package to our community resources: https://github.com/jean-humann/docs-to-pdf

alishefaee commented 3 months ago

For anyone desperate for documentation tools with advanced PDF export, I found doxygen to be stable and mature.