Open cooltrooper opened 6 years ago
Could we use something like react-pdf? I might be able to take a look at it.
Sorry, I may have explained poorly. I believe react-pdf renders PDFs on the webpage. What I'm thinking of is a way to export your documents to PDF as an additional format.
Like sphinx has PDF generation via LaTeX
We could also achieve this with the html2canvas module, I think. There are a few ways to do it, but I am not sure which would be optimal. If there is interest, I could take a look.
It might be possible to generate PDFs during the build using something like pandoc and then serve them with the document.
Concatenate the Markdown files and convert them to PDF with pandoc?
We could use something like markdown-pdf to create pdf documents for markdown files during the build process and serve them. The URL for each pdf would be the same as the document one but with an extension ".pdf". For translated documents too.
We could also make this feature optional and turned off by default, controlled by some siteConfig key. I can look into it if it's still a requested feature.
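The URL scheme and opt-in switch suggested above could look roughly like this. It is only a sketch: the `generatePdf` config key and the `pdfUrlFor` and `isPdfEnabled` helpers are hypothetical names used for illustration, not existing Docusaurus options or APIs.

```javascript
// Hypothetical sketch: `generatePdf`, `pdfUrlFor`, and `isPdfEnabled`
// are illustrative names, not part of Docusaurus.

// Map a document URL to the URL of its PDF twin by appending ".pdf".
function pdfUrlFor(docUrl) {
  // Drop a trailing slash or ".html" so every variant maps to the same file.
  const base = docUrl.replace(/\/$/, '').replace(/\.html$/, '');
  return `${base}.pdf`;
}

// PDF generation stays off unless the site config opts in explicitly.
function isPdfEnabled(siteConfig) {
  return siteConfig.generatePdf === true;
}

console.log(pdfUrlFor('/docs/en/installation.html'));
console.log(isPdfEnabled({title: 'My Site'}));
```

Because the mapping only rewrites the end of the path, translated documents (for example under `/docs/ja/`) would get their own PDF URL without extra handling.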
Since I wanted this feature, I implemented it using Travis. See 1, 2, 3. Of course, you would need to change it according to your needs. As might be evident from one of these files, I am pushing the generated PDF files here.
Please may you go into some more detail as to how I could implement this? Sorry I'm new to react and web dev in general
Apologies for replying late. If you want to understand my approach, start by learning about Travis and how to convert Markdown to PDF using pandoc. Essentially, whenever I make a commit to my Docusaurus site, my Travis script converts the Markdown files listed in sidebars.json to PDF, merges them, and pushes the result to my other repository.
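As a rough illustration of the conversion step described above, the pandoc invocation could be assembled like this. This is not the Travis script from the comment; `buildPandocArgs` is a hypothetical helper, and it assumes each doc id from sidebars.json maps to `<docsDir>/<id>.md`.

```javascript
const path = require('path');

// Assumption: `docIds` is the ordered list of doc ids taken from sidebars.json,
// and each id maps to `<docsDir>/<id>.md`.
function buildPandocArgs(docIds, docsDir, outFile) {
  const inputs = docIds.map((id) => path.join(docsDir, `${id}.md`));
  // pandoc reads multiple input files in order; `-o` selects the output file.
  return [...inputs, '-o', outFile];
}

const args = buildPandocArgs(['intro', 'install'], 'docs', 'site.pdf');
console.log(['pandoc', ...args].join(' '));

// In a CI job the command could then be run with, for example:
// require('child_process').execFileSync('pandoc', args, {stdio: 'inherit'});
```

Passing all files to a single pandoc call yields one merged PDF, so no separate merge step is needed in this variant.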
Would love to see this feature!
Actually React-pdf can do the job. We need to work on it
Lack of this feature, or single-page output, is why my company has to move away from Docusaurus.
@braco thanks for the feedback. What will you be moving to?
@yangshun Hi, I implemented a Node script to generate a PDF file from the docs.
This is the generated PDF: https://drive.google.com/file/d/19P3qSwLLUHYigrxH3QXIMXmRpTFi4pKB/view
I imitated the mdx-deck approach.
With this approach, the author needs to generate the PDF manually on a local machine and serve it.
*Discussion about this approach: https://github.com/jxnblk/mdx-deck/issues/141
hoge.js (at the root of the OSS project):
const puppeteer = require('puppeteer');
const { PDFRStreamForBuffer, createWriterToModify, PDFStreamForResponse } = require('hummus');
const { WritableStream } = require('memory-streams');
const fs = require('fs');

// Merge several PDF buffers into one by appending pages to the first PDF.
const mergePdfBlobs = (pdfBlobs) => {
  const outStream = new WritableStream();
  const [firstPdfRStream, ...restPdfRStreams] = pdfBlobs.map(pdfBlob => new PDFRStreamForBuffer(pdfBlob));
  const pdfWriter = createWriterToModify(firstPdfRStream, new PDFStreamForResponse(outStream));
  restPdfRStreams.forEach(pdfRStream => pdfWriter.appendPDFPagesFromPDF(pdfRStream));
  pdfWriter.end();
  outStream.end();
  return outStream.toBuffer();
};

let generatedPdfBlobs = [];

(async () => {
  const browser = await puppeteer.launch();
  let page = await browser.newPage();
  let nextPageUrl = 'http://localhost:3000/docs/introduction';

  while (nextPageUrl) {
    await page.goto(`${nextPageUrl}`, {waitUntil: 'networkidle2'});

    // Look up the "next" pagination link; stop when there is none.
    try {
      nextPageUrl = await page.$eval('.pagination-nav__item--next > a', (element) => {
        return element.href;
      });
    } catch (e) {
      nextPageUrl = null;
    }

    // Keep only the article body and re-apply the site's styles and scripts.
    let html = await page.$eval('article', (element) => {
      return element.outerHTML;
    });
    await page.setContent(html);
    await page.addStyleTag({url: 'http://localhost:3000/styles.css'});
    await page.addScriptTag({url: 'http://localhost:3000/styles.js'});

    const pdfBlob = await page.pdf({path: "", format: 'A4', printBackground: true, margin : {top: 20, right: 15, left: 15, bottom: 20}});
    generatedPdfBlobs.push(pdfBlob);
  }

  await browser.close();

  const mergedPdfBlob = mergePdfBlobs(generatedPdfBlobs);
  fs.writeFileSync('hoge-final.pdf', mergedPdfBlob);
})();
3. Run `hoge.js`
This creates `hoge-final.pdf` in the project root.
```bash
node hoge.js
```
Thanks!
Demo
- Run the Docusaurus OSS site at http://localhost:3000/
- Create `hoge.js` at the root of the OSS project
- NOTE: this code currently only works with the OSS Docusaurus project
- Please install "puppeteer", "hummus", and "memory-streams"
@KohheePeace could you explain what OSS stands for in this context? I'm new to Docusaurus and looking for a way to export docs as PDF, so I tried your demo code snippet. Unfortunately, the structure of my project, initialized with `npx docusaurus-init`, does not match your assumptions regarding stylesheet and JS files etc.
Thank you for sharing and working on this project!
Hi @maxarndt, sorry for my bad explanation. "OSS site" stands for https://github.com/facebook/docusaurus.
If you just created a site by running `npx @docusaurus/init@next init my-website classic`, you need to change `nextPageUrl` in `hoge.js` from http://localhost:3000/docs/introduction to http://localhost:3000/docs/doc1.
You need to specify the initial URL to start scraping from. Puppeteer then paginates automatically and generates the PDF. I hope it will work in your environment!
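The pagination walk described above can be sketched independently of Puppeteer. In this sketch `collectPageUrls` and `getNextUrl` are hypothetical names; with Puppeteer, `getNextUrl` could wrap the `page.goto` and `page.$eval('.pagination-nav__item--next > a', ...)` calls from `hoge.js`. The cycle guard and page limit are additions for safety, not part of the original script.

```javascript
// Follow "next" links from a start URL, collecting pages in reading order.
// `getNextUrl` is a hypothetical async callback that returns the next URL,
// or null/undefined when there is no further page.
async function collectPageUrls(startUrl, getNextUrl, maxPages = 1000) {
  const visited = [];
  const seen = new Set();
  let url = startUrl;
  // Stop on a missing link, a repeated URL (cycle), or the page limit.
  while (url && !seen.has(url) && visited.length < maxPages) {
    seen.add(url);
    visited.push(url);
    url = await getNextUrl(url);
  }
  return visited;
}

// Stand-in for a real site: a map from each page to its "next" page.
const fakeSite = {
  '/docs/doc1': '/docs/doc2',
  '/docs/doc2': '/docs/doc3',
  '/docs/doc3': null,
};

collectPageUrls('/docs/doc1', async (url) => fakeSite[url]).then((urls) => {
  console.log(urls.join('\n'));
});
```

Changing the start URL is the only adjustment needed for a different site, which matches the advice above to edit `nextPageUrl`.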
@maxarndt I made an npm package for generating PDFs: https://github.com/KohheePeace/docusaurus-pdf
Hi Kohhee. I was trying to install your npm package and it failed during the following install step. I am also rather new to Docusaurus and PDF translations, so thanks in advance:
make: *** [Release/obj.target/hummus/src/PDFDictionaryDriver.o] Error 1
gyp ERR! build error
gyp ERR! stack Error: `make` failed with exit code: 2
gyp ERR! stack at ChildProcess.onExit (/usr/local/lib/node_modules/npm/node_modules/node-gyp/lib/build.js:194:23)
gyp ERR! stack at ChildProcess.emit (events.js:321:20)
gyp ERR! stack at Process.ChildProcess._handle.onexit (internal/child_process.js:275:12)
gyp ERR! System Darwin 19.2.0
gyp ERR! command "/usr/local/Cellar/node/13.8.0/bin/node" "/usr/local/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js" "build" "--fallback-to-build" "--module=/usr/local/lib/node_modules/hummus/binding/hummus.node" "--module_name=hummus" "--module_path=/usr/local/lib/node_modules/hummus/binding" "--napi_version=5" "--node_abi_napi=napi"
gyp ERR! cwd /usr/local/lib/node_modules/hummus
gyp ERR! node -v v13.8.0
gyp ERR! node-gyp -v v5.1.0
gyp ERR! not ok
node-pre-gyp ERR! build error
node-pre-gyp ERR! stack Error: Failed to execute '/usr/local/Cellar/node/13.8.0/bin/node /usr/local/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js build --fallback-to-build --module=/usr/local/lib/node_modules/hummus/binding/hummus.node --module_name=hummus --module_path=/usr/local/lib/node_modules/hummus/binding --napi_version=5 --node_abi_napi=napi' (1)
node-pre-gyp ERR! stack at ChildProcess.node-pre-gyp install --fallback-to-build $EXTRA_NODE_PRE_GYP_FLAGS
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the hummus@1.0.108 install script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
@jonsanjuan Thanks for your report! I think it is better to discuss this in my repo. Could you post an issue there?
will do
Guys, I am not able to convert static pages created with Docusaurus to PDF. Can you please help?
I really like @KohheePeace's solution, especially because it can render MDX components in a way that pure pandoc + Markdown cannot. I managed to get it running very quickly, which is great.
I think pandoc might also be a good option, but it is a bit harder to set up.
If you simply run
pandoc *.md -o output.pdf
the result looks reasonably good, with a classic "academic paper" LaTeX font, but the files are not in a meaningful order. I wanted the PDF to follow the ordering from the Docusaurus "sidebar.json".
I made this small bash script:
# make_pdf.sh
for i in $(node read_sidebar.js); do
  # trim off the header of each markdown doc: `tail -n +5` starts printing at
  # line 5, so the first 4 lines of each file are dropped
  # otherwise pandoc gets confused and takes the title from the last element in the list
  tail -n +5 "$i"
done | pandoc title.md - --toc -o output.pdf
Note the title.md at the bottom: I pass it to pandoc directly because I don't want its header stripped off.
In addition to everything the Docusaurus docs contain, the PDF gets this special title.md, which contains just the following.
The title.md file:
---
title: Your Product
authors: Optional author list
abstract: Subtext for your product
---
Optionally more content that goes before the TOC here
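The `tail -n +5` step in make_pdf.sh assumes every doc has a header of the same length. As a hedged alternative, the header could be removed by looking for the `---` delimiters instead of counting lines. `stripFrontMatter` is a hypothetical helper, not part of the script above, and it only handles front matter that starts on the very first line.

```javascript
// Remove a leading front matter block ("---" ... "---") if one is present.
function stripFrontMatter(markdown) {
  const lines = markdown.split(/\r?\n/);
  if (lines[0].trim() !== '---') {
    return markdown; // no front matter: leave the text untouched
  }
  // Find the closing delimiter after the opening one.
  const end = lines.findIndex((line, i) => i > 0 && line.trim() === '---');
  if (end === -1) {
    return markdown; // unterminated block: be conservative and change nothing
  }
  return lines.slice(end + 1).join('\n');
}

const sample = ['---', 'id: intro', 'title: Intro', '---', '# Intro', 'Hello'].join('\n');
console.log(stripFrontMatter(sample));
```

With this approach the title.md file would still be passed to pandoc unmodified, since its front matter is what supplies the title page.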
read_sidebar.js looks like this:
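// read_sidebar.js
const fs = require('fs')

const sidebar = JSON.parse(fs.readFileSync('../sidebars.json'))

// Walk the sidebar tree depth-first and collect the doc file names in order.
function readTree(tree, ret = []) {
  for (const elt in tree) {
    if (typeof tree[elt] === 'object') {
      // nested section (object or array): recurse into it
      readTree(tree[elt], ret)
    } else if (tree[elt]) {
      // leaf: a doc id, which maps to a markdown file
      ret.push(tree[elt] + '.md')
    }
  }
  return ret
}

console.log(readTree(sidebar).join('\n'))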
This just reads a sidebar.json and outputs the list of files to read, in the order that they are specified in the sidebar.json
If you have substructure in your sidebar, like different sections, I recommend making something like this:
// sidebar.json. Note: if you have a sidebar.js, just convert it to JSON and make it require.resolve('sidebar.json') instead of the JS file
{
  "someSidebar": {
    "User guide": [
      "user_guide", // this file has a markdown with a top level # header
      "user_navigation" // rest of these files use two ## headers
    ],
    "Configuration guide": [
      "config_guide", // this file has a markdown with a top level # header
      "config_stuff" // rest of these files use two ## headers
    ],
    "Developer resources": [
      "developer_guide",
      "developer_code_organization"
    ]
  }
}
This produces a nice table of contents where each section that corresponds to a section of the sidebars.json also gets a section in the pdf table of contents
Example from our docs https://jbrowse.org/jb2/jbrowse2.pdf (from https://jbrowse.org/jb2/docs/)
Hi @kohheepeace, very cool tool! Do you have plans to turn that into a Docusaurus plugin? I believe you just need to hook into `postBuild` to read the HTML files. Also, you can choose to wrap the doc footer component and inject a "download PDF" button. (Or leave that to the user.) If that is done, we will probably close this as resolved by community.
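A very rough sketch of what hooking into `postBuild` might look like follows. The plugin name, the `docsPrefix` option, and the assumption that each route is written to `<outDir>/<route>/index.html` are illustrative guesses that depend on the site's configuration. The sketch only lists candidate HTML files and does not render any PDF.

```javascript
const path = require('path');

// Hypothetical plugin sketch: the plugin name and the `docsPrefix` option are
// made up for illustration.
function pdfExportPlugin(context, options = {}) {
  const docsPrefix = options.docsPrefix || '/docs';
  return {
    name: 'docusaurus-plugin-pdf-export-sketch',
    // postBuild runs after the static site has been written to `outDir`.
    async postBuild({outDir, routesPaths}) {
      // Assumption: each route is emitted as <outDir>/<route>/index.html.
      const htmlFiles = routesPaths
        .filter((route) => route.startsWith(docsPrefix))
        .map((route) => path.join(outDir, route, 'index.html'));
      // A real plugin would hand these files to a PDF renderer here.
      console.log(`Found ${htmlFiles.length} doc pages to convert`);
      // Returned only so the sketch is easy to inspect; Docusaurus ignores it.
      return htmlFiles;
    },
  };
}

module.exports = pdfExportPlugin;
```

Keeping the file discovery separate from the rendering step would let users plug in whichever converter they prefer.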
@Josh-Cena thanks for your proposal. I'm currently busy with another project, so I can't implement this anytime soon. docusaurus-pdf and mr-pdf are MIT licensed, so please feel free to fork them or create a completely separate project.
Here is an interesting solution built into the CI process. I did not try it, but it depends on Prince, which costs almost $4k per license!
Also found https://www.npmjs.com/package/docusaurus-plugin-papersaurus?activeTab=readme, which I did test. It looked promising until no PDFs were generated, as I described in my issue to that project here.
PDF generation is pretty important to have in a documentation project like this. I'm a bit surprised it's not among the first features. This is my first time testing Docusaurus, and part of my requirements is to generate a PDF, so I guess I'll have to try a method outside of Docusaurus. If I can get something working I'll try to create a plugin.
Someone recently submitted this package to our community resources: https://github.com/jean-humann/docs-to-pdf
For anyone desperate for documentation tools with advanced PDF export, I found doxygen to be stable and mature.
Feature
The ability for users to download an offline version of the document.
This would be ideal as a PDF with title page and table of contents.
Have you read the Contributing Guidelines on issues?
Yes
Motivation
Interested in using this as a documentation solution for client products, but PDF for print is required by some.
Pitch
A download button could be added to the top of the side nav which, when clicked, would download the file.