jean-humann / docs-to-pdf

Generate PDF for document website 🧑‍🔧
https://www.npmjs.com/package/docs-to-pdf
MIT License

Quick Start example doesn't work #288

nullromo opened this issue 1 year ago (status: Open)

nullromo commented 1 year ago

I tried running the Quick Start example from the README:

npx docs-to-pdf --initialDocURLs="https://docusaurus.io/docs/" --contentSelector="article" --paginationSelector="a.pagination-nav__link.pagination-nav__link--next" --excludeSelectors=".margin-vert--xl a,[class^='tocCollapsible'],.breadcrumbs,.theme-edit-this-page" --coverImage="https://docusaurus.io/img/docusaurus.png" --coverTitle="Docusaurus v2"

and I got this error:

[10.10.2023 11:08.19.379] [DEBUG] Using Chromium from /home/kkovacs/.cache/puppeteer/chrome/linux-117.0.5938.149/chrome-linux64/chrome
[10.10.2023 11:08.19.607] [DEBUG] Chrome user data dir: /tmp/puppeteer_dev_chrome_profile-2V52e1
[10.10.2023 11:08.19.646] [LOG]   Retrieving html from https://docusaurus.io/docs/
[10.10.2023 11:08.21.047] [DEBUG] Found 0 elements
[10.10.2023 11:08.21.049] [LOG]   Success
[10.10.2023 11:08.21.051] [LOG]   Retrieving html from https://docusaurus.io/docs/category/getting-started
[10.10.2023 11:08.22.165] [DEBUG] Found 0 elements
[10.10.2023 11:08.22.166] [LOG]   Success

...

[10.10.2023 11:09.23.630] [LOG]   Success
[10.10.2023 11:09.23.634] [LOG]   Retrieving html from https://docusaurus.io/docs/deployment
[10.10.2023 11:09.25.372] [DEBUG] Found 6 elements
[10.10.2023 11:09.25.379] [DEBUG] Clicking summary: How much resource (person-hours, money) am I willing to invest in this?
[10.10.2023 11:09.26.267] [DEBUG] Clicking summary: How much server-side configuration would I need?
[10.10.2023 11:09.27.104] [DEBUG] Clicking summary: Do I have needs to cooperate?
[10.10.2023 11:09.27.944] [DEBUG] Clicking summary: GitHub action files
[10.10.2023 11:09.28.771] [DEBUG] Clicking summary: GitHub action file
[10.10.2023 11:09.28.780] [ERROR] Error: Node is either not clickable or not an Element
    at CdpElementHandle.clickablePoint (/home/kkovacs/.npm/_npx/c16ac64a6c7aba73/node_modules/puppeteer-core/lib/cjs/puppeteer/api/ElementHandle.js:680:23)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async CdpElementHandle.<anonymous> (/home/kkovacs/.npm/_npx/c16ac64a6c7aba73/node_modules/puppeteer-core/lib/cjs/puppeteer/api/ElementHandle.js:258:32)
    at async CdpElementHandle.click (/home/kkovacs/.npm/_npx/c16ac64a6c7aba73/node_modules/puppeteer-core/lib/cjs/puppeteer/api/ElementHandle.js:710:30)
    at async CdpElementHandle.<anonymous> (/home/kkovacs/.npm/_npx/c16ac64a6c7aba73/node_modules/puppeteer-core/lib/cjs/puppeteer/api/ElementHandle.js:261:36)
    at async openDetails (/home/kkovacs/.npm/_npx/c16ac64a6c7aba73/node_modules/docs-to-pdf/lib/utils.js:212:13)
    at async generatePDF (/home/kkovacs/.npm/_npx/c16ac64a6c7aba73/node_modules/docs-to-pdf/lib/utils.js:82:21)

I'm pointing this out because I'm struggling to get the tool working on my own site and wanted a working example to use as a reference.

nullromo commented 1 year ago

It looks to me like the problem stems from a <details> element inside a non-default tab of the Tabs container widget. The page with the issue is https://docusaurus.io/docs/deployment, the last page retrieved in the log above.

This part seems to work fine (first tab, "Same"):

[Screenshot: first tab, "Same"]

But this part causes the error (second tab, "Remote"):

[Screenshot: second tab, "Remote"]

Somehow, the generator doesn't consider the "GitHub action file" <details> element clickable, although it does treat the "GitHub action files" one as clickable. My guess is that the generator first needs to click the "Remote" tab so that the "GitHub action file" element becomes visible. The Tabs container works by adding and removing the hidden attribute on its panes, so the element is present in the DOM but not displayed, and the PDF generator ends up trying to click a hidden element. That's my best guess, anyway.

karl-cardenas-coding commented 12 months ago

I can replicate the same exact behavior. Not sure what the workaround is 😢

nullromo commented 12 months ago

Here are my suggestions:

Long-Term

An actual fix for this would involve modifying the code to programmatically click on the first tab, copy its contents into the DOM, then click on the second tab, and so on. This would place the content of every tab in the <Tabs> into the resulting PDF.
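A related and simpler variant of this idea, swapped in here for illustration, is to un-hide every tab pane before printing instead of clicking tabs one by one. The sketch below is not code from docs-to-pdf: `PanelLike` and `revealPanels` are hypothetical names, and it assumes the panes are hidden purely through the `hidden` attribute, as described earlier in this thread.

```typescript
// Structural stand-in for the DOM elements involved (an assumption for this
// sketch); in a browser, Element already provides both methods.
interface PanelLike {
  hasAttribute(name: string): boolean;
  removeAttribute(name: string): void;
}

// Hypothetical helper: remove the `hidden` attribute from every pane so that
// all tab contents take part in layout, and report how many were revealed.
function revealPanels(panels: ArrayLike<PanelLike>): number {
  let revealed = 0;
  for (let i = 0; i < panels.length; i++) {
    const panel = panels[i];
    if (panel.hasAttribute("hidden")) {
      panel.removeAttribute("hidden");
      revealed += 1;
    }
  }
  return revealed;
}
```

In a page context this might be called as `revealPanels(document.querySelectorAll('[role="tabpanel"]'))`. That selector is an assumption about how Docusaurus marks its panes and should be checked against the rendered HTML. The tab labels would be lost this way, so the PDF would show the panes stacked without headings.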

Short-Term

A quick and dirty solution would be to modify the code so that the virtual browser doesn't try to click anything that's not clickable (or just catches the error and moves on). This would keep the program from crashing, but the PDF would only ever show the first tab.
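The "catch the error and move on" variant could look roughly like the sketch below. It is not the tool's actual code: `Clickable` and `clickAllTolerant` are hypothetical names, and the only assumption is that each handle exposes an async `click()` that rejects when the element cannot be clicked, which is how the stack trace above behaves.

```typescript
// Minimal shape this sketch needs from a handle (an assumption for
// illustration); Puppeteer's ElementHandle has a compatible click() method.
interface Clickable {
  click(): Promise<unknown>;
}

// Hypothetical helper: click each handle in order, skipping any that throw
// (for example "Node is either not clickable or not an Element").
// Returns the number of successful clicks.
async function clickAllTolerant(handles: Clickable[]): Promise<number> {
  let clicked = 0;
  for (const handle of handles) {
    try {
      await handle.click();
      clicked += 1;
    } catch (err) {
      // Log and continue instead of aborting the whole PDF run.
      const reason = err instanceof Error ? err.message : String(err);
      console.warn(`Skipping unclickable element: ${reason}`);
    }
  }
  return clicked;
}
```

Clicking sequentially rather than with `Promise.all` keeps the behavior close to the log above, where each summary is clicked one after another.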

Alternative Workaround

You can remove the <Tabs> feature from your Docusaurus project altogether. Stacked <details> elements do the trick almost as well, and they don't require JavaScript to work*. This is what I am doing in my project.

So if you are using <Tabs> and you aren't married to the way they look, I suggest converting

<Tabs>
<TabItem label="label A">{content A}</TabItem>
<TabItem label="label B">{content B}</TabItem>
...
</Tabs>

into

<details><summary>{label A}</summary>{content A}</details>
<details><summary>{label B}</summary>{content B}</details>
...
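For a project with many pages, the conversion above could be scripted. The following is a rough sketch only: `tabsToDetails` is a hypothetical helper, it relies on regular expressions rather than a real MDX parser, and it assumes every <TabItem> carries a double-quoted `label` attribute and contains no nested <TabItem>.

```typescript
// Hypothetical helper: rewrite <TabItem label="..."> blocks as <details>
// elements and drop the surrounding <Tabs> wrapper lines.
function tabsToDetails(mdx: string): string {
  return mdx
    // Turn each TabItem into a details element, using its label as the summary.
    .replace(
      /<TabItem\b[^>]*\blabel="([^"]*)"[^>]*>([\s\S]*?)<\/TabItem>/g,
      (_match: string, label: string, content: string) =>
        `<details><summary>${label}</summary>${content}</details>`,
    )
    // Remove lines that hold only an opening or closing Tabs tag.
    .replace(/^[ \t]*<\/?Tabs\b[^>]*>[ \t]*\r?\n?/gm, "");
}
```

Attribute values that contain `>` or escaped quotes would defeat these patterns, so the output is worth reviewing by hand.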

*Previously, I distributed my docs for offline use by running a build and then zipping up the build folder. To read the docs, you unzip the folder and open index.html in a browser. This works for viewing the pages, but dynamic JavaScript elements do not work (see the numerous issues about this that have been opened in the Docusaurus repo).

karl-cardenas-coding commented 12 months ago

Yup @nullromo, that's definitely an option. One option I was entertaining that fits my use case is to ignore those details elements. Any idea how I could achieve this with excludeSelectors?

nullromo commented 12 months ago

@karl-cardenas-coding I haven't looked into the source code of this repo much, but assuming excludeSelectors uses something like document.querySelectorAll, it should accept any CSS selector. So if I'm not mistaken, you can just use the details type selector to match all elements of that type. <details> is a native HTML element type.

So try --excludeSelectors="details".
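To make the assumption above concrete, here is a sketch of what an exclude step built on querySelectorAll might do. It is a guess, not the tool's actual implementation: `QueryRoot`, `RemovableNode` and `removeMatching` are hypothetical names, and whether docs-to-pdf applies its exclusions before or after it clicks <summary> elements is not something this sketch can answer.

```typescript
// Structural stand-ins for the DOM types (assumptions for this sketch);
// in a browser, Document and Element already provide these methods.
interface RemovableNode {
  remove(): void;
}
interface QueryRoot {
  querySelectorAll(selectors: string): ArrayLike<RemovableNode>;
}

// Hypothetical helper: delete every element matching a CSS selector list
// (for example "details" or ".breadcrumbs,.theme-edit-this-page") and
// return how many elements were removed.
function removeMatching(root: QueryRoot, selectors: string): number {
  const matches = root.querySelectorAll(selectors);
  for (let i = 0; i < matches.length; i++) {
    matches[i].remove();
  }
  return matches.length;
}
```

Because a comma-separated selector list is itself a valid CSS selector, a value such as the one passed to --excludeSelectors in the Quick Start command could be handed to querySelectorAll unchanged.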

rusticrajp commented 11 months ago

Any update on this?

karl-cardenas-coding commented 11 months ago

I tried selecting the details elements with excludeSelectors, but no luck. As a workaround, I removed all details elements when creating a PDF. In my use case, these are rarely used, so I can get away with it. But yeah, it's a shame it doesn't work, as this tool is super neat.