jgazeau / website2pdf

Node library to print PDFs from a website following sitemap protocol
MIT License

[question]: change base URL of links in PDFs #61

Closed whyboris closed 2 years ago

whyboris commented 2 years ago

Question?

Everything is working beautifully, but because the website2pdf utility uses the sitemap.xml when generating the PDF, it produces PDFs with links that look like localhost:1313/introduction rather than mywebsite.com/introduction 😅

Is there already something I can do to fix it, or would this require a new feature? 🙇

jgazeau commented 2 years ago

Hi @whyboris, indeed, I guess that you are using website2pdf on a local Hugo deployment of your website (port 1313 😃). The fact is that if you are using relative links, the browser automatically resolves them against the current host, which leads to http://localhost:1313/introduction when you run locally. Unfortunately the library cannot change that, because Puppeteer doesn't have that option yet (check this issue), and hot-changing the HTML is not the proper solution.
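To illustrate the resolution step described above, here is a minimal sketch using the standard `URL` constructor, which follows the same relative-URL rules a browser applies to an `href`. The two page addresses are only examples taken from the question.

```javascript
// Resolve the same relative link against two different page addresses.
const relativeLink = "/introduction";

// Page served by the local Hugo server.
const local = new URL(relativeLink, "http://localhost:1313/").href;
// Same page served from the (example) production host.
const live = new URL(relativeLink, "https://mywebsite.com/").href;

console.log(local); // http://localhost:1313/introduction
console.log(live);  // https://mywebsite.com/introduction
```

The link text in the HTML is identical in both cases; only the address the page was loaded from differs, and that is what ends up in the PDF.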

In your case I see two solutions:

I'll close this issue for now, but if the Chromium and Puppeteer teams decide to add such an option, I'll take that into consideration and reopen the issue 😉

whyboris commented 2 years ago

Thank you for the quick response 🙇

I tried this out and it works well enough (added to the bottom of the baseof.html) 👍:

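    <script>
      // Point every link at the live site instead of the local Hugo server.
      const anchors = document.getElementsByTagName("a");

      for (const anchor of anchors) {
        // anchor.href is the resolved absolute URL, so relative links are covered too.
        const old = anchor.href;
        const new_url = old.replace("http://localhost:1313/", "https://utilitarianism.net/");
        anchor.href = new_url;
      }
    </script>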

I'll have to add some more code to avoid messing up all the footnote links, but I'm sure it will work out 💪
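One possible way to leave the footnote links alone, offered as a sketch rather than a tested fix: read the raw `href` attribute instead of the resolved `href` property, and skip anything that starts with `#`. The helper name `rewriteHref` and the two constants are made up for this example, and it assumes the footnotes are same-page `#...` links.

```javascript
// Hypothetical constants for this sketch; adjust them to your own setup.
const LOCAL_BASE = "http://localhost:1313/";
const LIVE_BASE = "https://utilitarianism.net/";

// Hypothetical helper: returns the href to write back, or null to leave the link alone.
function rewriteHref(rawHref, pageUrl) {
  // Same-page fragment links (assumed to be the footnotes) stay untouched.
  if (!rawHref || rawHref.startsWith("#")) return null;

  let resolved;
  try {
    // Resolve relative links the way the browser would.
    resolved = new URL(rawHref, pageUrl).href;
  } catch (err) {
    // Malformed href: leave it as it is.
    return null;
  }

  // Only links that point at the local server get the live host swapped in.
  return resolved.startsWith(LOCAL_BASE)
    ? LIVE_BASE + resolved.slice(LOCAL_BASE.length)
    : null;
}

// Apply it only when a DOM is available (in the page, not in Node).
if (typeof document !== "undefined") {
  for (const anchor of document.querySelectorAll("a[href]")) {
    const updated = rewriteHref(anchor.getAttribute("href"), document.baseURI);
    if (updated !== null) anchor.href = updated;
  }
}
```

Placed inside the same `<script>` tag at the bottom of baseof.html, this should keep `#...` links working inside the PDF while still rewriting ordinary internal links. External links and `mailto:` links are left alone because they do not resolve to the local base.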