Update docs for local html file path

appukuttan-shailesh commented 4 years ago

Hi,

Firstly, thanks for this package! After testing 4 other popular Python packages for converting HTML to PDF (each of which failed for one reason or another), I found this one and it works like a charm. It is easy to set up (fully pip installable) and, because it uses Chromium, the output looks exactly like printing via the browser.

A small suggestion regarding updating/clarifying the documentation. I was trying to convert a local HTML file via a Python script, and was doing the following:

content = save_pdf(output_file="report.pdf", html="report.html")

I simply got a blank output with "report.html" printed.

My mistakes, which weren't obvious to me:

The file path is to be passed via the url parameter, not html.
Use absolute file paths, or else you get the following error:

ValueError: relative path can't be expressed as a file URI

So the correct syntax was:

from pyppdf import save_pdf
content = save_pdf(output_file="report.pdf", url="/home/shailesh/Work/VF/pdf_report/report.html")

Additionally, to pass arguments, this can be changed to:

content = save_pdf(output_file="report.pdf", url="/home/shailesh/Work/VF/pdf_report/report.html", args_dict={"pdf": {"format": "A4", "landscape": False, "printBackground": True, "margin": {"top": "0.25in", "right": "0.25in", "bottom": "0.25in", "left": "0.25in"}}})

This worked for me and got my work done. Sharing this here so as to be useful for any users in the future having similar problems.
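For anyone who wants to avoid hard-coding the absolute path, here is a minimal sketch of the fix described above. It assumes only the save_pdf parameters already used in this issue (output_file, url); the helper names absolute_html_path and convert_local_file are hypothetical and not part of pyppdf.

```python
from pathlib import Path


def absolute_html_path(html_file):
    """Return an absolute path string for a local HTML file (hypothetical helper)."""
    # resolve() makes the path absolute, relative to the current working directory.
    return str(Path(html_file).resolve())


def convert_local_file(html_file, output_file):
    """Convert a local HTML file to PDF, passing the path via `url` (hypothetical helper)."""
    # Imported here so the path helper above works even without pyppdf installed.
    from pyppdf import save_pdf

    # The file path goes in `url`, not `html`, and must be absolute.
    return save_pdf(output_file=output_file, url=absolute_html_path(html_file))


if __name__ == "__main__":
    # A relative path cannot be turned into a file URI, which is the error seen above.
    try:
        Path("report.html").as_uri()
    except ValueError as err:
        print("ValueError:", err)

    # Resolving the path first avoids the problem.
    print(absolute_html_path("report.html"))
```

Calling convert_local_file("report.html", "report.pdf") from the directory that contains report.html should then behave like the hard-coded absolute path version.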

I am still not sure what the html argument is used for. I even tried reading the HTML source file and passing its contents as a string, but simply got an empty page as the PDF.

Cheers!

kiwi0fruit commented 4 years ago

Thanks for the feedback.

Yeah. I guess there are no examples of using the Python API (as I use pyppdf only as a CLI). I'll leave this info here until I have the enthusiasm to add it to the documentation.

appukuttan-shailesh commented 4 years ago

Happy to help.

Just to add... I kept trying other stuff, and I ended up using the html parameter (instead of url; the value is the HTML source code as a string), as I had to alter the contents of the HTML before converting to PDF. This worked (for me) only when used in conjunction with goto="temp" and dir_. Not sure why the other goto options didn't work for me.
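A rough sketch of that workflow, in case it helps others. It assumes only the save_pdf parameters mentioned in this thread (output_file, html, goto); patch_html and convert_patched are hypothetical helper names, and the plain string replacement stands in for whatever edit the HTML needs.

```python
from pathlib import Path


def patch_html(html_text, old, new):
    """Return the HTML source with every occurrence of `old` replaced by `new`."""
    return html_text.replace(old, new)


def convert_patched(html_file, output_file, old, new):
    """Read an HTML file, alter its contents, and convert the result to PDF."""
    # Imported here so patch_html can be used without pyppdf installed.
    from pyppdf import save_pdf

    source = Path(html_file).read_text(encoding="utf-8")
    patched = patch_html(source, old, new)

    # `html` takes the HTML source as a string; goto="temp" is the mode
    # that worked in this thread (the other goto options did not work for me).
    return save_pdf(output_file=output_file, html=patched, goto="temp")
```

The dir_ argument is left out of this sketch; it can be added to the save_pdf call if needed.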

kiwi0fruit commented 4 years ago

I also use the "temp" option for source code. I implemented all the approaches that I saw in other to-PDF converters built on puppeteer/pyppeteer, but they didn't work as intended (or maybe didn't work at all), so I ended up creating a temporary HTML file (the other, non-working approaches are now the other goto options).

And in most cases it should work without specifying dir_, so this is strange. Why didn't it work without setting dir_?

appukuttan-shailesh commented 4 years ago

Actually, it did work even without dir_ (i.e. just goto="temp"). What I meant was that it failed for the other goto options ("setContent", "data-text-html")... or maybe I wasn't using them as required. I certainly didn't understand what the latter two options were meant for.

kiwi0fruit / pyppdf

Update docs for local html file path #6