arachnys / athenapdf

Drop-in replacement for wkhtmltopdf built on Go, Electron and Docker
MIT License
2.26k stars, 186 forks

There is a difference in the final pdf generated using athenapdf.com and our own docker instance #144

Closed (aravinthchandramouli closed 5 years ago)

aravinthchandramouli commented 6 years ago

We pass the same input HTML file URL to both athenapdf.com and the Docker instance running on our server, yet the two outputs differ a lot (especially in page margins and padding). We made sure we are using the latest version of the athena API.

The generated PDF files from both are attached, named accordingly.

our-docker-instance-generated.pdf athenapdf-com-genarated.pdf

MrSaints commented 6 years ago

Since late last year, we have been trialling parts of v3 on our demo website. That likely explains the difference.

basz commented 6 years ago

On a quick trial run with v3 I noticed that stylesheets weren't applied when loading a URI. Is there some additional param or option I should set?

MrSaints commented 6 years ago

@basz They should always load. If it's possible, can I have a look at the HTML for the URI you tried to load? (omit confidential details)

basz commented 6 years ago

Hi @MrSaints, this source https://gist.github.com/basz/67f047443f261d921486e8a5c2f80b6b renders in v2 as https://share.bushbaby.nl/poQrP8Zrtq/ and in v3 as https://share.bushbaby.nl/6LeSPy0Z8f/

MrSaints commented 6 years ago

Interesting, @basz. Is /css/admin/reporting.css accessible? Also, can you remove media="screen, print"? (I don't think it'll change anything, but it's worth trying.)

basz commented 6 years ago

Yup. The contents of /css/admin/reporting.css are here: https://gist.github.com/basz/56984763f36daa24c746ab13d6714427 but according to the Apache logs, it isn't downloaded.

I removed the media attribute, with no luck.

MrSaints commented 6 years ago

Silly Q, is the main page / URL showing up in the Apache logs?

basz commented 6 years ago

yes, as "GET /admin/orders/reporting/order-costing?standalone=1&i=11411-13600-18113&sess=xx HTTP/1.1" 200 4636 "-" "Go-http-client/1.1"

MrSaints commented 6 years ago

basz commented 6 years ago

docker run -p 8080:8080 --rm arachnysdocker/athenapdf-service:3 -D

then

curl -X GET 'http://localhost:8080/process?uri=https%3A%2F%2Flab.sandalinos.nl%3A443%2Fadmin%2Forders%2Freporting%2Forder-costing%3Fstandalone%3D1%26i%3D11411-13600-18113%26sess%3D67xxx&fetcher=http' > test.pdf && open test.pdf 

which gives log

2018/02/26 16:04:13 Initializing logging reporter
ts=2018-02-26T16:04:13.554318999Z caller=main.go:103 transport=HTTP version=3.0.0 debug=true addr=:8080

2018/02/26 16:04:28 worker=3 queue=0 fetcher=http converter= uploader=
2018/02/26 16:04:28 Reporting span 54722908264776bf:bd6cb306359138f:54722908264776bf:1
2018/02/26 16:04:29 Reporting span 54722908264776bf:2566358f78c271cc:54722908264776bf:1
2018/02/26 16:04:29 Reporting span 54722908264776bf:18fb40919032af49:54722908264776bf:1
2018/02/26 16:04:29 Reporting span 54722908264776bf:54722908264776bf:0:1

MrSaints commented 6 years ago

From the looks of it, the request is relying on the HTTP fetcher (the fetcher=http bit). That means the service downloads the HTML first and then converts it locally, so a relative stylesheet URL such as /css/admin/reporting.css will not resolve. You can either change that URL to include your full host, or get rid of fetcher=http and set mime_type=text/html.

EDIT: I haven't quite figured out a nice UX for handling the conversion of N content types.
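
For the first option (pointing the stylesheet at the full host), here is a minimal Python sketch of how a root-relative href resolves once the page URL is known. It only illustrates the URL arithmetic and is not part of athenapdf. The page URL is the one from the curl command above with its query shortened, and `absolutize` is a hypothetical helper name.

```python
from urllib.parse import urljoin


def absolutize(page_url: str, href: str) -> str:
    """Resolve an href found in a page against that page's own URL.

    Hypothetical helper for illustration; athenapdf does not expose it.
    """
    return urljoin(page_url, href)


# Page URL from this thread, with the query string shortened.
page_url = "https://lab.sandalinos.nl:443/admin/orders/reporting/order-costing?standalone=1"

# The root-relative stylesheet path mentioned earlier in the thread.
stylesheet = absolutize(page_url, "/css/admin/reporting.css")
print(stylesheet)
```

Writing the resolved, absolute URL into the `<link>` tag means the stylesheet can be fetched even when the HTML itself is converted from a local copy.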

basz commented 6 years ago

mime_type=text/html

yes! awesomeness. thank you!
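
For reference, a small Python sketch of the two request URLs discussed above. It assumes only what appears in this thread: the `/process` endpoint on localhost:8080 and the `uri`, `fetcher` and `mime_type` query parameters. Since v3 is undocumented and subject to change, treat it as an illustration rather than a stable API. `build_request` is a hypothetical helper, and the page URL has its query shortened.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint taken from the curl command in this thread.
SERVICE = "http://localhost:8080/process"


def build_request(page_url: str, **params: str) -> str:
    """Build a /process request URL, percent-encoding every parameter.

    Hypothetical helper for illustration only.
    """
    query = urlencode({"uri": page_url, **params})
    return f"{SERVICE}?{query}"


# Page URL from this thread, with the query string shortened.
page_url = "https://lab.sandalinos.nl:443/admin/orders/reporting/order-costing?standalone=1"

# Variant that lost the stylesheet: the HTML is fetched over HTTP, then converted locally.
failing = build_request(page_url, fetcher="http")

# Variant reported to work above: no fetcher, MIME type set explicitly.
working = build_request(page_url, mime_type="text/html")

print(failing)
print(working)
# Decode the working URL again to show the parameters the service would receive.
print(parse_qs(urlsplit(working).query))
```

Either URL can be passed to curl in place of the hand-encoded one used earlier.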

basz commented 6 years ago

There are no v3 docs yet, are there?

MrSaints commented 6 years ago

Unfortunately, no, since it's subject to further changes.