alvarcarto / url-to-pdf-api

Web page PDF/PNG rendering done right. Self-hosted service for rendering receipts, invoices, or any content.
MIT License

Fails to navigate on a non-.com #33

Closed thebetterjort closed 6 years ago

thebetterjort commented 6 years ago

We have an internal site that I'm trying to grab PDFs from on the fly. The app works fine with any public URL, but not with our internal one.

2017-10-12T14:48:19.108Z - info: [pdf-core.js] Set browser viewport..
2017-10-12T14:48:19.109Z - info: [pdf-core.js] Emulate @media screen..
2017-10-12T14:48:19.109Z - info: [pdf-core.js] Goto url https://cef.erwf.nin.asn/ ..
2017-10-12T14:48:21.395Z - error: [pdf-core.js] Error when rendering page: Error: Failed to navigate: https://cef.erwf.nin.asn/
2017-10-12T14:48:21.396Z - error: [pdf-core.js] Error: Failed to navigate: https://cef.erwf.nin.asn/
    at Page.goto (/usr/src/app/node_modules/puppeteer/lib/Page.js:390:13)
    at <anonymous>
2017-10-12T14:48:21.396Z - info: [pdf-core.js] Closing browser..
2017-10-12T14:48:21.407Z - error: [error-logger.js] Request headers: host=localhost:9000, user-agent=Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:56.0) Gecko/20100101 Firefox/56.0, accept=text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8, accept-language=en-US,en;q=0.5, accept-encoding=gzip, deflate, connection=keep-alive, upgrade-insecure-requests=1
2017-10-12T14:48:21.407Z - error: [error-logger.js] Request parameters:
2017-10-12T14:48:21.407Z - error: [error-logger.js] Request body:
2017-10-12T14:48:21.408Z - error: [error-logger.js] Error: Failed to navigate: https://cef.erwf.nin.asn/
    at Page.goto (/usr/src/app/node_modules/puppeteer/lib/Page.js:390:13)
    at <anonymous> 'Error: Failed to navigate: https://cef.erwf.nin.asn/\n    at Page.goto (/usr/src/app/node_modules/puppeteer/lib/Page.js:390:13)\n    at <anonymous>'
GET /api/render?url=https://cef.erwf.nin.asn/ 500 2484.461 ms - -

kimmobrunfeldt commented 6 years ago

I'm not sure why this happens. You could try the ignoreHttpsErrors=true query parameter.
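
As a rough illustration, the request with that parameter could be built as below. This is only a sketch: it assumes the service is listening on localhost:9000 and serving /api/render, as the log above suggests, and buildRenderUrl is a hypothetical helper, not part of url-to-pdf-api.

```javascript
// Hypothetical helper (not part of url-to-pdf-api): builds a render request URL.
// Assumes the API exposes GET /api/render and reads options from the query string.
function buildRenderUrl(baseUrl, targetUrl, options = {}) {
  const endpoint = new URL('/api/render', baseUrl);
  // The page to render goes in the "url" query parameter; it is percent-encoded for us.
  endpoint.searchParams.set('url', targetUrl);
  // Any extra options (e.g. ignoreHttpsErrors) are appended as query parameters.
  for (const [key, value] of Object.entries(options)) {
    endpoint.searchParams.set(key, String(value));
  }
  return endpoint.toString();
}

// Host and port are taken from the request headers in the log; adjust for your setup.
const renderUrl = buildRenderUrl(
  'http://localhost:9000',
  'https://cef.erwf.nin.asn/',
  { ignoreHttpsErrors: true }
);
console.log(renderUrl);
```

Opening the printed URL in a browser, or fetching it with any HTTP client, sends the same kind of request as the one in the log, with HTTPS errors ignored during navigation. If the internal site uses a self-signed or privately issued certificate, this may be enough to get past the navigation failure.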

You can also try starting it with HEADED=true npm start. This will spawn a visible Chrome window when a request is made. Note: all requests will return a 500 Internal Server Error, because PDF rendering is not supported in headed mode in Chrome (see https://github.com/GoogleChrome/puppeteer/issues/576). From the Chrome UI, you can see what's happening.

kimmobrunfeldt commented 6 years ago

The HEADED env var was removed and DEBUG_MODE was added, so you could try DEBUG_MODE=true npm start. It will also leave the browser open after a request.

kimmobrunfeldt commented 6 years ago

Closed due to inactivity.