arachnys / athenapdf

Drop-in replacement for wkhtmltopdf built on Go, Electron and Docker
MIT License

PDF conversion getting timed out #153

Open aravinthchandramouli opened 6 years ago

aravinthchandramouli commented 6 years ago

I get the following response when trying to convert an HTML page to PDF: {"error":"conversion timed out"}. The HTML page itself loads fine.

djackson-saa commented 5 years ago

I am experiencing something very similar. When I attempt to convert http://blog.arachnys.com/ to a PDF, I get "The server didn't respond in time." However, if I hit the base URL of my host, I get a status of "online".

MrSaints commented 5 years ago

You can try increasing the timeout, but I reckon this is related to: https://github.com/arachnys/athenapdf/issues/125. In the case of the blog, it is probably because of the large images. It also doesn't help that this converter is fairly bulky (relying on Electron, and running with Xvfb).

Out of curiosity, what is your [conversion] workflow like? I ask because in many cases as of late, it is far simpler, and more reliable to run conversions through Function-as-a-Service (serverless). You do not really need a full-blown microservice that is always-on (though it depends).

djackson-saa commented 5 years ago

Thanks for the quick response - much appreciated. I increased my timeout to 60 seconds and it still timed out. I am running the microservice inside of an OpenShift container. I also replaced the URL with https://efdsearch.senate.gov/search/home/ to see if that would make a difference but it timed out as well.

We have several Python web applications that create PDFs using pdfkit and wkhtmltopdf. We want to see if we can create a microservice to do this for us so we don't have to include those packages in every project.
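For readers in the same situation, here is a minimal sketch of what a Python application might do in place of calling pdfkit directly. It assumes the microservice exposes a `/convert` endpoint that takes `auth` and `url` query parameters, and that errors come back as JSON like the `{"error":"conversion timed out"}` response quoted at the top of this thread. The endpoint path, the parameter names, and the helper names `build_convert_url`, `describe_error`, and `fetch_pdf` are illustrative assumptions; check them against the project's README before relying on them.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def build_convert_url(base_url, target_url, auth_key):
    """Build the conversion request URL (endpoint and parameter names are assumed)."""
    query = urllib.parse.urlencode({"auth": auth_key, "url": target_url})
    return f"{base_url.rstrip('/')}/convert?{query}"


def describe_error(body):
    """Pull the message out of a JSON error body such as {"error": "..."}."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Not JSON (for example a proxy's HTML error page): return it as-is.
        return body.strip()
    if isinstance(payload, dict) and "error" in payload:
        return str(payload["error"])
    return body.strip()


def fetch_pdf(base_url, target_url, auth_key, timeout=120):
    """Request a conversion and return the PDF bytes, or raise RuntimeError."""
    url = build_convert_url(base_url, target_url, auth_key)
    try:
        # The client-side timeout should be at least as long as the
        # service's own conversion timeout, or the client gives up first.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as exc:
        message = describe_error(exc.read().decode("utf-8", "replace"))
        raise RuntimeError(f"conversion failed ({exc.code}): {message}") from exc
```

With something like this, each application only needs the service's base URL and key, not wkhtmltopdf itself. Note that the client timeout and the service timeout are separate settings; raising only one of them may not change the outcome.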

MrSaints commented 5 years ago

Okay, that sounds quite suspect. It may be a configuration issue, e.g. not enough resources or privilege. I am certainly able to convert said URLs locally using the latest stable tagged release.

> We want to see if we can create a microservice to do this for us so we don't have to include those packages in every project.

That sounds sensible. If it is as simple as that, perhaps you can get away with a FaaS like I mentioned earlier. Example: https://us-central1-courtsite-terraform.cloudfunctions.net/pdfByURL?url=http://blog.arachnys.com/ (I'll keep this running for about an hour just so you can test).

The code and instructions for running the above are at https://github.com/Courtsite/shuttlepdf. This assumes that latency is not a huge issue. It is arguably slower in some cases, but it is more reliable and scalable.

djackson-saa commented 5 years ago

Let me chat with our OpenShift administrators to see if there is something I am missing. BTW, I created the container from the arachnysdocker/athenapdf-service Docker image.

Unfortunately we are not allowed to use anything cloud-related at the moment. I know, I know - believe me we are fighting the good fight on that front.

That being said, I think it will boil down to something I did wrong or some permissions issue with the OpenShift cluster.

MrSaints commented 5 years ago

> Unfortunately we are not allowed to use anything cloud-related at the moment. I know, I know - believe me we are fighting the good fight on that front.

Ah I see, fair enough. Best of luck 😁

djackson-saa commented 5 years ago

Sorry to keep bothering you, but does the microservice require any special permissions in order to convert URLs to PDFs? I want to be sure I give our admins the right information.

MrSaints commented 5 years ago

If running in a standard Docker set-up, it does not really need any additional flags. I would double check the resource constraints placed on it, and the logs of the container to get an idea of what is happening inside. Ideally, give it 512MB to 1GB of RAM, and a decent amount of CPU, maybe 0.5 - 1.
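As a rough way to check the memory side of that advice from inside a running container, the sketch below reads the cgroup memory limit and compares it with the 512MB lower bound mentioned above. It assumes Python is available in the container (it may not be in the stock image, in which case reading the same files with `cat` gives the same numbers) and that the limit is exposed at one of the two standard cgroup paths. The function names are illustrative.

```python
from pathlib import Path

# Standard locations of the memory limit under cgroup v2 and cgroup v1.
CGROUP_V2_LIMIT = Path("/sys/fs/cgroup/memory.max")
CGROUP_V1_LIMIT = Path("/sys/fs/cgroup/memory/memory.limit_in_bytes")

# Lower bound suggested in the comment above (512MB).
MIN_RECOMMENDED_BYTES = 512 * 1024 * 1024


def parse_limit(raw):
    """Return the limit in bytes, or None when cgroup v2 reports 'max' (unlimited)."""
    value = raw.strip()
    if value == "max":
        return None
    return int(value)


def read_memory_limit():
    """Read the container's memory limit; None means unlimited or not found."""
    for path in (CGROUP_V2_LIMIT, CGROUP_V1_LIMIT):
        if path.exists():
            return parse_limit(path.read_text())
    return None


def is_enough(limit_bytes, minimum=MIN_RECOMMENDED_BYTES):
    """True when there is no limit or the limit meets the suggested minimum."""
    return limit_bytes is None or limit_bytes >= minimum
```

Calling `is_enough(read_memory_limit())` inside the container gives a quick yes/no. On cgroup v1 an unlimited container reports a very large number rather than `max`, which still passes the check. CPU limits are exposed differently and are not covered here.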

djackson-saa commented 5 years ago

Last question.... Does the service create the PDF locally in the Docker container or does it send the request to an external service?

BTW I was able to get the PDF generation working locally using the microservice and CLI. So I just have to figure out why running the microservice in an OpenShift container is timing out.

MrSaints commented 5 years ago

> Last question.... Does the service create the PDF locally in the Docker container or does it send the request to an external service?

The service (weaver) creates the PDF within the container by invoking the CLI tool (athenapdf). External requests are only made to download the target HTML, and to upload to S3 if the option for this is enabled. It might be worth checking if the service is able to talk to the internet.
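One simple way to test outbound connectivity is to attempt a TCP connection to the target site from a pod on the same cluster network. The sketch below does this with the standard library only. It assumes Python is available wherever it is run, and `can_reach` is an illustrative name, not part of athenapdf.

```python
import socket


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts alike.
        return False
```

For example, `can_reach("blog.arachnys.com", 80)` returning False from inside the cluster, while the same call succeeds locally, would point at an egress or DNS restriction instead of the converter itself. This only tests a direct TCP connection; if the cluster routes traffic through an HTTP proxy, the result can differ from what the service experiences.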

If you have any more questions, feel free to reach out to me directly at os@fyianlai.com. Happy to help.