arachnys / athenapdf

Drop-in replacement for wkhtmltopdf built on Go, Electron and Docker
MIT License

How can I debug "The renderer process has crashed"? #195

Closed: slochower closed this issue 5 years ago

slochower commented 5 years ago

I'm consistently getting "The renderer process has crashed" locally and on Travis. I've tried version 2.16.0 and latest via Docker on macOS and Linux hosts. I've also tried increasing the delay up to 20000. None of these things have prevented the renderer from crashing.

I think this may be related to #152 or #131 but I'm not sure how to debug or get further information about what's going on. Is there a way to get verbose output? I do have a bunch of high-ish resolution PNG files, but I don't think this should be a problem and I don't mind if the resulting PDF is large. Prior to this crash, I had a number of separate PNG files that I combined into single images using ImageMagick.

The HTML source is here (although I am not building from URL, I am building from a local copy of that file) and I'm running Athena on Travis here.

slochower commented 5 years ago

This is reproducible with the live demo on https://github.com/arachnys/athenapdf.

slochower commented 5 years ago

Running with the v3 beta and -D flag, the last few bits of output I see are:

$ docker run --rm -v $(pwd) arachnysdocker/athenapdf:3 -D --dry-run https://slochower.github.io/smirnoff-host-guest-manuscript/

[snip]

2019/05/09 20:19:55 dispatching Page.loadEventFired event: {"method":"Page.loadEventFired","params":{"timestamp":9151.344261}}
2019/05/09 20:19:55 13 sending to chrome. {"id":13,"method":"Runtime.evaluate","params":{"expression":"var print = document.querySelectorAll(\"[rel='stylesheet'][media*='print'], style[media*='print']\");\nvar screen = document.querySelectorAll(\"[rel='stylesheet'][media*='screen'], style[media*='screen']\");\n\nif (print.length === 0) {\n    for (var i = 0, l = screen.length; i \u003c l; i++) {\n        screen[i].removeAttribute(\"media\");\n    }\n}\n\nPromise.resolve();\n","awaitPromise":true}}
2019/05/09 20:19:55 13 dispatching
2019/05/09 20:19:55 14 sending to chrome. {"id":14,"method":"Page.printToPDF","params":{"printBackground":true,"scale":1,"paperWidth":8.5,"paperHeight":11,"marginTop":0.4,"marginBottom":0.4,"marginLeft":0.4,"marginRight":0.4}}
2019/05/09 20:19:56

no event recv bound for: Inspector.targetCrashed
2019/05/09 20:19:56 data: {"method":"Inspector.targetCrashed","params":{}}

athenapdf: error: nil response received
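
For anyone wanting to catch this in CI rather than by reading the log, here is a minimal sketch that scans debug output for the crash markers seen in this thread. The helper name renderer_crashed is hypothetical, and whether the -D output goes to stdout or stderr is an assumption, hence the 2>&1 in the usage comment.

```shell
#!/bin/sh
# renderer_crashed is a hypothetical helper name for this example.
# It reads debug output on stdin and exits 0 if a crash marker is present.
renderer_crashed() {
    # -F treats the patterns as fixed strings, -q suppresses output.
    grep -qF \
        -e 'Inspector.targetCrashed' \
        -e 'renderer process has crashed'
}

# Example usage (image tag and flags as quoted in this thread; URL is yours):
#   docker run --rm arachnysdocker/athenapdf:3 -D --dry-run "$URL" 2>&1 \
#       | renderer_crashed && echo "renderer crashed"
```

This only detects the crash; it does not explain it, so it is mainly useful for failing a build early with a clear message.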

dhimmel commented 5 years ago

For debugging purposes, you should be able to replicate this error by pointing Athena at https://slochower.github.io/smirnoff-host-guest-manuscript/v/5c9b6058fb44adba5c97a646ebccc801d566393e/. See https://github.com/manubot/rootstock/pull/210#issuecomment-491064216.

dhimmel commented 5 years ago

> Prior to this crash, I had a number of separate PNG files that I combined into single images using ImageMagick.

@slochower given that Athena was properly converting the document prior to this change, we should be able to narrow down exactly what's breaking it. Sounds like it's struggling with the new PNG?

slochower commented 5 years ago

@dhimmel That's right. There are a bunch of PNGs now -- about 27, but they are mostly <1 MB (except for one image at 3.7 MB), so nothing terribly huge. My only other guess is the number of annotations with Hypothesis, but I believe I have eliminated that by building the HTML without any of the plugins.

slochower commented 5 years ago

I manually removed more than half of the images in the HTML by deleting the <figure> stanzas, and I did finally get a build. I first tried deleting about 10 images and the renderer still crashed. I then deleted a few more, and the renderer still crashed. Finally, I deleted almost all of the images and the build completed. I also deleted a table (by accident). Could this be a memory issue with the renderer?

$ docker run \
    --rm \
    --volume `pwd`/output:/converted/ \
    --security-opt seccomp:unconfined \
    arachnysdocker/athenapdf:2.16.0 \
    athenapdf \
    --delay=20000 \
    --timeout=20000 \
    manuscript.html manuscript.pdf
ATTENTION: default value of option force_s3tc_enable overridden by environment.
Converted 'file:///converted/manuscript.html' to PDF: 'manuscript.pdf'
PDF Conversion: 26562.884ms

I've attached the diff between the version that renders and the one that doesn't: diff.txt

slochower commented 5 years ago

I have confirmed that adding --shm-size="2g" fixes things. Previously, docker stats would report memory usage of ~250 MB out of the 2 GB allocated to Docker. With the new command, docker stats reports memory usage of ~700 MB. I think this is the best explanation of the underlying cause: https://github.com/electron/electron/issues/9093
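
As a rough sketch of what the adjusted invocation could look like, the wrapper below adds --shm-size to the same command quoted earlier in this thread. The function name render_pdf and the SHM_SIZE and DRY_RUN variables are hypothetical conveniences for this example; the docker flags, image tag and file names are the ones from the thread.

```shell
#!/bin/sh
# render_pdf, SHM_SIZE and DRY_RUN are hypothetical names for this sketch.
# The docker flags and image tag are the ones quoted in this thread.
render_pdf() {
    shm_size="${SHM_SIZE:-2g}"

    # With DRY_RUN=1, print the command instead of running it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        runner="echo"
    else
        runner=""
    fi

    # $runner is intentionally unquoted so that it vanishes when empty.
    $runner docker run \
        --rm \
        --shm-size="$shm_size" \
        --volume "$(pwd)/output:/converted/" \
        --security-opt seccomp:unconfined \
        arachnysdocker/athenapdf:2.16.0 \
        athenapdf --delay=20000 --timeout=20000 \
        manuscript.html manuscript.pdf
}
```

Running `DRY_RUN=1 render_pdf` prints the command for inspection, and `SHM_SIZE=1g render_pdf` would try a smaller allocation. Which size is actually sufficient depends on the document, so 2g is only the value that worked here.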

dhimmel commented 5 years ago

Nice debugging, @slochower. Where is the 2 GB memory allocation to Docker coming from? Do you want to create a manubot/rootstock PR to specify --shm-size? Seems like we could get away with a bigger value on Travis.

Volumes are also a potential solution. I see here:

> When executing docker run for an image with Chrome or Firefox please either mount -v /dev/shm:/dev/shm or use the flag --shm-size=2g to use the host's shared memory.

slochower commented 5 years ago

> Where is the 2 GB memory allocation to Docker coming from?

In my case, I think the default macOS Docker memory allocation is 2 GB, but the shared memory (/dev/shm) part is much smaller (the default may be ~64 MB, depending on how outdated my info is). I'm not clear on the benefits of docker run --shm-size="2g" versus mounting /dev/shm. I do know the former works; I haven't tested the latter.
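
One way to compare the two approaches is to check how large /dev/shm actually is from inside the container. The sketch below assumes a POSIX shell with df and awk is available wherever it runs, which has not been verified for the athenapdf image; the helper name shm_size_kb is hypothetical.

```shell
#!/bin/sh
# shm_size_kb is a hypothetical helper for this example.
# It prints the total size, in KiB, of the filesystem holding the given
# path, defaulting to /dev/shm.
shm_size_kb() {
    # -P forces POSIX single-line records, -k reports 1024-byte blocks.
    # The second output line holds the data; field 2 is the total size.
    df -Pk "${1:-/dev/shm}" | awk 'NR == 2 { print $2 }'
}
```

Run inside a container started without any shared-memory options, this should report a value on the order of the ~64 MB mentioned above. With --shm-size=2g or a /dev/shm mount, it should report whatever the flag or the host provides.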

MrSaints commented 5 years ago

That's interesting, because I'd assume https://github.com/arachnys/athenapdf/pull/186 would fix it, but as I am no longer with the team, I am unsure whether it was tested. That said, we have had multiple similar reports in the past (https://github.com/arachnys/athenapdf/issues/95#issuecomment-291099432), and in my experience, mounting /dev/shm often results in fewer crashes.

I'm not sure what your use case is, but speaking impartially, nowadays you can often get away with a serverless converter, that is, an HTML-to-PDF converter running on something like AWS Lambda or Google Cloud Functions. Athena is in need of an upgrade, and the existing setup with Electron adds quite a bit of unnecessary overhead. v3 was supposed to be the way forward, but it is highly unstable, and I do not encourage using it in production.