HollowMan6 / mdbook-pdf

A backend for mdBook, written in Rust, for generating PDFs using headless Chrome and the Chrome DevTools Protocol.
https://crates.io/crates/mdbook-pdf
GNU General Public License v3.0

Configurable timeout #34

Closed (Zabuzard closed this 10 months ago)

Zabuzard commented 11 months ago

Hello there! 👋

I am running into a timeout issue after 10 minutes, i.e. 600 seconds as configured in main.rs#L183.

The book (I cannot disclose it) is roughly 500 pages long and has multiple larger images. I suspect the timeout is just a bit too small, since it worked fine when there was less content.

Would it be possible to expose the timeout setting as a configurable option in the book.toml? That would be awesome 🙂

HollowMan6 commented 10 months ago

I made a commit to make the timeout configurable: https://github.com/HollowMan6/mdbook-pdf/commit/c5c48f2235c3ee3bfea805ced83c0a771dac4dd0 Expect to have it in the next release. For now, run cargo install --git https://github.com/HollowMan6/mdbook-pdf, set timeout under output.pdf in your book.toml, and try again.
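For readers who want to see the idea in code, here is a minimal Rust sketch of a configurable timeout with a fallback. It is not the code from the linked commit: `resolve_timeout` and `DEFAULT_TIMEOUT_SECS` are hypothetical names, `configured` stands for whatever value was read from the timeout key under output.pdf, and the unit is assumed to be seconds, matching the 600-second default mentioned in this issue.

```rust
use std::time::Duration;

/// Default taken from the issue text: 600 seconds (10 minutes).
const DEFAULT_TIMEOUT_SECS: u64 = 600;

/// Hypothetical helper, not mdbook-pdf's actual code.
/// `configured` is the optional `timeout` value from the `output.pdf`
/// table of book.toml, assumed here to be given in seconds.
fn resolve_timeout(configured: Option<u64>) -> Duration {
    // Fall back to the default when the key is absent.
    Duration::from_secs(configured.unwrap_or(DEFAULT_TIMEOUT_SECS))
}
```

Calling `resolve_timeout(None)` yields the 10-minute default, while `resolve_timeout(Some(1800))` would allow 30 minutes.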

BTW, are you sure Chrome didn't crash in your case? A 10-minute wait is not reasonable at all.

Zabuzard commented 10 months ago

The timeout option works 👍 Sadly, still crashing, as you assumed.

Not sure why. I will try to find the commit on my side that made it break. Is there any way to get extended debugging output showing what Chrome was doing, or why it failed or perhaps crashed?

HollowMan6 commented 10 months ago

> Not sure why. I will try to find the commit on my side that made it break. Is there any way to get extended debugging output showing what Chrome was doing, or why it failed or perhaps crashed?

I would guess it's an OOM kill, since you mentioned that it worked just fine when there was less content. I'm not sure if there's any method for debugging this (Chrome crashes are rare). Maybe you can monitor the Chrome process, or you can open print.html directly in Chrome and see if it's OK.
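One way to monitor the Chrome process on Linux is to poll its resident memory from /proc while the PDF is being generated. The sketch below is only an illustration under that assumption: `parse_vm_rss_kb` and `rss_kb_of` are hypothetical helper names, the process id has to be found by other means, and nothing here is part of mdbook-pdf or headless_chrome.

```rust
use std::fs;

/// Extracts the resident set size (in kB) from the text of
/// /proc/<pid>/status, which contains a line like "VmRSS:   204800 kB".
fn parse_vm_rss_kb(status: &str) -> Option<u64> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("VmRSS:"))
        .and_then(|rest| rest.split_whitespace().next())
        .and_then(|kb| kb.parse().ok())
}

/// Reads the current VmRSS of a process (Linux only).
/// Returns None if the process is gone or the field is missing.
fn rss_kb_of(pid: u32) -> Option<u64> {
    let status = fs::read_to_string(format!("/proc/{pid}/status")).ok()?;
    parse_vm_rss_kb(&status)
}
```

Calling `rss_kb_of` in a loop with a short sleep shows whether memory climbs steadily before the process disappears, which would be consistent with an OOM kill.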

Zabuzard commented 10 months ago

I think you are right. I removed all images temporarily and now it works within seconds again. Doing it manually with Chrome works just fine though (takes 30 seconds or so).

Is there any way to give the headless Chrome process more memory, or is my only option reducing the images? It's 400 images, 200 MB in total. Without images, the PDF file is 40 KB. With just the first 50 of 400 pages (including images), it's already 400 MB.

HollowMan6 commented 10 months ago

I think you can either reduce the images or increase your host memory/virtual memory/swap size, as the OOM kill is managed by the system.

Zabuzard commented 10 months ago

I've cut down the images from 200 MB to 15 MB and now it works again. I'm still surprised that it handles that so poorly, though, especially since Chrome can deal with it when printing manually. 🤷

Thanks for your assistance though, very helpful! 👍 🎉

HollowMan6 commented 10 months ago

I guess it's because of the overhead in how headless_chrome sends the generated PDF back (base64 string conversion plus the WebSocket overhead).
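To put a rough number on the base64 part: padded base64 turns every 3 bytes into 4 characters, so the encoded string is about a third larger than the PDF itself. The Rust sketch below only does that arithmetic. The function names are hypothetical, and the assumption that the encoded string and the decoded PDF exist in memory at the same time is an illustration, not a measurement of headless_chrome.

```rust
/// Length of the padded base64 encoding of `raw_len` bytes:
/// each group of up to 3 input bytes becomes 4 output characters.
fn base64_encoded_len(raw_len: u64) -> u64 {
    (raw_len + 2) / 3 * 4
}

/// Illustrative lower bound on memory (in bytes) if the base64 string
/// and the decoded PDF are both held at once. Any extra copies made by
/// the WebSocket or JSON layers would come on top of this.
fn peak_bytes_estimate(pdf_len: u64) -> u64 {
    base64_encoded_len(pdf_len) + pdf_len
}
```

For a 400 MB PDF (400,000,000 bytes), `base64_encoded_len` gives 533,333,336 bytes, and holding both copies at once comes to roughly 933 MB before any further buffering.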

There was an effort in headless_chrome to use a pipe instead: https://github.com/rust-headless-chrome/rust-headless-chrome/issues/202 I believe this could greatly improve the OOM issue here. If you are interested, you can help with that one; no one seems to be working on it currently.