kohheepeace / mr-pdf

Generate PDF for document website.
https://www.npmjs.com/package/mr-pdf
MIT License
125 stars, 41 forks

Race condition with loading images #37

Open mattisdada opened 3 years ago

mattisdada commented 3 years ago

Currently, if a page contains several large images, PDF generation races against image loading. As a result, some images may be only partially loaded in the output.

I've got a workaround that I've applied locally, but it would be good for a native solution:

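    await page.evaluate(async () => {
        // Scroll down by one viewport height.
        window.scrollBy(0, window.innerHeight);
        const images = Array.from(document.querySelectorAll("img"));
        // Wait until every image on the page has finished loading.
        await Promise.all(images.map(img => {
            // Already loaded (or already failed): nothing to wait for.
            if (img.complete) return;
            return new Promise((resolve, reject) => {
                img.addEventListener('load', resolve);
                // A failed image rejects, so page.evaluate throws.
                img.addEventListener('error', reject);
            });
        }));
    });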

This is applied after the content has been injected back into the page but before the final PDF rendering.

A blind timeout would also work, but this has worked reasonably well for me so far.
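The event-based wait and the blind timeout mentioned above can also be combined, so that one image that never finishes does not stall PDF generation indefinitely. The following is only a minimal sketch, not part of mr-pdf: `waitForImages` and `timeoutMs` are hypothetical names, and unlike the workaround above it treats a failed image as settled instead of rejecting.

```javascript
// Hypothetical helper (not part of mr-pdf or Puppeteer).
// Resolves once every image has settled (loaded or failed),
// or after timeoutMs milliseconds, whichever comes first.
function waitForImages(images, timeoutMs) {
  const settled = Promise.all(
    images.map((img) => {
      // Already finished loading (or already failed): nothing to wait for.
      if (img.complete) return Promise.resolve('complete');
      return new Promise((resolve) => {
        img.addEventListener('load', () => resolve('load'), { once: true });
        // Assumption: a broken image should not abort the whole PDF.
        img.addEventListener('error', () => resolve('error'), { once: true });
      });
    })
  );

  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve('timeout'), timeoutMs);
  });

  // Whichever finishes first wins; always clear the pending timer.
  return Promise.race([settled, timeout]).finally(() => clearTimeout(timer));
}
```

Because `page.evaluate` serializes the function it is given, a helper like this would have to be defined inside the evaluated function (or otherwise made available in the page) and called with something like `Array.from(document.images)`.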

juniorbotelho commented 3 years ago

What do you think of this approach? https://github.com/kohheepeace/mr-pdf/pull/36

mattisdada commented 3 years ago

I suspect that this will still not reliably work. It will depend on the performance of the cache, which was the problem I had.

The document I was generating was pretty image heavy. Although the cache seemed to be warm during the navigation phase, the images were loaded inconsistently when Puppeteer generated the final PDF (the process reinjects all the content at the end). But I am no Puppeteer expert, so I really cannot be confident in that guess.

Maybe the cache actually performs well and the current implementation of waitfor was not working quite right. In my node_modules copy, this was used during page navigation:

    console.log();
    console.log(chalk.cyan(`Retrieving html from ${nextPageURL}`));
    console.log();
    // Go to the page specified by nextPageURL
    await page.goto(`${nextPageURL}`, {
        waitUntil: 'networkidle0',
        timeout: 0,
    });

Beluk commented 2 years ago

With v1.0.8, images still have problems. I also tried the --waitForRender switch, but then an error is raised: [image]