hardkoded / puppeteer-sharp

Headless Chrome .NET API
https://www.puppeteersharp.com

PdfDataAsync incorrectly rendering HTML from Page #2790

Open g0nP opened 1 month ago

Description

I'm using PuppeteerSharp 20.0.2 on .NET 8 in an AWS Lambda function. The Chromium executable (v127) comes from a Lambda layer.

After launching a browser and creating a page, I set the page content to a custom HTML template using SetContentAsync(...).

I then call page.GetContentAsync() to confirm that the page content is exactly what I expect, and finally I call page.PdfDataAsync(...) to generate a PDF file.

Although PdfDataAsync(...) returns a non-empty byte[] (674 bytes), the resulting PDF is blank.

When I run the same code locally on Windows, it works: PdfDataAsync(...) returns a byte array of 16776 bytes.

Complete minimal example reproducing the issue

string[] launcherArgs =
[
        "--disable-background-timer-throttling",
        "--disable-breakpad",
        "--disable-client-side-phishing-detection",
        "--disable-cloud-import",
        "--disable-default-apps",
        "--disable-dev-shm-usage",
        "--disable-extensions",
        "--disable-gesture-typing",
        "--disable-hang-monitor",
        "--disable-infobars",
        "--disable-notifications",
        "--disable-offer-store-unmasked-wallet-cards",
        "--disable-offer-upload-credit-cards",
        "--disable-popup-blocking",
        "--disable-print-preview",
        "--disable-prompt-on-repost",
        "--disable-setuid-sandbox",
        "--disable-speech-api",
        "--disable-sync",
        "--disable-tab-for-desktop-share",
        "--disable-translate",
        "--disable-voice-input",
        "--disable-wake-on-wifi",
        "--disk-cache-size=33554432",
        "--enable-async-dns",
        "--enable-simple-cache-backend",
        "--enable-tcp-fast-open",
        "--hide-scrollbars",
        "--ignore-gpu-blacklist",
        "--media-cache-size=33554432",
        "--metrics-recording-only",
        "--mute-audio",
        "--no-default-browser-check",
        "--no-first-run",
        "--no-pings",
        "--no-sandbox",
        "--no-zygote",
        "--password-store=basic",
        "--prerender-from-omnibox=disabled",
        "--use-gl=angle",
        "--use-angle=swiftshader",
        "--use-mock-keychain",
        "--single-process"
];

var executablePath = @"[PATH TO CHROMIUM]\chromium";

using var browser = await new Launcher().LaunchAsync(new LaunchOptions
{
     Headless = true,
     ExecutablePath = executablePath,
     AcceptInsecureCerts = true,
     Args = launcherArgs
});

using var page = await browser.NewPageAsync();

var html = "<html><head></head><body>Hello World</body></html>";

await page.SetContentAsync(html);

var content = await page.GetContentAsync();

var pdfContent = await page.PdfDataAsync(new PdfOptions { PreferCSSPageSize = true, PrintBackground = true, Landscape = true });

Expected behavior:

byte[] pdfContent has a length of 16776 bytes, which would indicate that the full HTML has been rendered.

Actual behavior:

byte[] pdfContent has a length of 674 bytes. The resulting PDF file is blank.
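
To compare the Lambda and local results beyond raw size, the returned bytes could be checked for basic PDF framing before they are written out. This is only a minimal sketch: DescribePdf is a hypothetical helper, not part of PuppeteerSharp, and it assumes .NET 5 or later for Encoding.Latin1. It only checks the "%PDF-" header and the "%%EOF" trailer, so it cannot prove that a page actually has visible content.

```csharp
using System;
using System.Text;

// Hypothetical helper (not a PuppeteerSharp API): summarizes whether a buffer
// is framed like a PDF file. It does not inspect page content.
static string DescribePdf(byte[] data)
{
    // Latin1 maps every byte to exactly one char, so binary streams are safe to scan.
    var text = Encoding.Latin1.GetString(data);
    var hasHeader = text.StartsWith("%PDF-", StringComparison.Ordinal);
    var hasTrailer = text.TrimEnd().EndsWith("%%EOF", StringComparison.Ordinal);
    return $"length={data.Length}, header={hasHeader}, eof={hasTrailer}";
}

// Stand-in buffer for illustration; in the repro this would be pdfContent.
var sample = Encoding.ASCII.GetBytes("%PDF-1.4\n%%EOF\n");
Console.WriteLine(DescribePdf(sample));
```

If both environments report a valid header and trailer, the 674-byte result is most likely a well-formed PDF with an empty page rather than a truncated file.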

Versions

PuppeteerSharp 20.0.2
.NET 8 running on a Lambda function
Chromium v127

Additional Information

When running locally, I have to remove the "--single-process" argument; otherwise I get an exception about a closed WebSocket.

When running on AWS (Amazon Linux 2023), I get a timeout if "--single-process" is missing.
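
Because the flag is needed in one environment and breaks the other, the same repro can be run in both places by adding "--single-process" conditionally. The following is a rough sketch, not a recommended configuration: BuildArgs is a hypothetical helper, and it assumes the AWS_LAMBDA_FUNCTION_NAME environment variable is set inside Lambda and absent on the local machine.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: copies the base arguments and appends
// "--single-process" only when requested.
static string[] BuildArgs(IEnumerable<string> baseArgs, bool addSingleProcess)
{
    var result = new List<string>(baseArgs);
    if (addSingleProcess)
    {
        result.Add("--single-process");
    }
    return result.ToArray();
}

// Assumption: this variable is present only when running inside AWS Lambda.
var runningOnLambda = !string.IsNullOrEmpty(
    Environment.GetEnvironmentVariable("AWS_LAMBDA_FUNCTION_NAME"));

// Shortened argument list for illustration; the full list from the repro applies.
var effectiveArgs = BuildArgs(new[] { "--no-sandbox", "--no-zygote" }, runningOnLambda);
Console.WriteLine(string.Join(" ", effectiveArgs));
```

The resulting array can be passed as Args in LaunchOptions, which keeps every other flag identical between the two environments.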