go-rod / rod

A Chrome DevTools Protocol driver for web automation and scraping.
https://go-rod.github.io
MIT License

Exporting the full page as PDF (single page) #1027

Open jamesbuddrige opened 3 months ago

jamesbuddrige commented 3 months ago

Rod Version: v0.114.8

Is there a way to export the full page as a PDF that contains only a single PDF page?

So far I've tried modifying the viewport and setting the paper height:

Example:

    page := browser.Context(ctx).MustPage()
    err := page.SetViewport(&proto.EmulationSetDeviceMetricsOverride{
        Width:  2000,
        Height: 3000,
    })
    if err != nil {
        return err
    }

    // Emulate the "print" media type so print styles apply
    err = proto.EmulationSetEmulatedMedia{
        Media:    "print",
        Features: []*proto.EmulationMediaFeature{},
    }.Call(page)
    if err != nil {
        return err
    }

    // Dynamically determine the content height (MustEval panics on failure)
    res := page.MustEval(`() => document.body.scrollHeight`)
    contentHeight := res.Num()

    // Convert the content height to inches for PDF generation (assuming 96 DPI)
    paperHeightInches := contentHeight / 96

    marginTop := 0.0
    marginBottom := 0.0
    marginLeft := 0.0
    marginRight := 0.0

    // Generate the PDF with dynamic dimensions
    reportPdf, err := page.PDF(&proto.PagePrintToPDF{
        MarginTop:       &marginTop,
        MarginBottom:    &marginBottom,
        MarginLeft:      &marginLeft,
        MarginRight:     &marginRight,
        PrintBackground: true,
        PaperHeight:     &paperHeightInches, // Content height in inches
        PageRanges:      "1-1",              // Only the first page is printed
        //PreferCSSPageSize: true,
    })
    if err != nil {
        return err
    }

    // Read the PDF stream fully, then write it to the target filesystem
    bin, err := afero.ReadAll(reportPdf)
    if err != nil {
        return err
    }
    err = afero.WriteFile(fs, "report.pdf", bin, 0644)

This is something I am able to achieve in Puppeteer:

    await page.setViewport({width: 2000, height: 3000});
    const pdfOptions : PDFOptions = {
        printBackground: true,
        height: `${height} px`,
        width: `${width} px`,
        scale,
        pageRanges: "1"
    };

ysmood commented 3 months ago

Per the ISO 32000 standard for PDF, the page dimension limit is 14,400 PDF units in each direction. A PDF unit is 1/72 of an inch, so the limit equates to a maximum page size of 200 x 200 inches (5080 x 5080 mm).

If the content is too long to fit within that limit, increasing the paper height won't help. The page can also contain page breaks, which may affect the result.