chromedp / chromedp

A faster, simpler way to drive browsers supporting the Chrome DevTools Protocol.
MIT License

converting html to pdf is not in correct format. #1396

Open rohit121308 opened 9 months ago

rohit121308 commented 9 months ago

What versions are you running?

$ go list -m github.com/chromedp/chromedp
github.com/chromedp/chromedp  v0.9.3
$ google-chrome --version
119.0.6045.123
$ go version
go version go1.20.6 linux/amd64

What did you do? Include clear steps.

I want to capture the HTML of a page whose dynamic content is rendered from an API. The captured HTML is correct, but the PDF generated from it is not laid out properly. This is the code I am using:

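package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "sync"
    "time"

    "github.com/chromedp/cdproto/page"
    "github.com/chromedp/chromedp"
)

func main() {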
    t1 := time.Now()
    ctx, cancel := chromedp.NewContext(context.Background())
    defer cancel()

    url := "url_from_where_you want to capture html"

    var dataHTML string

    if err := chromedp.Run(ctx,
        chromedp.Navigate(url),
        chromedp.WaitVisible("#root", chromedp.ByQuery),
    ); err != nil {
        log.Fatal(err)
    }
    fmt.Println("Page loaded successfully")

    // Sleep for a brief period to ensure the data rendering is complete (adjust timing as needed)
    time.Sleep(2 * time.Second)

    // Capture the HTML content of the element containing the rendered API data
    if err := chromedp.Run(ctx,
        chromedp.OuterHTML("#root", &dataHTML, chromedp.NodeVisible),
    ); err != nil {
        log.Fatal(err)
    }

    fmt.Println("Captured HTML content", dataHTML)

    pdfBytes := CreatePdfInBytes(ctx, dataHTML)
    _ = pdfBytes 
    t2 := time.Now()
    fmt.Println(t2.Sub(t1))
}

func CreatePdfInBytes(ctx context.Context, html string) []byte {
    var wg sync.WaitGroup
    var pdfBuf []byte

    navigate := chromedp.Navigate("about:blank")

    eventLoader := chromedp.ActionFunc(func(ctx context.Context) error {
        loaderctx, cancel := context.WithCancel(ctx)
        chromedp.ListenTarget(loaderctx, func(event interface{}) {
            if _, ok := event.(*page.EventLoadEventFired); ok {
                wg.Done()
                cancel()
            }
        })
        return nil
    })

    setDocContent := chromedp.ActionFunc(func(ctx context.Context) error {
        frameTree, err := page.GetFrameTree().Do(ctx)
        if err != nil {
            return err
        }
        return page.SetDocumentContent(frameTree.Frame.ID, html).Do(ctx)

    })

    loaderWg := chromedp.ActionFunc(func(ctx context.Context) error {
        wg.Wait()
        return nil
    })

    genPdf := chromedp.ActionFunc(func(ctx context.Context) error {
        buf, _, err := page.PrintToPDF().WithMarginTop(0.3).WithMarginBottom(0.3).WithPrintBackground(true).Do(ctx)
        if err != nil {
            return err
        }
        pdfBuf = buf
        return nil
    })

    wg.Add(1)
    err := chromedp.Run(ctx, navigate, eventLoader, setDocContent, loaderWg, genPdf)
    if err != nil {
        fmt.Println(err)
    }

    file, err := os.Create("abc.pdf")
    if err != nil {
        panic(err)
    }
    defer file.Close()
    // Write the PDF data to the file
    _, err = file.Write(pdfBuf)
    if err != nil {
        panic(err)
    }
    return pdfBuf
}

What did you expect to see?

This is how the page should look: https://nimb.ws/seJrxv

What did you see instead?

This is the result I am getting in the PDF: https://nimb.ws/q24yRQ, https://nimb.ws/EufcAJ

ZekeLu commented 9 months ago

Please note that when the page is printed to a PDF file, it is rendered with a different width than in the browser window. And a web page lays out differently at different page widths.

rohit121308 commented 9 months ago

@ZekeLu can you please tell me what I should do? I want to create the PDF, and I have already tried adjusting the page margins...

ZekeLu commented 9 months ago

This is more about how to design the page so that the layout is what you want when it's printed to a PDF file. I cannot tell what to do without seeing your real page.

Generally, you can preview your design in the print preview dialog. For example, this is what this page looks like in the print preview dialog:

image

dpanic commented 8 months ago

You can start with this.

@page {
    size: A4;
    margin: 1.1em 0;
}

@media print {
    html,
    body {
        width: 210mm;
        font-size: 14px;
    }
}