ChromeDevTools / devtools-protocol

Chrome DevTools Protocol
https://chromedevtools.github.io/devtools-protocol/

How to get a stream using Page.printToPDF method? #308

Open baofeidyz opened 9 months ago

baofeidyz commented 9 months ago

I am trying to convert an HTML page to a PDF file, but the page contains very large images and the conversion is slow. To work around this, I tried calling Page.printToPDF with the transferMode parameter set to ReturnAsStream.

However, result['stream'] is always the string '1'. I don't understand why this happens or how to read the PDF data from it. Any assistance would be greatly appreciated.

Chrome version: 121.0.6167.85 (x86_64)
OS version: macOS 13.5.2

I am using Python with Selenium, and the code is as follows:

import time

from selenium import webdriver

options = webdriver.ChromeOptions()
# options.add_argument("--headless=new")
# options.add_argument("--disable-gpu")
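# Pass options by keyword; older Selenium 4 releases treat the first positional argument as the driver path.
driver = webdriver.Chrome(options=options)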
driver.get("https://nodejs.org/api/fs.html")
scroll_distance = 200
scroll_interval = 0.1
current_scroll_position = driver.execute_script("return window.scrollY;")
num_scrolls = int(
    (driver.execute_script("return document.body.scrollHeight;") - current_scroll_position) / scroll_distance)

for i in range(num_scrolls):
    print(f'scroll pages {i + 1}/{num_scrolls}')
    driver.execute_script(f"window.scrollBy(0, {scroll_distance});")
    time.sleep(scroll_interval)

driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
time.sleep(5)
# https://chromedevtools.github.io/devtools-protocol/tot/Page/#method-printToPDF
result = driver.execute_cdp_cmd(
    "Page.printToPDF",
    {
        "printBackground": True,
        "generateTaggedPDF": True,
        "transferMode": "ReturnAsStream"
    })
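pdf_stream = result['stream']

# Problem: pdf_stream is always the string '1' here rather than PDF bytes,
# so the loop below never writes a valid PDF.
with open('demo.pdf', 'ab') as pdf_file:
    chunk_size = 1024
    for chunk in pdf_stream:
        pdf_file.write(chunk)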
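
A likely explanation, going by the protocol documentation: when transferMode is ReturnAsStream, the stream field of the Page.printToPDF reply is a stream handle (an opaque string such as '1'), not the PDF data itself. The bytes are then fetched with repeated IO.read calls and the handle is released with IO.close. The following is a minimal sketch under that assumption. The names drain_cdp_stream and send are hypothetical helpers introduced here for illustration, not part of Selenium or the protocol, and the sketch has not been verified against Chrome 121.

```python
import base64
from typing import BinaryIO, Callable


def drain_cdp_stream(send: Callable[[str, dict], dict], handle: str,
                     out: BinaryIO, chunk_size: int = 1 << 20) -> int:
    """Copy a CDP stream into `out` using IO.read; return the bytes written.

    `send` is any callable that issues a CDP command and returns its reply
    as a dict (assumption: Selenium's driver.execute_cdp_cmd fits this shape).
    """
    written = 0
    try:
        while True:
            reply = send("IO.read", {"handle": handle, "size": chunk_size})
            data = reply.get("data", "")
            # IO.read flags binary payloads with base64Encoded.
            if reply.get("base64Encoded"):
                chunk = base64.b64decode(data)
            else:
                chunk = data.encode("utf-8")
            out.write(chunk)
            written += len(chunk)
            # The final reply may still carry data, so check eof after writing.
            if reply.get("eof"):
                break
    finally:
        # Release the handle even if a read fails.
        send("IO.close", {"handle": handle})
    return written
```

With the script above, this could be called as drain_cdp_stream(driver.execute_cdp_cmd, result['stream'], pdf_file), with demo.pdf opened in 'wb' mode rather than 'ab' so that repeated runs do not append to an earlier file.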