ultrafunkamsterdam / undetected-chromedriver

Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)
https://github.com/UltrafunkAmsterdam/undetected-chromedriver
GNU General Public License v3.0
9.92k stars 1.16k forks

[Nodriver] CDP DownloadWillBegin event not working #2060

Open abhash-rai opened 6 days ago

abhash-rai commented 6 days ago

Hello. I want to write a nodriver script that detects downloads and reports their status. I tried to print "Download started..." whenever a download begins in a tab, but the message never appears, which leads me to believe the event handler in my script is not set up correctly. I would also be grateful if anyone could show how to detect when a started download has completed. Thanks in advance!

import asyncio
import time
import nodriver as uc
from nodriver import cdp

download_status = False

def listen_download(page):
    async def handler(evt):
        global download_status
        download_status = True
        print("Download started...")  # Add logging

    page.add_handler(cdp.browser.DownloadWillBegin, handler)

async def crawl():
    global download_status

    browser = await uc.start(headless=False)

    # Use the main tab
    tab = await browser.get('about:blank')

    listen_download(tab)

    # Navigate to the download URL (an installer, despite the variable name)
    pdf_url = "https://www.python.org/ftp/python/3.13.0/python-3.13.0-amd64.exe"
    print(f"Navigating to {pdf_url}...")
    await tab.get(pdf_url)

    # Keep the script running to monitor the download
    print("Monitoring downloads... Press Ctrl+C to exit.")

    while True:
        if download_status:
            print('Hurray')
        await asyncio.sleep(0.2)  # Keep the event loop alive

if __name__ == '__main__':
    uc.loop().run_until_complete(crawl())

ultrafunkamsterdam commented 5 days ago

You need handlers for both cdp.page.DownloadWillBegin and cdp.page.DownloadProgress.
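
A minimal sketch of what handling both events could look like. The `DownloadTracker` class and the `attach` helper are hypothetical names introduced for illustration. The event fields used (`guid`, `url`, `suggested_filename`, `received_bytes`, `total_bytes`, `state`) follow the CDP Page domain, and the wiring has not been tested against a particular nodriver version.

```python
from dataclasses import dataclass, field
from types import SimpleNamespace


@dataclass
class DownloadTracker:
    """Hypothetical helper: keeps download state keyed by the CDP download GUID."""
    downloads: dict = field(default_factory=dict)

    def on_begin(self, event):
        # DownloadWillBegin carries guid, url and suggested_filename.
        self.downloads[event.guid] = {
            "url": event.url,
            "filename": event.suggested_filename,
            "state": "inProgress",
            "received": 0,
            "total": 0,
        }

    def on_progress(self, event):
        # DownloadProgress carries guid, received_bytes, total_bytes and state
        # ("inProgress", "completed" or "canceled" in the CDP specification).
        entry = self.downloads.setdefault(event.guid, {})
        # Accept either a plain string or an enum-like object for the state.
        entry["state"] = getattr(event.state, "value", event.state)
        entry["received"] = event.received_bytes
        entry["total"] = event.total_bytes

    def is_complete(self, guid):
        return self.downloads.get(guid, {}).get("state") == "completed"


def attach(tab, tracker):
    """Register both handlers on a nodriver tab (assumes nodriver is installed)."""
    from nodriver import cdp  # imported here so the tracker works without nodriver

    tab.add_handler(cdp.page.DownloadWillBegin, tracker.on_begin)
    tab.add_handler(cdp.page.DownloadProgress, tracker.on_progress)


if __name__ == "__main__":
    # Exercise the tracker with stand-in events instead of a real browser.
    tracker = DownloadTracker()
    tracker.on_begin(SimpleNamespace(guid="g1", url="https://example.com/f.bin",
                                     suggested_filename="f.bin"))
    tracker.on_progress(SimpleNamespace(guid="g1", received_bytes=10,
                                        total_bytes=10, state="completed"))
    print(tracker.is_complete("g1"))  # True
```

In a real script, `attach(tab, tracker)` would be called once per tab before navigating, and completion would be checked with `tracker.is_complete(guid)`.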

abhash-rai commented 4 days ago

Thanks, this worked:

import asyncio
import nodriver as uc
from nodriver import cdp

binded_tabs = []  # tabs that already have a download handler attached

async def bind_handlers(browser):
    # Poll the browser's tab list and attach the handler to each tab once.
    while True:
        await asyncio.sleep(0.01)
        for tab in browser.tabs:
            if tab not in binded_tabs:
                tab.add_handler(cdp.page.DownloadWillBegin, lambda event: print('Download event => %s' % event.guid))
                binded_tabs.append(tab)

async def crawl():

    browser = await uc.start(headless=False)

    asyncio.create_task(bind_handlers(browser))

    await browser.get("https://www.python.org/ftp/python/3.13.0/python-3.13.0-amd64.exe")
    await browser.get("https://code.visualstudio.com/sha/download?build=stable&os=win32-x64-user", new_tab=True)

    while True:
        await asyncio.sleep(0.2)  # Keep the event loop alive

if __name__ == '__main__':
    uc.loop().run_until_complete(crawl())

However, clicking a download button sometimes opens an entirely new tab, and the download starts from there. In that case the download is not detected.

I want to attach handlers to every tab, both those already open and those opened later. How can I do this? Is there a CDP event for that as well? I checked cdp.browser, which has a DownloadWillBegin event class, but when I use cdp.browser.DownloadWillBegin in the code above, the handler (here a basic lambda that logs the event) is never called.

My aim is to detect downloads at the browser level, across every tab that is open now or will be opened later.
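
One direction that might be worth trying, sketched below and not verified against nodriver: in the CDP specification, `Browser.setDownloadBehavior` has an `eventsEnabled` flag that defaults to false, so the Browser-domain download events are not emitted unless it is switched on. The sketch assumes that `browser.connection` is the browser-wide connection and that handlers registered on it receive those events. The function name `watch_browser_downloads` and its parameters are hypothetical.

```python
import asyncio


async def watch_browser_downloads(browser, download_path, on_begin, on_progress):
    """Ask Chrome to emit Browser-domain download events and listen for them
    on the browser-level connection rather than on individual tabs."""
    from nodriver import cdp  # local import: only needed when wiring a real browser

    conn = browser.connection  # assumption: the browser-wide CDP connection
    conn.add_handler(cdp.browser.DownloadWillBegin, on_begin)
    conn.add_handler(cdp.browser.DownloadProgress, on_progress)
    # Browser.setDownloadBehavior emits events only when eventsEnabled is true;
    # the "allow" behavior requires a download path.
    await conn.send(
        cdp.browser.set_download_behavior(
            behavior="allow",
            download_path=download_path,
            events_enabled=True,
        )
    )
```

It would be awaited once, right after `uc.start(...)` and before any navigation, for example with `on_begin=lambda e: print(e.guid)`. Whether this covers tabs opened later is the assumption to check first.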