omkarcloud / botasaurus

The All in One Framework to build Awesome Scrapers.
https://www.omkar.cloud/botasaurus/
MIT License
1.16k stars, 104 forks

Can not download driver when running parallel #67

Closed: CryptoDinh closed this issue 1 month ago

CryptoDinh commented 4 months ago

I ran the sample code and got the following error:

selenium.common.exceptions.WebDriverException: Message: 'chromedriver-122' executable needs to be in PATH. Please see https://chromedriver.chromium.org/home
I think the system does not wait for the driver download to finish before starting the parallel browsers, so it raises this exception.

Here is the code example:

from botasaurus import *

@browser(parallel=bt.calc_max_parallel_browsers, block_resources=True, block_images=True, data=["https://www.yahoo.com/", "https://www.google.com", "https://stackoverflow.com/"])
def scrape_heading_task(driver: AntiDetectDriver, data):
    # print("metadata:", metadata)
    print("data:", data)
    # Navigate to the URL passed in as data
    driver.get(data)

    # Retrieve the heading element's text
    heading = driver.text("h1")
    title = driver.title

    # Save the data as a JSON file in output/scrape_heading_task.json
    return {
        "heading": heading,
        "title": title
    }

if __name__ == '__main__':
    scrape_heading_task()

Chetan11-dev commented 1 month ago

Please upgrade the packages with the following command:

python -m pip install bota botasaurus_api botasaurus_driver botasaurus-proxy-authentication botasaurus_server --upgrade

And read the documentation at https://github.com/omkarcloud/botasaurus.
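
For readers curious about the race suspected in the original report (parallel browsers starting before the driver download has finished), here is a minimal, generic sketch of one way to guard a one-time download with a lock. It does not use or reproduce botasaurus internals: `fake_download_driver`, `ensure_driver`, `scrape_all`, and the `/tmp/chromedriver-122` path are hypothetical names made up for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shared state: a lock, the cached driver path, and a log of downloads.
_driver_lock = threading.Lock()
_driver_path = None
download_log = []


def fake_download_driver():
    """Hypothetical stand-in for a slow driver download (not a botasaurus API)."""
    time.sleep(0.2)  # simulate network latency
    download_log.append("downloaded")
    return "/tmp/chromedriver-122"  # assumed path, for illustration only


def ensure_driver():
    """Return the driver path, downloading it at most once across all threads."""
    global _driver_path
    with _driver_lock:
        # Only the first thread to get here downloads; the others wait on the
        # lock and then reuse the cached path.
        if _driver_path is None:
            _driver_path = fake_download_driver()
        return _driver_path


def worker(url):
    """Simulated parallel task: wait for the driver, then 'visit' the URL."""
    return {"url": url, "driver": ensure_driver()}


def scrape_all(urls):
    """Run one worker per URL in parallel threads."""
    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        return list(pool.map(worker, urls))


if __name__ == "__main__":
    for row in scrape_all(["https://www.yahoo.com/", "https://www.google.com"]):
        print(row)
```

Without the lock, each worker could look for the executable while another thread is still fetching it, which would be consistent with the "executable needs to be in PATH" error above. Whether that is what actually happened in the affected version is an assumption.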