QianyanTech / Image-Downloader

Download images from Google, Bing, Baidu.
MIT License
2.15k stars, 561 forks

Error when downloading pics using chrome #46

Open qiuzhewei opened 2 years ago

qiuzhewei commented 2 years ago

Hi, the following error occurs when I try to run the script:

selenium.common.exceptions.WebDriverException: Message: unknown error: cannot find Chrome binary

Any help would be appreciated!

caop-kie commented 2 years ago

You first need to install the Chrome browser on your operating system; the selenium package drives an installed browser and cannot find a Chrome binary otherwise.
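As a quick check before running the script, something like the following can tell you whether a Chrome executable is visible on your PATH. This is only a rough sketch: the function name find_chrome_binary and the list of candidate executable names are my own assumptions, not part of Image-Downloader or selenium, and Chrome on Windows or macOS is often installed somewhere that is not on PATH at all.

```python
import shutil

# Hypothetical list of executable names to try; adjust for your system.
CANDIDATE_NAMES = [
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
    "chrome",
]


def find_chrome_binary(candidates=CANDIDATE_NAMES):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)  # None when the name is not on PATH
        if path is not None:
            return path
    return None


chrome_path = find_chrome_binary()
if chrome_path is None:
    print("No Chrome/Chromium executable found on PATH")
else:
    print("Found browser at:", chrome_path)
```

If the browser is installed but in a non-standard location, selenium's Chrome Options object has a binary_location attribute that can be set to the full path of the executable.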

madiskoivopuu commented 1 year ago

For anyone still having this issue, the problem lies in the regex that parses the image URL. It captures extra junk, which breaks the image link. To fix it, change the body of the google_image_url_from_webpage function in crawler.py to this:

    # (line 121)
    # Make sure crawler.py imports re, unquote (from urllib.parse)
    # and By (from selenium.webdriver.common.by).

    image_elements = driver.find_elements(By.CLASS_NAME, "islib")
    image_urls = list()
    # Explanation of the pattern:
    #   \S -> match any non-whitespace character
    #   *? -> match the previous token (\S) zero or more times, lazily,
    #         i.e. stop at the first & and not the last one
    url_pattern = r"imgurl=\S*?&"

    for image_element in image_elements[:max_number]:
        outer_html = image_element.get_attribute("outerHTML")
        re_group = re.search(url_pattern, outer_html)
        if re_group is not None:
            # Strip the leading "imgurl=" and the trailing "&", then URL-decode
            image_url = unquote(re_group.group()[len("imgurl=") : -len("&")])
            image_urls.append(image_url)
    return image_urls
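To see what the lazy quantifier changes, here is a small standalone sketch that runs the same pattern on a made-up outerHTML string, with no browser involved. The sample HTML, the example.com URLs, and the helper name extract_image_url are assumptions for illustration only. The greedy pattern is included purely for contrast and is not necessarily what crawler.py used before.

```python
import re
from urllib.parse import unquote

LAZY_PATTERN = r"imgurl=\S*?&"   # stops at the first "&"
GREEDY_PATTERN = r"imgurl=\S*&"  # runs on to the last "&" (for contrast only)


def extract_image_url(outer_html, pattern=LAZY_PATTERN):
    """Return the decoded imgurl value from outer_html, or None if absent."""
    match = re.search(pattern, outer_html)
    if match is None:
        return None
    # Drop the leading "imgurl=" and the trailing "&", then URL-decode.
    return unquote(match.group()[len("imgurl="):-len("&")])


# Made-up markup resembling a search-result link (hypothetical).
sample_html = (
    '<a href="/imgres?imgurl=https%3A%2F%2Fexample.com%2Fcat.jpg'
    '&amp;imgrefurl=https%3A%2F%2Fexample.com%2F&amp;h=100">'
)

print(extract_image_url(sample_html))
# → https://example.com/cat.jpg

# The greedy variant drags the following query parameters along.
print(extract_image_url(sample_html, GREEDY_PATTERN))
```

With the greedy pattern, the result still contains the imgrefurl parameter after the image URL, which is the kind of extra junk that breaks the download link.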