kelciour / batch-download-pictures-from-google-images


No image comes for me #16

Closed DiamondNg closed 11 months ago

DiamondNg commented 1 year ago

    import re
    import json
    import requests

    from bs4 import BeautifulSoup

    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.67 Safari/537.36"
    }

    query = "cat"

    r = requests.get("https://www.google.com/search?tbm=isch&q={}&safe=active".format(query), headers=headers, timeout=15)

    print('-----------------------')
    print(r.status_code)
    print('-----------------------')

    html = r.text

    print('-----------------------')
    print(html)
    print('-----------------------')

    # First attempt: older result pages keep image metadata in div.rg_meta elements.
    soup = BeautifulSoup(html, "html.parser")
    rg_meta = soup.find_all("div", {"class": "rg_meta"})
    metadata = [json.loads(e.text) for e in rg_meta]
    results = [d["ou"] for d in metadata]

    # Fallback: pull the JSON arrays passed to AF_initDataCallback({...}).
    if not results:
        regex = re.escape("AF_initDataCallback({")
        regex += r'[^<]*?data:[^<]*?' + r'(\[[^<]+\])'

        for txt in re.findall(regex, html):
            data = json.loads(txt)

            try:
                for d in data[31][0][12][2]:
                    try:
                        results.append(d[1][3][0])
                    except Exception as e:
                        pass
            except Exception as e:
                pass

            if not results:
                try:
                    for d in data[56][1][0][0][1][0]:
                        try:
                            d = d[0][0]["444383007"]
                            results.append(d[1][3][0])
                        except:
                            pass
                except:
                    pass

    print('-----------------------')
    print(' IMAGES ')
    print('-----------------------')
    print('\n\n'.join(results))
    print('-----------------------')
    print('Found Images:', len(results))

200


<!DOCTYPE html PUBLIC "-//WAPFORUM//DTD XHTML Mobile 1.0//EN" "http://www.wapforum.org/DTD/xhtml-mobile10.dtd">cat - Google Search
Next >

Ulu Bernam Timur, Perak - From your IP address - Learn more


IMAGES


Found Images: 0
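
The output above begins with an XHTML Mobile 1.0 DOCTYPE, which suggests Google answered with its legacy mobile results page rather than the page the script knows how to parse. That would be consistent with a 200 status and zero images. A minimal sketch for checking this on a saved response, assuming these three markers are sufficient signals; `diagnose_response` is a hypothetical helper and not part of the add-on:

```python
def diagnose_response(html):
    """Report which markers the test script depends on appear in the page."""
    lowered = html.lower()
    return {
        # The DTD URL seen in the output above contains "xhtml-mobile10.dtd".
        "legacy_mobile_page": "xhtml-mobile" in lowered,
        # Marker used by the first parsing strategy (div.rg_meta).
        "has_rg_meta": "rg_meta" in html,
        # Marker used by the regex fallback.
        "has_af_init_data": "AF_initDataCallback({" in html,
    }
```

Calling `diagnose_response(r.text)` after the request shows at a glance whether either parsing strategy had anything to work with.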

DiamondNg commented 1 year ago

This is on Anki version 2.1.63 (f356f177).

kelciour commented 1 year ago

Try updating the User-Agent as in this commit: https://github.com/kelciour/batch-download-pictures-from-google-images/commit/1b91b8b00563b150d34d510aa91431a8c4c9b883
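
A minimal sketch of how a different User-Agent could be tried in the test script from the first comment. The User-Agent string below is an illustrative assumption and is not the value from the linked commit; `build_search_request` and `NEWER_USER_AGENT` are hypothetical names. The helper only builds the URL and headers, so nothing is sent until they are passed to `requests.get`.

```python
from urllib.parse import urlencode

# Assumed example of a more recent desktop Chrome User-Agent.
# This is NOT the string from the linked commit.
NEWER_USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)


def build_search_request(query, user_agent=NEWER_USER_AGENT):
    """Return the (url, headers) pair for the image search request."""
    # urlencode escapes the query, so multi-word searches stay valid.
    params = {"tbm": "isch", "q": query, "safe": "active"}
    url = "https://www.google.com/search?" + urlencode(params)
    headers = {"User-Agent": user_agent}
    return url, headers
```

In the script this would replace the hard-coded `headers` dict: `url, headers = build_search_request("cat")`, followed by `requests.get(url, headers=headers, timeout=15)`.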