QianyanTech / Image-Downloader

Download images from Google, Bing, Baidu.
MIT License

error #29

Closed daixiangzi closed 4 years ago

daixiangzi commented 4 years ago

raise TooManyRedirects('Exceeded %s redirects.' % self.max_redirects, response=resp)

ald2004 commented 4 years ago

Baidu changed something recently, so the maintainer needs to update the code. In crawler.py, in baidu_get_image_url_using_api, add a headers argument to res = requests.get(init_url, proxies=proxies):

headers = {
    'Accept-Encoding': 'gzip, deflate, sdch',
    'Accept-Language': 'en-US,en;q=0.8',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
}
init_url = "https://image.baidu.com/search/acjson?tn=resultjson_com&ipn=rj&ct=201326592&lm=7&fp=result&ie=utf-8&oe=utf-8&st=-1&word=%25E7%258E%25A9%25E6%2589%258B%25E6%259C%25BA&queryWord=%25E7%258E%25A9%25E6%2589%258B%25E6%259C%25BA&face=0&pn=0&rn=30"

-195 res = requests.get(init_url, proxies=proxies)
+196 res = requests.get(init_url, proxies=proxies, headers=headers)
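
For reference, the `word` and `queryWord` values in the URL above look like a search keyword that has been percent-encoded twice (`%25` is an encoded `%`). The snippet below is a minimal sketch of that encoding using only the standard library. The helper name `double_quote` is made up for illustration, and whether crawler.py builds the value this way is an assumption.

```python
from urllib.parse import quote, unquote


def double_quote(keyword: str) -> str:
    """Percent-encode a keyword twice, so each '%' itself becomes '%25'."""
    return quote(quote(keyword))


# The value of `word` copied from the URL in the comment above.
encoded = "%25E7%258E%25A9%25E6%2589%258B%25E6%259C%25BA"

# Decoding twice recovers the original search keyword.
keyword = unquote(unquote(encoded))

# Encoding the keyword twice reproduces the value from the URL.
assert double_quote(keyword) == encoded
```

Here `keyword` decodes to 玩手机, so the URL in the comment is a fixed example query rather than a general template.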

mapattacker commented 4 years ago

Unfortunately, it does not work for me...


Exceeded 30 redirects.
Exceeded 30 redirects.
Exceeded 30 redirects.
Exceeded 30 redirects.
Exceeded 30 redirects.
Exceeded 30 redirects.
Exceeded 30 redirects.

== 0 out of 0 crawled images urls will be used.

mapattacker commented 4 years ago

Ok, there's another line at 215 that needs to be changed. So overall this will work:

-195 res = requests.get(init_url, proxies=proxies)
+195 res = requests.get(init_url, proxies=proxies, headers=headers)

-215 response = requests.get(url, proxies=proxies)
+215 response = requests.get(url, proxies=proxies, headers=headers) 
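
Since the first attempt failed only because one of the two `requests.get` calls was missed, one way to avoid that is to put the headers on a shared `requests.Session`, so every request made through it carries them. This is a sketch, not the code from the repository: `make_session` and `fetch` are hypothetical names, the headers are a subset of those posted earlier in this thread, and it assumes the crawler could route both calls through one session.

```python
import requests

# A subset of the browser-like headers posted earlier in this thread.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36"
    ),
    "Accept": (
        "text/html,application/xhtml+xml,application/xml;q=0.9,"
        "image/webp,*/*;q=0.8"
    ),
    "Accept-Language": "en-US,en;q=0.8",
}


def make_session(headers=None):
    """Return a Session whose requests all carry the given headers."""
    session = requests.Session()
    session.headers.update(headers or BROWSER_HEADERS)
    return session


def fetch(session, url, proxies=None, timeout=10):
    """GET a URL through the shared session; return None on a redirect loop."""
    try:
        return session.get(url, proxies=proxies, timeout=timeout)
    except requests.exceptions.TooManyRedirects as exc:
        # Raised when the redirect chain exceeds session.max_redirects
        # (30 by default, which matches "Exceeded 30 redirects.").
        print(f"Redirect loop for {url}: {exc}")
        return None
```

With something like this, the requests at lines 195 and 215 would both go through `fetch(session, ...)`, so a newly added call cannot silently omit the headers.
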
Yueziyu commented 4 years ago

Could you tell me where exactly this change should be added?

mapattacker commented 4 years ago

Refer to my earlier follow-up: line 215 also needs to change.

-215 response = requests.get(url, proxies=proxies)
+215 response = requests.get(url, proxies=proxies, headers=headers) 
sczhengyabin commented 4 years ago

Fixed in 7013bfdceb78c1d7ad52da9e17b9b361811c389a. @ald2004 @mapattacker Thanks for the fix.