QianyanTech / Image-Downloader

Download images from Google, Bing, Baidu.
MIT License

Not downloading any images #24

Closed: Ajithbalakrishnan closed this issue 4 years ago

Ajithbalakrishnan commented 4 years ago

`python3 image_downloader.py --engine Google --driver chrome_headless --max-number 100 --output ./images --proxy_socks5 127.0.0.0:1080 apple

Scraping From Google Image Search ...

Keywords: apple
Number: 100
Face Only: False
Safe Mode: False
Query URL: https://www.google.com/search?tbm=isch&hl=en&q=apple&safe=off
/home/ajith/miniconda3/lib/python3.7/site-packages/selenium-4.0.0a5-py3.7.egg/selenium/webdriver/remote/webdriver.py:640: UserWarning: find_elements_by_* commands are deprecated. Please use find_elements() instead
  warnings.warn("find_elements_by_* commands are deprecated. Please use find_elements() instead")
Find 0 images.

== 0 out of 0 crawled images urls will be used.

Finished.`

I also tried the GUI, but it doesn't work either. Please guide me.

sczhengyabin commented 4 years ago

@Ajithbalakrishnan I believe you have a typo in '127.0.0.0:1080', which should be '127.0.0.1:1080'.

Ajithbalakrishnan commented 4 years ago

@sczhengyabin Thanks for your quick reply. I have tried every combination and I get the same result.

`python3 image_downloader.py --engine Google --driver chrome_headless --max-number 100 --output ./images --proxy_socks5 127.0.0.1:1080 apple

Scraping From Google Image Search ...

Keywords: apple
Number: 100
Face Only: False
Safe Mode: False
Query URL: https://www.google.com/search?tbm=isch&hl=en&q=apple&safe=off
/home/ajith/miniconda3/lib/python3.7/site-packages/selenium-4.0.0a5-py3.7.egg/selenium/webdriver/remote/webdriver.py:640: UserWarning: find_elements_by_* commands are deprecated. Please use find_elements() instead
  warnings.warn("find_elements_by_* commands are deprecated. Please use find_elements() instead")
Find 0 images.

== 0 out of 0 crawled images urls will be used.

Finished.`

I tried the same with the GUI and got the same results.

sczhengyabin commented 4 years ago

@Ajithbalakrishnan I can download images using exactly the same arguments as yours. It's more likely a network issue: maybe your network is too slow, or the proxy server has an internal error. In my tests, when my network has trouble reaching Google, I get exactly the same output as the one you posted.

Ajithbalakrishnan commented 4 years ago

@sczhengyabin My network is fine, but I am working on Ubuntu in an Anaconda environment. I hope that is not a problem. I installed the requirements through pip.

sczhengyabin commented 4 years ago

@Ajithbalakrishnan Try using chrome mode, where you can watch the actions in the Chrome browser and see where it goes wrong.

Ajithbalakrishnan commented 4 years ago

@sczhengyabin I tried chrome mode in the GUI; please see the result. Chrome popped up for a second and then closed. I also checked the Chrome driver, and the version is the same.
Screenshot from 2020-04-26 21-31-30

sczhengyabin commented 4 years ago

@Ajithbalakrishnan No clue yet. Does the Bing engine work?

Ajithbalakrishnan commented 4 years ago

@sczhengyabin Nope, same result. Chrome does not show the search results. I checked my internet connection, and the network is good.
Screenshot from 2020-04-27 00-39-32

@sczhengyabin Please share the dependencies and the versions that you used.

sczhengyabin commented 4 years ago

@Ajithbalakrishnan

requests==2.18.4
selenium==3.141.0
PyQt5==5.14.2

generated using pipreqs
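For comparing a local environment against the list above, here is a minimal sketch using the standard-library `importlib.metadata` module (Python 3.8 or newer, which is an assumption; the log earlier in this thread shows Python 3.7 with `selenium-4.0.0a5`, not `selenium==3.141.0`). The helper name `installed_versions` is hypothetical and not part of Image-Downloader.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Versions quoted in this thread; a reference point, not a hard requirement.
    reference = {"requests": "2.18.4", "selenium": "3.141.0", "PyQt5": "5.14.2"}
    for name, found in installed_versions(reference).items():
        print(f"{name}: installed={found} reference={reference[name]}")
```

A mismatch does not prove a version is the cause of the problem; it only narrows down what differs between two setups.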

It still looks like a network issue to me, at least for this project.

To verify, you can set up the proxy using 'proxychains' rather than the proxy option in this project.

# config in /etc/proxychains.conf
proxychains python3 image_downloader.py ...

Ajithbalakrishnan commented 4 years ago

`proxychains python3 image_downloader.py --engine Google --driver chrome_headless --max-number 100 --output ./images --proxy_socks5 127.0.0.1:1080 apple
ProxyChains-3.1 (http://proxychains.sf.net)

Scraping From Google Image Search ...

Keywords: apple
Number: 100
Face Only: False
Safe Mode: False
Query URL: https://www.google.com/search?tbm=isch&hl=en&q=apple&safe=off
|S-chain|-<>-127.0.0.1:1080-<--timeout
|DNS-request| localhost
|S-chain|-<>-127.0.0.1:1080-<--timeout
|DNS-response|: localhost does not exist
|DNS-request| localhost
|S-chain|-<>-127.0.0.1:1080-<--timeout
|DNS-response|: localhost does not exist
`

I am attaching my proxychains config file below.

proxychains.zip

I tried changing the line "socks4 127.0.0.1 9050" in the proxychains config file to use 127.0.0.1 1080, but it made no difference.

sczhengyabin commented 4 years ago

@Ajithbalakrishnan The proxychains conf entry should be socks5 127.0.0.1 1080. If you can use proxychains to download other things, e.g. with apt-get, then it's an issue with Image-Downloader; otherwise there is definitely something wrong with your socks5 proxy configuration.
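The `timeout` entries in the proxychains output suggest that nothing usable was answering on 127.0.0.1:1080. As a rough way to check this independently of both proxychains and Image-Downloader, the sketch below sends the SOCKS5 greeting defined in RFC 1928 and looks for the "no authentication" reply. The function name `socks5_reachable` is hypothetical, and a proxy that requires authentication would return a different reply and be reported as unreachable here.

```python
import socket


def socks5_reachable(host, port, timeout=3.0):
    """Return True if host:port answers a SOCKS5 greeting with "no auth".

    Sends the RFC 1928 greeting (version 5, one method, method 0) and
    expects the two-byte reply 0x05 0x00.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"\x05\x01\x00")
            reply = conn.recv(2)
    except OSError:
        # Covers refused connections and timeouts alike.
        return False
    return reply == b"\x05\x00"


if __name__ == "__main__":
    # Ports mentioned in this thread: 1080 (original setting) and 9050 (Tor default).
    for port in (1080, 9050):
        print(f"127.0.0.1:{port} speaks SOCKS5: {socks5_reachable('127.0.0.1', port)}")
```

If this prints `False` for the port passed to `--proxy_socks5`, the problem lies with the proxy itself rather than with the downloader.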

Ajithbalakrishnan commented 4 years ago

@sczhengyabin It's working now. I made some changes in the /etc/proxychains config file:

  1. Changed strict chain to dynamic chain
  2. Added one more line at the end: socks5 127.0.0.1 9050

Then I installed Tor and PySocks in my environment.

    sudo apt-get install tor
    pip install PySocks

Since the SOCKS5 port has changed, the command becomes:

python3 image_downloader.py --engine Google --driver chrome_headless --max-number 100 --output ./images/kerlaflood --proxy_socks5 127.0.0.1:9050 kerlaflood2018

Hope this is helpful for others. Sorry for taking up your valuable time.
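To confirm that the Tor SOCKS5 port from the setup above actually carries traffic to Google before running the downloader, something like the following may help. It is a sketch, assuming `requests` and `PySocks` are installed; the helper names `socks5_proxies` and `fetch_status` are hypothetical. The `socks5h` scheme asks the proxy to resolve hostnames, which avoids local DNS lookups.

```python
def socks5_proxies(address, remote_dns=True):
    """Build a requests-style proxies mapping for a "host:port" SOCKS5 address.

    With remote_dns=True the "socks5h" scheme is used, so hostnames are
    resolved by the proxy rather than locally.
    """
    scheme = "socks5h" if remote_dns else "socks5"
    url = f"{scheme}://{address}"
    return {"http": url, "https": url}


def fetch_status(url, address, timeout=15):
    """Return the HTTP status code of url fetched through the SOCKS5 proxy."""
    import requests  # third-party; SOCKS support also needs PySocks

    response = requests.get(url, proxies=socks5_proxies(address), timeout=timeout)
    return response.status_code


if __name__ == "__main__":
    query_url = "https://www.google.com/search?tbm=isch&hl=en&q=apple&safe=off"
    print(fetch_status(query_url, "127.0.0.1:9050"))
```

An exception or a long hang here points at the proxy or the network, while a normal status code suggests the remaining problem is on the browser or scraper side.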

sczhengyabin commented 4 years ago

@Ajithbalakrishnan It's ok, as long as the problem is solved.

lucidBrot commented 1 year ago

FWIW, I have a similar issue, but only with Google. I think the reason is that Google shows a "Before you continue to Google" page; that's what I briefly see in the interactive Chrome option before it closes.

Using Bing instead works.
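One way to tell this case apart from a proxy failure is to inspect what the browser actually loaded. The heuristic below is only a sketch: the phrase comes from the comment above, while the `consent.google.` hostname fragment is an assumption about where such a page is served, and `looks_like_consent_page` is a hypothetical helper, not part of Image-Downloader.

```python
def looks_like_consent_page(url, page_text):
    """Heuristic: True if the loaded page appears to be a consent interstitial.

    Checks for an assumed consent hostname fragment in the URL, or for the
    "before you continue" phrase anywhere in the page text (case-insensitive).
    """
    url = url.lower()
    text = page_text.lower()
    return "consent.google." in url or "before you continue" in text


if __name__ == "__main__":
    sample = "<title>Before you continue to Google Search</title>"
    print(looks_like_consent_page("https://www.google.com/search?q=apple", sample))
```

With Selenium, the two arguments could be taken from `driver.current_url` and `driver.page_source` right after the search page loads.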