Ajithbalakrishnan closed this issue 4 years ago
@Ajithbalakrishnan I believe you have a typo in `127.0.0.0:1080`, which should be `127.0.0.1:1080`.
@sczhengyabin Thanks for your quick comment. But I have tried every combination and I get the same result.
`python3 image_downloader.py --engine Google --driver chrome_headless --max-number 100 --output ./images --proxy_socks5 127.0.0.1:1080 apple
Scraping From Google Image Search ...
Keywords: apple
Number: 100
Face Only: False
Safe Mode: False
Query URL: https://www.google.com/search?tbm=isch&hl=en&q=apple&safe=off
/home/ajith/miniconda3/lib/python3.7/site-packages/selenium-4.0.0a5-py3.7.egg/selenium/webdriver/remote/webdriver.py:640: UserWarning: find_elements_by_* commands are deprecated. Please use find_elements() instead
  warnings.warn("find_elements_by_* commands are deprecated. Please use find_elements() instead")
Find 0 images.
== 0 out of 0 crawled images urls will be used.
Finished.`
I tried the same with the GUI, but got the same result.
@Ajithbalakrishnan I can download images using exactly the same arguments as yours, so this is more likely a network issue: your network may be too slow, or the proxy server may be returning an internal error. In my tests, whenever my network has trouble reaching Google sites, I get exactly the output you posted.
@sczhengyabin My network is fine. However, I am working on Ubuntu in an Anaconda environment. I hope that is not a problem. I installed the requirements through pip.
@Ajithbalakrishnan Try using chrome mode, so you can watch the browser's actions in Chrome and see where it goes wrong.
@sczhengyabin I tried chrome mode in the GUI. Please watch the result. Chrome popped up for a second and then closed. I also checked the Chrome driver; the version is the same as well.
@Ajithbalakrishnan No clue yet. Does the Bing engine work?
@sczhengyabin Nope. Same result. Chrome is not showing the search results. I checked the internet connection, and my network is good.
@sczhengyabin Please share the dependencies and the versions that you have used.
@Ajithbalakrishnan
requests==2.18.4
selenium==3.141.0
PyQt5==5.14.2
generated using pipreqs
It still looks like a network issue to me, at least for this project.
To verify, you can set up the proxy using 'proxychains' rather than the proxy option in this project.
# config in /etc/proxychains.conf
proxychains python3 image_downloader.py ...
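The version list above shows selenium 3.141.0, while the path in the earlier warning mentions selenium-4.0.0a5, so comparing installed versions may be worth a moment. The following is a minimal sketch, assuming Python 3.8+ (for `importlib.metadata`); the `EXPECTED` table and the helper names are illustrative and not part of Image-Downloader.

```python
from importlib import metadata  # standard library in Python 3.8+

# Versions quoted in this thread (pipreqs output); adjust as needed.
EXPECTED = {"requests": "2.18.4", "selenium": "3.141.0", "PyQt5": "5.14.2"}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(expected):
    """Map each package name to an (expected, installed) tuple."""
    return {name: (want, installed_version(name)) for name, want in expected.items()}


if __name__ == "__main__":
    for name, (want, have) in compare_versions(EXPECTED).items():
        flag = "ok" if want == have else "differs"
        print(f"{name}: expected {want}, installed {have} ({flag})")
```

A differing version is only a hint, not proof that it causes the empty result.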
`proxychains python3 image_downloader.py --engine Google --driver chrome_headless --max-number 100 --output ./images --proxy_socks5 127.0.0.1:1080 apple
ProxyChains-3.1 (http://proxychains.sf.net)
Scraping From Google Image Search ...
Keywords: apple
Number: 100
Face Only: False
Safe Mode: False
Query URL: https://www.google.com/search?tbm=isch&hl=en&q=apple&safe=off
|S-chain|-<>-127.0.0.1:1080-<--timeout
|DNS-request| localhost
|S-chain|-<>-127.0.0.1:1080-<--timeout
|DNS-response|: localhost does not exist
|DNS-request| localhost
|S-chain|-<>-127.0.0.1:1080-<--timeout
|DNS-response|: localhost does not exist
`
I am adding my proxychains config file below.
I tried changing the line "socks4 127.0.0.1 9050" in the proxychains config file to use 127.0.0.1 1080, but it did not help.
@Ajithbalakrishnan
The proxychains conf should be:
socks5 127.0.0.1 1080
If you can use proxychains to download other things, e.g. with apt-get, then it's an issue with Image-Downloader; otherwise something is definitely wrong with your SOCKS5 proxy configuration.
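The `timeout` lines in the proxychains output suggest nothing was answering on 127.0.0.1:1080. A quick way to check that is to open a plain TCP connection to the proxy address. This is a minimal sketch using only the standard library; the function names are illustrative, and a successful connection only shows that something is listening, not that it speaks SOCKS5.

```python
import socket


def parse_proxy(address):
    """Split 'host:port' into (host, port)."""
    host, _, port = address.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {address!r}")
    return host, int(port)


def proxy_is_listening(address, timeout=3.0):
    """Return True if a TCP connection to the proxy address succeeds."""
    host, port = parse_proxy(address)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


if __name__ == "__main__":
    for candidate in ("127.0.0.1:1080", "127.0.0.1:9050"):
        print(candidate, "listening" if proxy_is_listening(candidate) else "not reachable")
```

If neither port is reachable, no proxy option in any tool will work until a SOCKS5 server is actually running.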
@sczhengyabin It's working now. I made some changes in the /etc/proxychains config file.
Then I installed Tor and PySocks in my environment:
sudo apt-get install tor
pip install PySocks
Since the SOCKS5 port has changed, the command is now:
python3 image_downloader.py --engine Google --driver chrome_headless --max-number 100 --output ./images/kerlaflood --proxy_socks5 127.0.0.1:9050 kerlaflood2018
I hope this is helpful for others. Sorry for taking up your valuable time.
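To confirm that the SOCKS5 proxy on port 9050 carries traffic independently of Image-Downloader, a request can be sent through it with `requests` and PySocks. This is a sketch under assumptions: the helper names are made up for illustration, and the `socks5h` scheme (which resolves DNS through the proxy) requires PySocks to be installed.

```python
def socks5_proxies(address, remote_dns=True):
    """Build a requests-style proxies mapping for a SOCKS5 proxy at host:port."""
    # socks5h resolves hostnames on the proxy side; socks5 resolves them locally.
    scheme = "socks5h" if remote_dns else "socks5"
    url = f"{scheme}://{address}"
    return {"http": url, "https": url}


def fetch_status(url, address, timeout=10):
    """Return the HTTP status code for url fetched through the proxy."""
    import requests  # needs: pip install requests PySocks

    response = requests.get(url, proxies=socks5_proxies(address), timeout=timeout)
    return response.status_code


if __name__ == "__main__":
    # Address taken from the working command in this thread.
    print(fetch_status("https://www.google.com", "127.0.0.1:9050"))
```

If this request fails while the proxy port is open, the problem is more likely in the proxy itself than in the downloader.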
@Ajithbalakrishnan It's ok, as long as the problem is solved.
FWIW, I have a similar issue, but only with Google. I think the reason is that Google shows a "before you continue to google" page; that's what I briefly see in the interactive Chrome option before it closes.
Using Bing instead works.
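One way to test the consent-page theory is to inspect the page the browser actually landed on before looking for images. The sketch below is a plain string check; the marker strings are assumptions based on the page described above, not something Image-Downloader defines, and the function name is hypothetical.

```python
# Assumed markers: the heading text quoted in this thread and the
# consent host Google is believed to redirect to.
CONSENT_MARKERS = ("before you continue", "consent.google.com")


def looks_like_consent_page(page_source, current_url=""):
    """Return True if the HTML or URL resembles Google's consent interstitial."""
    haystack = (page_source + " " + current_url).lower()
    return any(marker in haystack for marker in CONSENT_MARKERS)


if __name__ == "__main__":
    sample = "<h1>Before you continue to Google</h1>"
    print(looks_like_consent_page(sample))
```

With Selenium, the two arguments would come from `driver.page_source` and `driver.current_url` after loading the query URL; a positive result would explain why 0 images are found.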
`python3 image_downloader.py --engine Google --driver chrome_headless --max-number 100 --output ./images --proxy_socks5 127.0.0.0:1080 apple
Scraping From Google Image Search ...
Keywords: apple
Number: 100
Face Only: False
Safe Mode: False
Query URL: https://www.google.com/search?tbm=isch&hl=en&q=apple&safe=off
/home/ajith/miniconda3/lib/python3.7/site-packages/selenium-4.0.0a5-py3.7.egg/selenium/webdriver/remote/webdriver.py:640: UserWarning: find_elements_by_* commands are deprecated. Please use find_elements() instead
  warnings.warn("find_elements_by_* commands are deprecated. Please use find_elements() instead")
Find 0 images.
== 0 out of 0 crawled images urls will be used.
Finished.`
I tried with the GUI also, but it doesn't work. Please guide me.