opsdisk / pagodo

pagodo (Passive Google Dork) - Automate Google Hacking Database scraping and searching
GNU General Public License v3.0

EXCEPTION: HTTP Error 429: Too Many Requests #55

Closed halimB8 closed 3 years ago

halimB8 commented 3 years ago


I have configured 4 Tor proxies, and my proxychains4 configuration looks like this:

chain_len = 1
remote_dns_subnet 224
tcp_read_time_out 15000
tcp_connect_time_out 8000
socks4 9050
socks4 9060
socks4 9062
socks4 9064

and I run pagodo with this command:

proxychains4 python3 pagodo.py -d domain.com -g dorks/files_containing_juicy_info.dorks -l 50 -s -e 60.0 -j 1.1

But I get an HTTP error on the very first search:

[proxychains] config file found: /etc/proxychains4.conf
[proxychains] preloading /usr/lib/x86_64-linux-gnu/libproxychains.so.4
[proxychains] DLL init: proxychains-ng 4.14
[*] Initiation timestamp: 20210803_101300
[*] Search ( 1 / 938 ) for Google dork [ site:******.com intitle:"Ganglia" "Cluster Report for" ] and waiting 120.85752787276157 seconds between searches using User-Agent 'Mozilla/5.0 (iPad; U; CPU OS 3_2_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B500 Safari/53'
[proxychains] Round Robin chain  ...  ...  www.google.com:443  ...  OK
[proxychains] Round Robin chain  ...  ...  www.google.com:443  ...  OK
[proxychains] Round Robin chain  ...  ...  www.google.com:443  ...  OK
[-] Error with dork: intitle:"Ganglia" "Cluster Report for"
[-] EXCEPTION: HTTP Error 429: Too Many Requests
[*] Google is blocking you, looks like you need to spread out the Google searches.  Don't know how to utilize SSH and dynamic socks proxies?  Do yourself a favor and pick up a copy of The Cyber Plumber's Handbook and interactive lab (https://gumroad.com/l/cph_book_and_lab) to learn all about Secure Shell (SSH) tunneling, port redirection, and bending traffic like a boss.
[*] Search ( 2 / 938 ) for Google dork [ site:*****.com allinurl:/examples/jsp/snp/snoop.jsp ] and waiting 122.82545531944587 seconds between searches using User-Agent 'Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv: Gecko/20100723 SUSE/3.6.8-0.1.1 Firefox/3.6.8'
[proxychains] Round Robin chain  ...  ...  www.google.com:443  ...  OK

Could you please tell me how I can get past these errors? Best regards

opsdisk commented 3 years ago

Thanks for submitting an issue @halimB8

It's hard to pinpoint exactly what's happening and whether the traffic is being routed properly. Here are my thoughts:

I'm leaning towards a misconfigured proxy setup and not an issue with pagodo, but for now, can you provide how you are setting up the Tor proxies? That would assist me in troubleshooting.
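One way to sanity-check the routing is to compare the exit IP seen through each proxy. The sketch below is only an illustration: `exit_ips_by_proxy`, `shared_exits`, and the `get_exit_ip` callable are hypothetical names, and how the exit IP is actually looked up (for example, by requesting an IP-echo service through the proxy) is left to the caller.

```python
from typing import Callable, Dict, List


def exit_ips_by_proxy(
    proxies: List[str],
    get_exit_ip: Callable[[str], str],
) -> Dict[str, str]:
    """Map each proxy URL to the exit IP reported by get_exit_ip (caller-supplied)."""
    return {proxy: get_exit_ip(proxy) for proxy in proxies}


def shared_exits(exit_ips: Dict[str, str]) -> Dict[str, List[str]]:
    """Group proxies by exit IP and keep only the IPs used by more than one proxy."""
    groups: Dict[str, List[str]] = {}
    for proxy, ip in exit_ips.items():
        groups.setdefault(ip, []).append(proxy)
    return {ip: members for ip, members in groups.items() if len(members) > 1}
```

If `shared_exits` returns anything, several proxies are leaving through the same address, so rotating between them would not spread the searches out.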

halimB8 commented 3 years ago

Thanks for your answer @opsdisk,

chain_len = 1
remote_dns_subnet 224
tcp_read_time_out 15000
tcp_connect_time_out 8000
socks4 9050
socks4 9060
socks4 9062
socks4 9064
[proxychains] config file found: /etc/proxychains4.conf
[proxychains] preloading /usr/lib/x86_64-linux-gnu/libproxychains.so.4
[proxychains] DLL init: proxychains-ng 4.14
[proxychains] DLL init: proxychains-ng 4.14
[proxychains] DLL init: proxychains-ng 4.14
[proxychains] DLL init: proxychains-ng 4.14
[proxychains] DLL init: proxychains-ng 4.14
[proxychains] Round Robin chain  ...  ...  google.com:80  ...  OK
[proxychains] Round Robin chain  ...  ...  detectportal.firefox.com:80  ...  OK
[proxychains] Round Robin chain  ...  ...  contile.services.mozilla.com:443 [proxychains] DLL init: proxychains-ng 4.14
 ...  OK
[proxychains] Round Robin chain  ...  ...  www.google.com:443 [proxychains] DLL init: proxychains-ng 4.14
 ...  OK
[proxychains] Round Robin chain  ...  ...  push.services.mozilla.com:443 [proxychains] DLL init: proxychains-ng 4.14
 ...  OK
[proxychains] Round Robin chain  ...  ...  incoming.telemetry.mozilla.org:443  ...  OK
[proxychains] Round Robin chain  ...  ...  firefox.settings.services.mozilla.com:443  ...  OK
[proxychains] Round Robin chain  ...  ...  incoming.telemetry.mozilla.org:443  ...  OK
[proxychains] Round Robin chain  ...  ...  r3.o.lencr.org:80  ...  OK
[proxychains] Round Robin chain  ...  ...  ocsp.pki.goog:80  ...  OK
[proxychains] Round Robin chain  ...  ...  detectportal.firefox.com:80  ...  OK
[proxychains] Round Robin chain  ...  ...  ocsp.digicert.com:80  ...  OK
[proxychains] Round Robin chain  ...  ...  ocsp.digicert.com:80  ...  OK
[proxychains] Round Robin chain  ...  ...  detectportal.firefox.com:80  ...  OK
[proxychains] Round Robin chain  ...  ...  ocsp.digicert.com:80  ...  OK
[proxychains] Round Robin chain  ...  ...  www.gstatic.com:443  ...  OK
[proxychains] Round Robin chain  ...  ...  www.gstatic.com:443  ...  OK
[proxychains] Round Robin chain  ...  ...  ocsp.pki.goog:80  ...  OK
[proxychains] Round Robin chain  ...  ...  adservice.google.com:443  ...  OK
[proxychains] Round Robin chain  ...  ...  googleads.g.doubleclick.net:443  ...  OK
[proxychains] Round Robin chain  ...  ...  www.google.com:443  ...  OK
[proxychains] Round Robin chain  ...  ...  www.google.com:443  ...  OK
[proxychains] Round Robin chain  ...  ...  ocsp.pki.goog:80  ...  OK

opsdisk commented 3 years ago

Thanks for that info @halimB8. I'll see if I can replicate it on my end. It may be a week or two, though, before I can get to it.

opsdisk commented 3 years ago

Hey @halimB8 - in the middle of rewriting a new Google search library right now for pagodo, but wanted to have you check something:

"Even after having HTTP 429 errors I still can access google using my browser" - have you tried executing a Google search through the browser after that? I can browse to google.com all day on Tor, but anytime I try and search, I'll get the reCAPTCHA screen. With the Tor exit nodes being public, I think Google uses that and will squash most searches through Tor without a reCAPTCHA verification (which pagodo can't currently do).

halimB8 commented 3 years ago

Hey @opsdisk Thanks for your answers,

I just tried again, and now I get an error with the dork before the HTTP 429 error, like this:

[-] Error with dork: index.of.secret
[-] EXCEPTION: HTTP Error 429: Too Many Requests
[*] Google is blocking you, looks like you need to spread out the Google searches.  Don't know how to utilize SSH and dynamic socks proxies?  Do yourself a favor and pick up a copy of The Cyber Plumber's Handbook and interactive lab (https://gumroad.com/l/cph_book_and_lab) to learn all about Secure Shell (SSH) tunneling, port redirection, and bending traffic like a boss.

And yes, I tried searching Google in my browser after getting that error, and it didn't ask me for a reCAPTCHA. I even took some dorks from sensitive_directories.dorks and ran them manually in my browser, and they worked fine.

opsdisk commented 3 years ago

As a heads up @halimB8 , I released yagooglesearch yesterday. I rewrote the entire underlying library that powers pagodo. It supports HTTP 429 auto detection/backoff and has native proxy support (https://github.com/opsdisk/yagooglesearch#http-and-socks5-proxy-support). pagodo v2 should be released shortly!
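For readers unfamiliar with the idea, HTTP 429 detection with backoff can be sketched roughly as follows. This is a generic illustration, not the yagooglesearch code: `fetch_with_backoff`, the `fetch` callable, and the parameter names and defaults are all assumptions made for the example.

```python
import time
from typing import Callable, Tuple


def fetch_with_backoff(
    fetch: Callable[[], Tuple[int, str]],
    cool_off_seconds: float = 3600.0,   # assumed default, not taken from any library
    cool_off_factor: float = 1.1,       # assumed multiplier applied after each 429
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch() until it stops returning HTTP 429 or the attempts run out.

    fetch() is a caller-supplied function returning (status_code, body).
    """
    wait = cool_off_seconds
    for attempt in range(1, max_attempts + 1):
        status, body = fetch()
        if status != 429:
            return body
        if attempt == max_attempts:
            break
        # Rate limited: pause, then lengthen the pause for the next 429.
        sleep(wait)
        wait *= cool_off_factor
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the sketch testable without actually waiting; a real caller would leave it at `time.sleep`.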

opsdisk commented 3 years ago

Just released v2! https://github.com/opsdisk/pagodo/releases/tag/v2.0.0

Let me know if you're still running into this issue.

halimB8 commented 3 years ago

Thanks @opsdisk for this fast and great work. I just tried again with the same config. This time I got an error, then a warning that Google is blocking my IP, and then it sleeps for 60 minutes. Here is the command I ran:

proxychains4 python3 pagodo.py -d myDomain.com -g dorks/files_containing_juicy_info.dorks -o -s

And here is what I got:

[proxychains] config file found: /etc/proxychains4.conf
[proxychains] preloading /usr/lib/x86_64-linux-gnu/libproxychains.so.4
[proxychains] DLL init: proxychains-ng 4.14
2021-09-01 17:24:25,472 [MainThread  ] [INFO] Initiation timestamp: 2021-09-01T17:24:25.472163
2021-09-01 17:24:25,472 [MainThread  ] [INFO] Search ( 1 / 942 ) for Google dork [ site:*******.com intitle:"Ganglia" "Cluster Report for" ] using User-Agent 'Mozilla/5.0 (X11; U; Linux i686; de; rv: Gecko/20090722 Gentoo Firefox/3.5.1' through proxy ''
2021-09-01 17:24:25,472 [MainThread  ] [INFO] Requesting URL: https://www.google.com/
[proxychains] Round Robin chain  ...  ...  www.google.com:443 <--socket error or timeout!
2021-09-01 17:24:40,489 [MainThread  ] [ERROR] Error with dork: intitle:"Ganglia" "Cluster Report for"
2021-09-01 17:24:40,489 [MainThread  ] [ERROR] EXCEPTION: HTTPSConnectionPool(host='www.google.com', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f41145ed970>: Failed to establish a new connection: [Errno 111] Connection refused'))
2021-09-01 17:24:40,489 [MainThread  ] [INFO] Sleeping 53.6 seconds before executing the next dork search...
2021-09-01 17:25:34,143 [MainThread  ] [INFO] Search ( 2 / 942 ) for Google dork [ site:*****.com allinurl:/examples/jsp/snp/snoop.jsp ] using User-Agent 'Opera/9.80 (Windows NT 5.2; U; en) Presto/2.2.15 Version/10.00' through proxy ''
2021-09-01 17:25:34,144 [MainThread  ] [INFO] Requesting URL: https://www.google.com/
[proxychains] Round Robin chain  ...  ...  www.google.com:443  ...  OK
2021-09-01 17:25:34,922 [MainThread  ] [INFO] Stats: start=0, num=100, total_valid_links_found=0 / max_search_result_urls_to_return=100
2021-09-01 17:25:34,923 [MainThread  ] [INFO] Requesting URL: https://www.google.com/search?hl=en&q=site%3A*******.com+allinurl%3A%2Fexamples%2Fjsp%2Fsnp%2Fsnoop.jsp&num=100&btnG=Google+Search&tbs=li:1&safe=off&cr=&filter=0
[proxychains] Round Robin chain  ...  ...  www.google.com:443  ...  OK
2021-09-01 17:25:36,135 [MainThread  ] [WARNING] Google is blocking your IP for making too many requests in a specific time period.
2021-09-01 17:25:36,136 [MainThread  ] [INFO] Sleeping for 60 minutes...

opsdisk commented 3 years ago

For grins, can you try using the native proxy support without proxychains4? I want to determine if it's a proxychains4 or Tor issue.

So instead of prepending the command with proxychains4, use:

python pagodo.py -g dorks.txt -p socks5h://,socks5h://,socks5h://,socks5h://

Unless they were updated, I used the proxies you specified here: https://github.com/opsdisk/pagodo/issues/55#issuecomment-893234914
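As an illustration of how a comma-separated proxy list like the one passed to `-p` can be spread across searches, here is a minimal round-robin sketch. It is an assumption about the general technique, not pagodo's actual code; `parse_proxy_arg` and `round_robin` are hypothetical helpers, and `PROXY_HOST` is a placeholder for whatever host the proxies listen on.

```python
from itertools import cycle
from typing import Iterator, List


def parse_proxy_arg(proxy_arg: str) -> List[str]:
    """Split a comma-separated proxy string into a list, dropping blank entries."""
    return [p.strip() for p in proxy_arg.split(",") if p.strip()]


def round_robin(proxies: List[str]) -> Iterator[str]:
    """Yield proxies in order, starting over after the last one."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    return cycle(proxies)


# PROXY_HOST is a placeholder; the ports are the ones from the proxychains config.
proxies = parse_proxy_arg(
    "socks5h://PROXY_HOST:9050,socks5h://PROXY_HOST:9060,"
    "socks5h://PROXY_HOST:9062,socks5h://PROXY_HOST:9064"
)
rotation = round_robin(proxies)
first_five = [next(rotation) for _ in range(5)]  # the fifth entry wraps to the first proxy
```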

halimB8 commented 3 years ago

I think it's a Tor issue, because I just ran the command you asked me to run:

python3 pagodo.py -g dorks/web_server_detection.dorks -d myDOmain.com -p socks5h://,socks5h://,socks5h://,socks5h://

On the very first search I got a warning and it slept for 60 minutes, as you can see here:

2021-09-02 17:48:07,669 [MainThread  ] [INFO] Initiation timestamp: 2021-09-02T17:48:07.669896
2021-09-02 17:48:07,670 [MainThread  ] [INFO] Search ( 1 / 186 ) for Google dork [ site:myDomain.com "Novell, Inc" WEBACCESS Username Password "Version *.*" Copyright -inurl:help -guides|guide ] using User-Agent 'Opera/9.80 (X11; Linux i686; U; en) Presto/2.5.27 Version/10.60' through proxy 'socks5h://'
2021-09-02 17:48:07,670 [MainThread  ] [INFO] Requesting URL: https://www.google.com/
2021-09-02 17:48:09,667 [MainThread  ] [INFO] Stats: start=0, num=100, total_valid_links_found=0 / max_search_result_urls_to_return=100
2021-09-02 17:48:09,667 [MainThread  ] [INFO] Requesting URL: https://www.google.com/search?hl=en&q=site%3AmyDOmain.com+%22Novell%2C+Inc%22+WEBACCESS+Username+Password+%22Version+%2A.%2A%22+Copyright+-inurl%3Ahelp+-guides%7Cguide&num=100&btnG=Google+Search&tbs=li:1&safe=off&cr=&filter=0
2021-09-02 17:48:12,398 [MainThread  ] [WARNING] Google is blocking your IP for making too many requests in a specific time period.
2021-09-02 17:48:12,399 [MainThread  ] [INFO] Sleeping for 60 minutes...

opsdisk commented 3 years ago

The sleeping is because an HTTP 429 was received by pagodo from Google. I haven't been able to set up a Tor test environment to confirm that it's Tor, but that's still my suspicion.

opsdisk commented 3 years ago

You still want me to keep this issue open @halimB8 ?