mazag586 / metagoofil

Automatically exported from code.google.com/p/metagoofil
GNU General Public License v2.0

I can't run 2.2 correctly in China #11

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. I ran this script in China:
> metagoofil.py -d swu.edu.cn -t doc -l 20 -n 20 -o test -f test.html
Output:

******************************************************
*     /\/\   ___| |_ __ _  __ _  ___   ___  / _(_) | *
*    /    \ / _ \ __/ _` |/ _` |/ _ \ / _ \| |_| | | *
*   / /\/\ \  __/ || (_| | (_| | (_) | (_) |  _| | | *
*   \/    \/\___|\__\__,_|\__, |\___/ \___/|_| |_|_| *
*                         |___/                      *
* Metagoofil Ver 2.2                                 *
* Christian Martorella                               *
* Edge-Security.com                                  *
* cmartorella_at_edge-security.com                   *
******************************************************
['doc']

[-] Starting online search...

[-] Searching for doc files, with a limit of 20
        Searching 100 results...
Results: 0 files found
Starting to download 20 of them:
----------------------------------------

processing
user
email

[+] List of users found:
--------------------------

[+] List of software found:
-----------------------------

[+] List of paths and servers found:
---------------------------------------

[+] List of e-mails found:
----------------------------

2. I tried modifying discovery/googlesearch.py, changing:
self.server="www.google.com"
self.hostname="www.google.com"
to:
self.server="www.google.com.hk"
self.hostname="www.google.com.hk"

Re-running step 1 gives this output:
....
['doc']

[-] Starting online search...

[-] Searching for doc files, with a limit of 20
_

This time the output stops here and never continues. (I hope this is understandable; sorry for my poor English!)

I debugged the code and found that execution blocks on this line, but I don't know why:
discovery/googlesearch.py:27  self.results = h.getfile().read()

It looks as though Google returns too many results.

3. So I reduced the page size:
 discovery/googlesearch.py:16 
self.quantity="100"  ===> self.quantity="10" 
 discovery/googlesearch.py:46 
self.counter+=100   ===>  self.counter+=10

I also changed this spot:
discovery/googlesearch.py:27 
     self.results = h.getfile().read()
     h.close()  # adding this line seems to help

When I re-run step 1, it sometimes works and sometimes hangs as before.
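
The two changes in step 3 (a smaller page size with a matching counter step, and closing the connection after each read) could be sketched roughly as below. This is a minimal Python 3 sketch, not metagoofil's actual code: `PAGE_SIZE`, `page_offsets` and `fetch_page` are hypothetical names, and the timeout is an extra assumption that turns an indefinite hang into an exception. It does not explain why the read blocks in the first place.

```python
import http.client

PAGE_SIZE = 10  # hypothetical constant, playing the role of self.quantity


def page_offsets(limit, page_size=PAGE_SIZE):
    """Return the start offset of each results page needed to cover `limit` hits.

    The step must equal the page size, which is why self.quantity and the
    self.counter increment have to be changed together.
    """
    return range(0, limit, page_size)


def fetch_page(host, path, timeout=10.0):
    """Fetch one page as text and always close the connection.

    `timeout` applies to each blocking socket operation, so a stalled
    read raises socket.timeout (an OSError) instead of waiting forever.
    """
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("GET", path, headers={"User-Agent": "Mozilla/5.0"})
        return conn.getresponse().read().decode("utf-8", errors="replace")
    finally:
        conn.close()  # same intent as the added h.close()
```

With a limit of 20 and a page size of 10, `page_offsets(20)` covers offsets 0 and 10. On Python 2.6 the equivalent class is `httplib.HTTPConnection`, which as far as I know also accepts a `timeout` argument.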

What is the expected output? What do you see instead?

It does not work reliably: the search either finds 0 files or hangs, as described above.

What version of the product are you using? On what operating system?
metagoofil 2.2, Windows 7, Python 2.6

Please provide any additional information below.

When the script does run successfully, the list of result file paths contains some bogus entries, e.g.:
[1/20] /webhp?hl=en-HK
         [x] Error downloading /webhp?hl=en-HK
[12/20] /support/websearch/bin/answer.py?answer=134479
        [x] Error downloading /support/websearch/bin/answer.py?answer=134479
....

My solution is:
myparser.py:43 
#reg_urls = re.compile('<a href="(.*?)"')
reg_urls = re.compile('<a href="[^">]*?/url\?q=([^">]*?)&amp;sa=U.*?"')

The results look fine now. I don't know of any other way to fix it, and I'd rather not change it again.
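
To show why the narrower pattern drops entries such as /webhp?hl=en-HK, here is a minimal Python 3 sketch. The sample HTML is made up for illustration and `extract_result_urls` is a hypothetical helper, not a function in myparser.py. The pattern is the one proposed above, only written as a raw string.

```python
import re

# Only links that go through Google's /url?q=... redirect are captured;
# plain navigation links (/webhp, /support/...) no longer match.
REG_URLS = re.compile(r'<a href="[^">]*?/url\?q=([^">]*?)&amp;sa=U.*?"')


def extract_result_urls(html):
    """Return the target URLs found in /url?q=... result links."""
    return REG_URLS.findall(html)


# Made-up markup mixing real results with Google's own navigation links.
SAMPLE = (
    '<a href="/webhp?hl=en-HK">Google</a>'
    '<a href="/url?q=http://www.swu.edu.cn/files/a.doc&amp;sa=U&amp;ei=abc">A</a>'
    '<a href="/support/websearch/bin/answer.py?answer=134479">Help</a>'
    '<a href="/url?q=http://www.swu.edu.cn/files/b.doc&amp;sa=U&amp;ei=def">B</a>'
)

for url in extract_result_urls(SAMPLE):
    print(url)
```

The pattern depends on the exact shape of Google's result markup (`&amp;sa=U` directly after the target URL), so it may need adjusting if that markup changes.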

Original issue reported on code.google.com by c13...@gmail.com on 23 Sep 2013 at 2:45