Nekmo / dirhunt

Find web directories without bruteforce
MIT License
1.76k stars · 236 forks

External sources are not used #70

Open ogma-sec opened 5 years ago

ogma-sec commented 5 years ago

Hello,

It seems that there is an issue with external sources. Using a proxy, I cannot see any of these sources being used (no Google or VirusTotal queries, and no request to robots.txt). I have run several tests with Python 2 and Python 3, on both the master and dev versions, and I cannot get dirhunt to use the external sources.

```
root@kali:/var/www/html# ls -al
total 24
drwxr-xr-x 4 root     root     4096 Jun  5 20:38 .
drwxr-xr-x 3 root     root     4096 Jul 28  2018 ..
drwxr-xr-x 2 www-data www-data 4096 Jun  5 18:02 images
-rw-r--r-- 1 www-data www-data   59 Jun  5 18:04 index.html
-rw-r--r-- 1 www-data www-data   19 Jun  5 20:36 robots.txt
drwxr-xr-x 2 www-data www-data 4096 Jun  5 20:19 secret

root@kali:/var/www/html# cat robots.txt
Disallow: /secret/

root@kali:/var/www/html# python /opt/dirhunt/scripts/dirhunt http://127.0.0.1/robots.txt
Welcome to Dirhunt v0.6.0 using Python 2.7.15
[200] http://127.0.0.1/ (Blank page)
Index file found: index.html
[200] http://127.0.0.1/images/ (Generic)
Finished after 2 seconds
No interesting files detected ¯\_(ツ)_/¯

root@kali:/var/www/html# python3 /opt/dirhunt/scripts/dirhunt http://127.0.0.1/robots.txt
Welcome to Dirhunt v0.6.0 using Python 3.7.3
[200] http://127.0.0.1/ (Blank page)
Index file found: index.html
[200] http://127.0.0.1/images/ (Generic)
Finished after 2 seconds
No interesting files detected ¯\_(ツ)_/¯
```

Nekmo commented 5 years ago

Hi, maybe Dirhunt is not waiting for the external-source processes to finish. With a real (and slower) website this problem should normally not happen; maybe that's why I did not detect it in my tests. Thank you.

I will fix it in the next version. :+1:
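The hypothesized race can be sketched as follows (a minimal illustration with hypothetical names, not dirhunt's actual code): if the crawl finishes quickly and the program exits without joining the external-source workers, their results never arrive.

```python
import threading
import time

results = []

def query_external_source(name, delay):
    # Stand-in for a slow external lookup (robots.txt, Google, VirusTotal).
    time.sleep(delay)
    results.append(name)

threads = [
    threading.Thread(target=query_external_source, args=(name, 0.1))
    for name in ("robots.txt", "google", "virustotal")
]
for t in threads:
    t.start()

# Without these joins, a fast local crawl can print its summary and
# exit before any source has reported back -- the bug hypothesized above.
for t in threads:
    t.join()

print(sorted(results))  # ['google', 'robots.txt', 'virustotal']
```

On a slow remote site the crawl itself outlasts the source lookups, which would explain why the problem only shows up against a fast local server.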

ogma-sec commented 5 years ago

Hello, thank you for your response. I have run several other tests on a real website (my blog) that has a robots.txt, using a Burp Suite proxy to see the requests made by the tool, and I don't see any robots.txt request (nor any access to a search engine or VirusTotal).

dirhunt https://ogma-sec.fr --proxies http://127.0.0.1:8080

At the very least I would expect to see a request to https://ogma-sec.fr/robots.txt made by dirhunt to check its content, but I don't see it. So I'm not sure this is about the execution time of a local website: crawling my blog takes dirhunt at least 40 seconds.
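For reference, the kind of check a robots.txt source performs can be sketched with the standard library's parser (a hedged illustration, not dirhunt's implementation): the file's Disallow entries are exactly the paths a directory hunter wants to record.

```python
from urllib.robotparser import RobotFileParser

# Parse a robots.txt body like the one used in the local test above
# (illustrative content; a real file may differ).
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /secret/",
])

# A polite crawler skips /secret/; a directory hunter does the
# opposite and flags it as an interesting path.
print(rp.can_fetch("*", "https://example.com/secret/"))   # False
print(rp.can_fetch("*", "https://example.com/images/"))   # True
```

Whatever dirhunt does internally, the fetch of /robots.txt itself should be visible in the proxy log, which is what the report above is checking for.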

By the way thank you and congrats for dirhunt, it saves me a lot of time :)

Regards,

Nekmo commented 5 years ago

[Screenshot: Screenshot_20190613_221336]

I am doing tests with your blog and I see that there is an issue with the robots.txt parser.

```
$ curl http://ogma-sec.fr/robots.txt
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a href="https://ogma-sec.fr/robots.txt">here</a>.</p>
</body></html>
```

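curl without -L stops at that 301, so the HTML above is what a client sees if it does not follow the HTTP-to-HTTPS redirect; a robots.txt fetcher has to follow it to reach the real file. A minimal sketch with a hypothetical local server (not dirhunt's code) showing that Python's urllib follows such a redirect transparently:

```python
import http.server
import threading
import urllib.request

class RedirectingHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/robots.txt":
            # Mimic the blog: the plain-HTTP URL answers 301.
            self.send_response(301)
            self.send_header("Location", "/real-robots.txt")
            self.end_headers()
        else:
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"Disallow: /secret/\n")

    def log_message(self, *args):
        pass  # keep the test output quiet

server = http.server.HTTPServer(("127.0.0.1", 0), RedirectingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# urllib follows the 301 automatically, so the caller sees the real
# robots.txt body instead of the "Moved Permanently" HTML page.
body = urllib.request.urlopen(f"http://127.0.0.1:{port}/robots.txt").read()
print(body.decode())  # Disallow: /secret/
server.shutdown()
```

If the parser instead receives the 301 HTML page as if it were the robots.txt body, it finds no Disallow rules at all, which would match the behaviour reported above.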
Nekmo commented 5 years ago

Ok, I'm still doing tests and it seems that it is detecting the paths.

[Screenshot: Screenshot_20190613_222606]