s0md3v / Photon

Incredibly fast crawler designed for OSINT.
GNU General Public License v3.0

error when scanning IP #75

Closed sethsec closed 6 years ago

sethsec commented 6 years ago

Line 169 errors out if you run Photon against an IP address. The easiest fix might be to just add a try/except, but there is probably a more elegant solution.

I'm pretty sure this was working before.

root@kali:/opt/Photon# python /opt/Photon/photon.py -u http://192.168.0.213:80
      ____  __          __
     / __ \/ /_  ____  / /_____  ____
    / /_/ / __ \/ __ \/ __/ __ \/ __ \
   / ____/ / / / /_/ / /_/ /_/ / / / /
  /_/   /_/ /_/\____/\__/\____/_/ /_/ v1.1.1

Traceback (most recent call last):
  File "/opt/Photon/photon.py", line 169, in <module>
    domain = get_fld(host, fix_protocol=True) # Extracts top level domain out of the host
  File "/usr/local/lib/python2.7/dist-packages/tld/utils.py", line 387, in get_fld
    search_private=search_private
  File "/usr/local/lib/python2.7/dist-packages/tld/utils.py", line 339, in process_url
    raise TldDomainNotFound(domain_name=domain_name)
tld.exceptions.TldDomainNotFound: Domain 192.168.0.213 didn't match any existing TLD name!

s0md3v commented 6 years ago

Thanks for reporting, I have confirmed this bug and a patch will be applied in the next release.

s0md3v commented 6 years ago

Can you please confirm the patch?

sethsec commented 6 years ago

Works again! Thanks!

noraj commented 5 years ago

@s0md3v It doesn't work with the latest version, with either python3 or python2.

$ python3 photon.py --url http://x.x.x.x                       
      ____  __          __
     / __ \/ /_  ____  / /_____  ____
    / /_/ / __ \/ __ \/ __/ __ \/ __ \
   / ____/ / / / /_/ / /_/ /_/ / / / /
  /_/   /_/ /_/\____/\__/\____/_/ /_/ v1.1.4

Traceback (most recent call last):
  File "photon.py", line 187, in <module>
    domain = topLevel(main_url)
  File "photon.py", line 183, in topLevel
    ext = tld.get_tld(host, fix_protocol=True)
  File "/usr/lib/python3.7/site-packages/tld/utils.py", line 434, in get_tld
    search_private=search_private
  File "/usr/lib/python3.7/site-packages/tld/utils.py", line 339, in process_url
    raise TldDomainNotFound(domain_name=domain_name)
tld.exceptions.TldDomainNotFound: Domain x.x.x.x didn't match any existing TLD name!

$ python2 photon.py --url http://x.x.x.x
Traceback (most recent call last):
  File "photon.py", line 9, in <module>
    import tld
ImportError: No module named tld

Same with http://x.x.x.x:80.

My installation works with domains, but the web server I am targeting only has an IP address.

s0md3v commented 5 years ago

@noraj The error says the "tld" module is missing; you have to install it.

noraj commented 5 years ago

@s0md3v Yes, for python2. But look at the other message, for python3. There are 2 commands in my code block.

noraj commented 5 years ago

I saw in https://github.com/s0md3v/Photon/commit/b1c6a82b9f443965c6a335cb84dadc1982d9bee5:

def topLevel(url):
    try:
        toplevel = tld.get_fld(host, fix_protocol=True)
    except tld.exceptions.TldDomainNotFound:
        toplevel = urlparse(main_url).netloc
    return toplevel
domain = topLevel(main_url)

But urlparse is the python2 module; python3 uses urllib.parse, see https://github.com/FriendCode/gittle/issues/49

I think it worked for @sethsec because he was using python2.

Update: I think the line "host = urlparse(main_url).netloc # Extracts host out of the url" can't work with python3.

noraj commented 5 years ago

So I installed python2-tld, and now I get the same error with python2 as with python3.

python3 photon.py --url http://x.x.x.x
      ____  __          __
     / __ \/ /_  ____  / /_____  ____
    / /_/ / __ \/ __ \/ __/ __ \/ __ \
   / ____/ / / / /_/ / /_/ /_/ / / / /
  /_/   /_/ /_/\____/\__/\____/_/ /_/ v1.1.4

Traceback (most recent call last):
  File "photon.py", line 187, in <module>
    domain = topLevel(main_url)
  File "photon.py", line 183, in topLevel
    ext = tld.get_tld(host, fix_protocol=True)
  File "/usr/lib/python3.7/site-packages/tld/utils.py", line 434, in get_tld
    search_private=search_private
  File "/usr/lib/python3.7/site-packages/tld/utils.py", line 339, in process_url
    raise TldDomainNotFound(domain_name=domain_name)
tld.exceptions.TldDomainNotFound: Domain x.x.x.x didn't match any existing TLD name!

python2 photon.py --url http://x.x.x.x
      ____  __          __
     / __ \/ /_  ____  / /_____  ____
    / /_/ / __ \/ __ \/ __/ __ \/ __ \
   / ____/ / / / /_/ / /_/ /_/ / / / /
  /_/   /_/ /_/\____/\__/\____/_/ /_/ v1.1.4

Traceback (most recent call last):
  File "photon.py", line 187, in <module>
    domain = topLevel(main_url)
  File "photon.py", line 183, in topLevel
    ext = tld.get_tld(host, fix_protocol=True)
  File "/usr/lib/python2.7/site-packages/tld/utils.py", line 434, in get_tld
    search_private=search_private
  File "/usr/lib/python2.7/site-packages/tld/utils.py", line 339, in process_url
    raise TldDomainNotFound(domain_name=domain_name)
tld.exceptions.TldDomainNotFound: Domain x.x.x.x didn't match any existing TLD name!

noraj commented 5 years ago

I reproduced the issue with a minimal script:

import tld
import urllib

try:
    import concurrent.futures
    from urllib.parse import urlparse # for python3
    python2, python3 = False, True
except ImportError:
    from urlparse import urlparse # for python2
    python2, python3 = True, False

main_url = 'http://x.x.x.x'

host = urlparse(main_url).netloc # Extracts host out of the url

def topLevel(url):
    ext = tld.get_tld(host, fix_protocol=True)
    toplevel = '.'.join(urlparse(main_url).netloc.split('.')[-2:]).split(ext)[0] + ext
    return toplevel

domain = topLevel(main_url)

PS: Sorry, now I see where urlparse comes from.

noraj commented 5 years ago

It must have been a change in the tld module:

$ python
Python 3.7.0 (default, Sep 15 2018, 19:13:07) 
[GCC 8.2.1 20180831] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from urllib.parse import urlparse
>>> import tld
>>> main_url = 'http://x.x.x.x'
>>> host = urlparse(main_url).netloc
>>> host
'x.x.x.x'
>>> ext = tld.get_tld(host, fix_protocol=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/site-packages/tld/utils.py", line 434, in get_tld
    search_private=search_private
  File "/usr/lib/python3.7/site-packages/tld/utils.py", line 339, in process_url
    raise TldDomainNotFound(domain_name=domain_name)
tld.exceptions.TldDomainNotFound: Domain x.x.x.x didn't match any existing TLD name!

noraj commented 5 years ago

@s0md3v Can you re-open until it is fixed?

s0md3v commented 5 years ago

Just give me 3 minutes and 43 seconds.

s0md3v commented 5 years ago

Can you check if this patch is working?

noraj commented 5 years ago

@s0md3v This works, but instead of a try/except, wouldn't you rather use an if/else that applies an IP address regex to host?

https://www.regular-expressions.info/ip.html

There are other ways too, though I don't know if they are better: https://stackoverflow.com/questions/319279/how-to-validate-ip-address-in-python
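
For illustration, the if/else idea could be sketched roughly as below. This is not code from Photon: the helper name is_ip_host is hypothetical, it swaps the suggested regex for the standard-library ipaddress module, and it assumes Python 3 (ipaddress is not in the Python 2 standard library).

```python
import ipaddress
from urllib.parse import urlparse


def is_ip_host(url):
    """Return True if the host part of `url` is a literal IPv4/IPv6 address.

    Hypothetical helper, not part of Photon.
    """
    # .hostname drops the port and the brackets around IPv6 literals
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        # Not parseable as an IP address, so treat it as a domain name
        return False
```

A caller could then skip the tld lookup when is_ip_host(main_url) is true and only call tld for real domain names.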

s0md3v commented 5 years ago

Because the tld library matches the input against a list of hardcoded top level domains to extract the host. So if a top level domain is not present in the list, or anything else bad happens, we will use urlparse, which uses regular expressions that comply with the RFC to extract the host and other URL components. This combination won't fail in any given case as long as the input is a valid URL. Give me a good reason to use the approach you suggested and I will definitely implement it.
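
A minimal sketch of the fallback described above, written for Python 3. It is an approximation, not the actual patch: the name top_level is hypothetical, only TldDomainNotFound is caught (the exception seen in the tracebacks in this thread), and the tld import is guarded purely so the sketch also runs where tld is not installed.

```python
from urllib.parse import urlparse

try:
    import tld  # third-party package
    from tld.exceptions import TldDomainNotFound
except ImportError:
    tld = None  # without tld, this sketch always takes the urlparse fallback


def top_level(url):
    """Return the first-level domain of `url`, or its netloc if tld finds no TLD.

    Hypothetical helper illustrating the try/except fallback.
    """
    if tld is not None:
        try:
            return tld.get_fld(url, fix_protocol=True)
        except TldDomainNotFound:
            pass  # e.g. a bare IP address matches no known TLD
    # Fallback: take the network location straight from the parsed URL
    return urlparse(url).netloc
```

Note that netloc keeps the port, so an input such as http://192.168.0.213:80 comes back as 192.168.0.213:80; urlparse(url).hostname would give the host alone.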

noraj commented 5 years ago

I have no good reason; it was just a suggestion.

Pete08666 commented 4 years ago

Run pip3 install tld (or pip3.8 install tld).

wasimroxx118 commented 3 years ago

Try running it with python3 Photon.py. That worked for me.