Nekmo / dirhunt

Find web directories without bruteforce
MIT License
1.73k stars, 237 forks

UnicodeEncodeError 'ascii' codec can't encode characters in position ... #50

Closed. Farbdose closed this issue 5 years ago

Farbdose commented 5 years ago

I just ran dirhunt http://gravitytales.com to see what would happen on a site with many links. It looks like dirhunt ran into some non-ASCII characters:

Traceback (most recent call last):
  File "/usr/local/bin/dirhunt", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/dirhunt/management.py", line 160, in main
    catch(hunt)()
  File "/usr/local/lib/python2.7/dist-packages/dirhunt/exceptions.py", line 33, in wrap
    fn(*args, **kwargs)
  File "/home/user/.local/lib/python2.7/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/user/.local/lib/python2.7/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/user/.local/lib/python2.7/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/user/.local/lib/python2.7/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/dirhunt/management.py", line 151, in hunt
    catch_keyboard_interrupt(crawler.print_results, crawler.restart)(set(exclude_flags), set(include_flags))
  File "/usr/local/lib/python2.7/dist-packages/dirhunt/utils.py", line 37, in wrap
    return fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/dirhunt/crawler.py", line 149, in print_results
    self.echo(result)
  File "/usr/local/lib/python2.7/dist-packages/dirhunt/crawler.py", line 115, in echo
    self.std.write(str(body))
  File "/usr/local/lib/python2.7/dist-packages/dirhunt/processors.py", line 194, in __str__
    body = self.url_line()
  File "/usr/local/lib/python2.7/dist-packages/dirhunt/processors.py", line 94, in url_line
    body += ' {} '.format(self.crawler_url.url.url)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 251-252: ordinal not in range(128)

I don't know exactly which page causes this, because the last URL printed before the stack trace changes from run to run. To reproduce it, just run dirhunt on the base domain and wait until it crashes.

Nekmo commented 5 years ago

This seems to be a bug with Python 2. I'll fix it in the next release. Meanwhile, you can try running Dirhunt using Python 3.
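
For anyone wondering why this only happens on Python 2: in the traceback, `' {} '.format(...)` uses a byte-string template, so Python 2 implicitly encodes the unicode URL with the ASCII codec. The sketch below imitates that step with an explicit `.encode('ascii')` so it also fails on Python 3, then shows one possible way around it (a unicode template). The URL and the `url_line` helper are made up for illustration, and this is not necessarily how the actual fix was implemented.

```python
# -*- coding: utf-8 -*-
# Illustration only: the URL and the helper below are hypothetical.

url = u'http://example.com/caf\u00e9'  # made-up URL with a non-ASCII character

# Python 2 performs this ASCII encode implicitly when a byte-string
# template formats a unicode value; here it is explicit so the failure
# can be reproduced on any Python version.
try:
    url.encode('ascii')
    reproduced = False
except UnicodeEncodeError as error:
    reproduced = True
    print('Reproduced: {}'.format(error))


def url_line(value):
    """Build the output line from a unicode template, so no ASCII encode happens."""
    return u' {} '.format(value)


line = url_line(url)
```

With a unicode template the result stays text, and it only has to be encoded (for example as UTF-8) when it is finally written to the terminal.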

Thank you for your collaboration!

Farbdose commented 5 years ago

No problem, and thanks for the quick response. I can confirm that it works with Python 3 (that's some badass real-time support service you got there^^).

Nekmo commented 5 years ago

Fixed in release v0.5.1. You can update to the new version to get the fix:

pip install -U dirhunt

Thanks! EDIT: I was programming right at that moment :P