jokki closed 4 years ago
Thanks for creating a pull request. Do you have some examples of responses for which you're getting decoding errors? Also, how are you getting a binary file?
It could be for example PDFs or images. I think I've seen font data, like "woff2" (also binary) somewhere too...
Yeah, that is exactly what I am asking about. How are you getting that data? I am downloading neither PDFs nor fonts, only files referenced inside script tags.
Ah, I see. I've come to pass all my response data from ZAP/Burp to 'SubDomainizer.py -f
Ah, got it. I still have doubts about the decoding change, but I am merging your pull request.
Thanks for that.
Background for PR: Suppose you're testing a site that has a CAPTCHA that to some extent blocks tools like SubDomainizer from getting any useful responses. My workaround for this situation so far has been to manually browse the site with an intercepting proxy like ZAP or BurpSuite and manually click myself past the CAPTCHAs. Once I'm happy with my exploration of the site I'll dump all the response data to file and run
python3 SubDomainizer.py -f <dir with files>
on it. That initially failed for me because the response data has binary content that breaks "file.read()" ('t' flag/text). I added errors='ignore' to it to resolve that problem. If you've got a better way or any thoughts or ideas about this, I'd love to hear them. Thanks for making this tool!
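To illustrate the failure described above, here is a minimal sketch (not SubDomainizer's actual code). It assumes the file is opened in text mode as UTF-8; the payload, file name, and hostnames are made up for the example.

```python
import os
import tempfile

# Hypothetical stand-in for a dumped proxy response:
# JavaScript text with a few raw binary bytes in the middle.
payload = (
    b"var api = 'api.example.com';\n"
    + b"\xff\xfe\x00\x89PNG"
    + b"\nvar cdn = 'cdn.example.com';\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "response.txt")
    with open(path, "wb") as fh:
        fh.write(payload)

    # Default (strict) decoding raises on the first invalid byte.
    try:
        with open(path, "rt", encoding="utf-8") as fh:
            fh.read()
        strict_failed = False
    except UnicodeDecodeError:
        strict_failed = True

    # errors='ignore' drops the undecodable bytes and keeps the rest.
    with open(path, "rt", encoding="utf-8", errors="ignore") as fh:
        text = fh.read()

print("strict read failed:", strict_failed)
print("api host kept:", "api.example.com" in text)
print("cdn host kept:", "cdn.example.com" in text)
```

Note that errors='ignore' discards the offending bytes silently. If it matters where data was lost, errors='replace' substitutes U+FFFD for each undecodable byte instead.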
Commit comment: This change only applies when using the "-f" argument to parse files in a directory. If your file(s) contain binary data, reading them fails with decoding errors. Ignoring errors lets you move past the binary data and continue processing the remainder of the file(s).
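As a rough sketch of what reading a directory this way could look like: the helper name read_files_in_dir, the UTF-8 encoding, and the sample files are assumptions for illustration, not taken from SubDomainizer's source.

```python
import os
import tempfile


def read_files_in_dir(directory):
    """Yield (path, text) for every file under directory.

    Hypothetical helper: undecodable bytes are dropped rather than
    raising, so one binary blob does not stop the whole run.
    """
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            path = os.path.join(root, name)
            with open(path, "rt", encoding="utf-8", errors="ignore") as fh:
                yield path, fh.read()


# Small demonstration with one clean file and one containing binary bytes.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "a.js"), "wb") as fh:
        fh.write(b"fetch('https://api.example.com/v1');\n")
    with open(os.path.join(tmp, "b.bin"), "wb") as fh:
        fh.write(b"\x80\x81\x82 static.example.com \xfe\xff")

    contents = {os.path.basename(p): t for p, t in read_files_in_dir(tmp)}

for name, text in contents.items():
    print(name, repr(text))
```

Both files are read without an exception, and the readable text around the binary bytes in b.bin is still available for later parsing.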