projectdiscovery / subfinder

Fast passive subdomain enumeration tool.
https://projectdiscovery.io
MIT License

WaybackArchive: malformed HTTP response #128

Closed whoot closed 6 years ago

whoot commented 6 years ago

Hey there,

When I run subfinder with the waybackarchive source, I sometimes get the following error:

net/http: HTTP/1.x transport connection broken: malformed HTTP response ""

How can we reproduce the issue?

subfinder --sources waybackarchive -d test.de -v

===============================================
-=Subfinder v1.1.3 github.com/subfinder/subfinder
===============================================

Running Source: WaybackArchive
Running enumeration on test.de
39
39
[WAYBACKARCHIVE] www.test.de
[WAYBACKARCHIVE] www2.test.de
[WAYBACKARCHIVE] files.test.de
[WAYBACKARCHIVE] images1.test.de
[WAYBACKARCHIVE] newsletter.test.de
[WAYBACKARCHIVE] www.nl.test.de
[WAYBACKARCHIVE] nl.test.de
[WAYBACKARCHIVE] preview.test.de
[WAYBACKARCHIVE] testwww.test.de
[WAYBACKARCHIVE] testwww2.test.de
[WAYBACKARCHIVE] testwww3.test.de
[WAYBACKARCHIVE] weiterbildungsguide.test.de
waybackarchive: Get http://web.archive.org/cdx/search/cdx?url=*.test.de/*&output=json&fl=original&collapse=urlkey&page=
: net/http: HTTP/1.x transport connection broken: malformed HTTP response "<html>"

Total 12 Unique subdomains found for test.de

[....]

Ice3man543 commented 6 years ago

I'll check out the problem soon.

Ice3man543 commented 6 years ago

Thanks for bringing this to attention.

codingo commented 6 years ago

I've been unable to reproduce this issue. I'm unsure whether it was resolved by a prior request.

Closing issue as this is also unlikely to be present in the /research code base that will soon be a part of master.

whoot commented 6 years ago

I'm still having this issue from time to time. It seems that the response is simply too large when the waybackarchive URL is requested without a value for "page=".

Open in browser:

http://web.archive.org/cdx/search/cdx?url=*.test.de/*&output=json&fl=original&collapse=urlkey&page=

-> This takes a long time to load. Maybe subfinder hits a timeout here? That would at least explain the HTML in the error.

By contrast, the same URL with a page value loads without any problem:

http://web.archive.org/cdx/search/cdx?url=*.test.de/*&output=json&fl=original&collapse=urlkey&page=1
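To illustrate the idea, here is a minimal Go sketch of requesting the same URL with an explicit page value and a client-side timeout. It is not subfinder's actual code: the names buildCDXURL and fetchPage and the 30-second timeout are assumptions made up for this example, and the page value 1 is simply the one that worked in the browser above.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

// buildCDXURL is a hypothetical helper. It reproduces the URL from this
// issue, but always fills in an explicit value for "page=".
func buildCDXURL(domain string, page int) string {
	return fmt.Sprintf(
		"http://web.archive.org/cdx/search/cdx?url=*.%s/*&output=json&fl=original&collapse=urlkey&page=%d",
		domain, page,
	)
}

// fetchPage is a hypothetical helper. It requests one page with a
// client-side timeout and rejects any non-200 response, so a slow or
// failing reply surfaces as a clear error.
func fetchPage(domain string, page int) (string, error) {
	// The 30-second timeout is an arbitrary value chosen for this sketch.
	client := &http.Client{Timeout: 30 * time.Second}

	resp, err := client.Get(buildCDXURL(domain, page))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("waybackarchive: unexpected status %s", resp.Status)
	}

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	// Only print the URL here; call fetchPage("test.de", 1) to actually
	// perform the request.
	fmt.Println(buildCDXURL("test.de", 1))
}
```

Whether this avoids the malformed-response error is untested; it only shows how the request could carry a page value instead of an empty "page=".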