[BUG] Erroneous output to terminal when using --extract-links

Greenwolf commented 3 years ago

Is your feature request related to a problem? Please describe.

When using --extract-links, it would be nice to have an option that only grabbed links from the original domain. I'm also not sure whether it starts dir busting on other domains that are extracted. The output is unclear.

Describe the solution you'd like

A flag to limit the scope of the tool would be great. Additional clarity in the README on whether it starts busting new domains when using the --extract-links option would also be great.

P.S. - Absolutely loving the tool! I think you've got a real edge on gobuster & ffuf with this one 👍. I've been sharing it with all my colleagues! You've done some really great work on this!

epi052 commented 3 years ago

Hi @Greenwolf,

Thanks for the request and the kind words! I'm really glad you're enjoying it and getting some use out of it.

When using --extract-links, it would be nice to have an option which only grabbed links from the original domain

The current logic is as follows when --extract-links is used:

parse response body
find absolute and relative links
if absolute
- does domain/ip match original target's domain/ip?
  - yes: make request or bust dir, as appropriate
  - no: skip
if relative
- append the relative path to the current target and make request/bust
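
The logic above can be sketched roughly as follows. This is a minimal illustration, not feroxbuster's actual code: the `Link` and `Action` types and the `decide` function are hypothetical names made up for this sketch, and it assumes links have already been parsed out of the response body and classified as absolute or relative.

```rust
/// A link pulled out of a response body (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Link {
    /// Absolute link: the full url plus its already-extracted host (domain or ip).
    Absolute { url: String, host: String },
    /// Relative link, e.g. "img/X.png".
    Relative(String),
}

/// What the scanner should do with an extracted link (hypothetical type).
#[derive(Debug, PartialEq)]
enum Action {
    /// Make a request to (or start busting) this url.
    Request(String),
    /// Off-target link: do nothing.
    Skip,
}

/// Decide what to do with `link`, given the original target's host and the
/// url that is currently being scanned.
fn decide(target_host: &str, current_url: &str, link: &Link) -> Action {
    match link {
        // Absolute links are only followed when the host matches the target.
        Link::Absolute { url, host } => {
            if host.eq_ignore_ascii_case(target_host) {
                Action::Request(url.clone())
            } else {
                Action::Skip
            }
        }
        // Relative links are appended to the current target.
        Link::Relative(path) => Action::Request(format!(
            "{}/{}",
            current_url.trim_end_matches('/'),
            path.trim_start_matches('/')
        )),
    }
}
```

With a target host of original.domainA.org, an absolute link to http://sub.domainB.org/info would come back as `Action::Skip`, while a relative link such as img/X.png would be appended to the url currently being scanned.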

I'd love to know if you're seeing requests off the primary target domain, as that's definitely not intended. Can you let me know what you've observed and whether or not the description above meets the intent of this feature request?

Greenwolf commented 3 years ago

Hi @epi052, I ran it on domain A, and it seemed to start making requests on domain B. Am I misreading the output?

I've checked the proxy logs and it doesn't actually seem to be making the requests, but it's cluttering the console output with all the out-of-scope items. Is that intentional?

200      27133 https://original.domainA.org/img/X.png
200      14950 https://original.domainA.org/img/Y.png
200       4510 https://original.domainA.org/img/Z.png
ERR    716.988 Error while making request: error sending request for url (http://sub.domainB.org/300x700_X.html/X.php): error trying to connect: dns error: failed to lookup address information: nodename nor servname provided, or not known
[#######>------------] - 11m   148814/373534  207/s   https://original.domainA.org
[>-------------------] - 9m      1932/373534  3/s     http://domainC.com/
[>-------------------] - 3m      3954/373534  21/s    http://sub.domainB.org/
[>-------------------] - 3m      4014/373534  21/s    http://sub.domainB.org/2055.php
[>-------------------] - 3m      3889/373534  20/s    http://sub.domainB.org/IM
[>-------------------] - 3m      3994/373534  21/s    http://sub.domainB.org/info
[>-------------------] - 3m      3966/373534  21/s    http://sub.domainB.org/fixed

epi052 commented 3 years ago

Just to make sure I understand correctly:

When run with --proxy, no requests are actually made to any off-target domain; however, the console output shows that directories on other domains are being busted.

Do you ever see any of the off-target domain lines in the 'upper' output area, i.e. not just the progress bar? I'm guessing if they're not in the proxy logs, they're not in that output either.

Greenwolf commented 3 years ago

Yes, that is correct. But I actually got thousands of lines of off-target domain output listed in the console. The command I used was this:

./feroxbuster -u https://original.domainA.org/ --extract-links --depth 2 --wordlist ./content-discovery/content_discovery_all.txt

epi052 commented 3 years ago

Good deal. Definitely sounds like it needs some attention. I'm wrapping up 1.5.0 now and should be able to check this out over the weekend.

You've already narrowed down the possible location of the problem significantly, thank you!

I'm switching this to a bug for now.

epi052 commented 3 years ago

@Greenwolf good morning!

I'm trying to replicate what you're seeing. If you're able, could you confirm that some of the domains you saw requested are included below?

epi052 commented 3 years ago

probably some more

http:assistenza.oliviero.it/ajax  
http:dreambox.de/board            
http:fixelcloud.com               
http:jxshop.ir/json               
http:krasivaya662.jimdo.com/http:krasivaya662.jimdo.com/http:krasivaya662.jimdo.com              
http:localhost                    
http:pad.appbako.com/jikanawari   
http:pad.appbako.com/kaiseki      
http:pad.appbako.com/zatsudan     
http:pegueraeu.tumblr.com         
http:puradsifm.net:9994           
http:stm20.srvstm.com:23110       
http:studiokeya.com               
http:techblog.dahmus.org          
http:thg.ne.jp                    
http:www.domprazdnika.ru          
http:www.grozingerlaw.com         
http:0matome.com
http:0matome.com
http:1000mg.jp
http:1000mg.sblo.jp
http:16bit.blog.jp
http:18mn.blog89.fc2.com
http:2ch.anything-navi.net
http:2ch.logpo.jp
http:2ch-mi.net
http:2ch-mma.com
http:2ch-mma.com
http:2d.news-edge.com
http:acopy.blog55.fc2.com
http:ad-feed.com
http:afo-news.com
http:afo-news.com
http:afo-news.com
http:akb48mato.com
http:akb48m.com
http:aki680.dtiblog.com
http:akunaki2.blog.fc2.com
http:ameblo.jp
http:animalch.net
http:antch.net
http:antenasu.net
http:antennabank.com
http:antennabank.com
http:antenna-ga.com
http:antenow.com
http:aqua2ch.net
http:aresoku.blog42.fc2.com
http:asugaru.blog77.fc2.com
http:avzyoyuumatome.jp
http:axia-hakusan.com
http:besttrendnews.net
http:besttrendnews.net
http:blog-livedorr.com
http:bokuteki.com
http:buhidoh.net
http:carp.nanj-antenna.net
http:chaos2ch.com
http:daimajin.net
http:digi-6.com
http:dividendlife.net
http:dng65.com
http:doujinch.com
http:doumori-app.com
http:douzingame.com
http:dq-antena.com
http:dqmsl-antenna.com
http:dqmsl-dq.antenna-chan.info
http:dqmsl.site
http:ebitsu.net
http:edde.blog75.fc2.com
http:egone.org
http:equal-love.club
http:eroch8.com
http:erodaioh.blog8.fc2.com
http:erohop.dtiblog.com
http:ero-kawa.com
http:ero-kawa.com
http:ero-kawa.com
http:eromanga-kingdom.com
http:eromon.info
http:ero-nuki.net
http:erosnoteiri.com
http:erotube.org
http:esite100.com
http:fc23.blog63.fc2.com
http:fesoku.net
http:gallife.blog89.fc2.com
http:gameblogrank.com
http:gehasoku.com
http:geitsubo.com
http:gookc.blog.fc2.com
http:gorirarara.dtiblog.com
http:hamusoku.com
http:hana.kachoufugetsu.info
http:headline.mtfj.net
http:high-oku.com
http:hilite000.blog.fc2.com
http:hima-game.com
http:h-nijisoku.net
http:hoshi-dq.co
http:ichliebefussball.net
http:idol-blog.com
http:iphonech.info
http:iryujon.blog.fc2.com
http:ituki88.com
http:jin115.com
http:jplol.blog.fc2.com
http:jyouhouya3.net
http:kachimuka-matome.com
http:kaigai-antena.com
http:kankore.44ant.biz
http:kaoru-office.biz
http:karapaia.com
http:katuru.com
http:kaze.kachoufugetsu.info
http:kb24lal.blog9.fc2.com
http:keiba.blog.jp
http:ken-ch.vqpv.biz
http:kijosoku.com
http:kikonboti.com
http:kisslog2.com
http:kizitora.jp
http:kojimedia.me
http:konowaro.net
http:konowaro.net
http:konowaro.net
http:konowaro.net
http:ks4402.blog94.fc2.com
http:kyousoku.net
http:marumie55.com
http:matomenomori.net
http:matometatta-news.net
http:minkch.com
http:minnanonx.com
http:mix2ch.blog.fc2.com
http:moeimg.net
http:moerank.com
http:moero25.blog.fc2.com
http:momo96ch.com
http:mushitori.blog.fc2.com
http:nanjdragons.com
http:nbama.blog.fc2.com
http:nekomemo.com
http:nekowan.com
http:netatama.net
http:news109.com
http:news-choice.net
http:news-choice.net
http:news-choice.net
http:news-choice.net
http:news-choice.net
http:newser.cc
http:newsnow-2ch.com
http:newsnow-2ch.com
http:newsnow-2ch.com
http:newsnow-2ch.com
http:newsoku.jp
http:news-three-stars.net
http:nextneo.blog.fc2.com
http:niconico.boy.jp
http:nikkanerog.com
http:ninshinda.com
http:nmb48matome.jp
http:nocky.blog.fc2.com
http:occugaku.com
http:onesoku.com
http:ooiotakara.com
http:pakan.blog91.fc2.com
http:panpilog.com
http:pazudora-ken.com
http:picosoft.blog.fc2.com
http:pinkomen.blog.fc2.com
http:pretty77.blog9.fc2.com
http:ps3dominater.com
http:railgun-antenna-x.info
http:ranks1.apserver.net
http:rd.app-heaven.net
http:saionji.net
http:sbrmsg.blog.fc2.com
http:sexy4you.dtiblog.com
http:sexytvcap.com
http:shock-tv.com
http:shuuya.blog114.fc2.com
http:sketan.com
http:sociatenna.com
http:sousharu.blog.fc2.com
http:sow.blog.jp
http:taiken.blog24.fc2.com
http:timtmb.com
http:titimark.blog2.fc2.com
http:tokka1147.com
http:tossoku.net
http:toyop.net
http:tuma.dtiblog.com
http:turbo-bee.com
http:uhouho2ch.com
http:uhouho2ch.com
http:uhouho2ch.com
http:usepocket.com
http:vippers.jp
http:wapuwapu.com
http:waranew.net
http:webnew.net
http:webnew.net
http:webnew.net
http:webnew.net
http:webnew.net
http:worldfn.net
http:wtube.blog89.fc2.com
http:www.antena-2ch.net
http:www.appbank.net
http:www.boku-vipper.com
http:www.dousyoko.net
http:www.dql0.com
http:www.elog-ch.net
http:www.erokiwami.com
http:www.eropad.com
http:www.gurum.biz
http:www.hiroiro.com
http:www.mangajunky.net
http:www.matomech.com
http:www.nukistream.com
http:www.pinkape.net
http:www.vsnp.net
http:xn--gdk4cy65r.xyz
http:xxeronetxx.info
http:yonimo.net
http:yunyunyun.net

epi052 commented 3 years ago

Here's the update on this one. The wordlist you used from jhaddix contains entries like the ones I showed above. Normally, a word from the wordlist is joined to the target url using reqwest::Url::join. When that function is called with a fully formed url as the 'word', the word actually overwrites the base url.

Example:

Url("http://localhost").join("http:yunyunyun.net")
=> Url("http:yunyunyun.net")

So, the urls from the wordlist were the reason those requests were being shown. I tested with and without --extract-links and got the same result both times.

I added logic that issues a warning if a url is found in the wordlist and stops processing that word before anything actually happens.
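
A rough sketch of that kind of check, for illustration only: it is not the code that went into feroxbuster. The function names `looks_like_url` and `filter_wordlist` are hypothetical, and the "starts with http: or https:" test is an assumed heuristic chosen to match the entries listed earlier in this thread.

```rust
/// Returns true when a wordlist entry looks like a url rather than a path
/// segment. Assumed heuristic for this sketch: the entry starts with
/// "http:" or "https:" (case-insensitive).
fn looks_like_url(word: &str) -> bool {
    let lower = word.trim().to_ascii_lowercase();
    lower.starts_with("http:") || lower.starts_with("https:")
}

/// Split a wordlist into (kept, skipped) entries. A warning is printed to
/// stderr for every url-like entry, and that entry is not processed further.
fn filter_wordlist<'a>(words: &[&'a str]) -> (Vec<&'a str>, Vec<&'a str>) {
    let mut kept = Vec::new();
    let mut skipped = Vec::new();
    for &word in words {
        if looks_like_url(word) {
            eprintln!("warning: wordlist entry looks like a url, skipping: {}", word);
            skipped.push(word);
        } else {
            kept.push(word);
        }
    }
    (kept, skipped)
}
```

Filtering the words before they are ever joined to the target means a url-like entry never reaches the join step, so it cannot replace the base url or show up as a separate scan in the progress output.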

Greenwolf commented 3 years ago

Sounds great, thank you @epi052. Sorry for the late reply, but yes I was seeing: 'http:techblog.dahmus.org'. Thank you for looking at this and for making a great project even better! 😊

epi052 / feroxbuster

[BUG] Erroneous output to terminal when using --extract-links #114