blacklanternsecurity / bbot

A recursive internet scanner for hackers.
https://www.blacklanternsecurity.com/bbot/
GNU General Public License v3.0

Add functionality to extract JS strings as links in a javascript blob #1121

Open Sh4d0wHunt3rX opened 6 months ago

Sh4d0wHunt3rX commented 6 months ago

Couldn't get JS strings extracted as links so that I could grep them.

My command: bbot -t trickest.com -m httpx -c web_spider_distance=2 web_spider_depth=3 web_spider_links_per_page=1000 omit_event_types=[] url_extension_httpx_only=[]

[screenshot]

[screenshot]

🙏

TheTechromancer commented 6 months ago

@liquidsec what do you think about this? We would essentially be implementing js link extractor.
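For context, a JS link extractor of the kind discussed here could work roughly like this (an illustrative sketch only, not bbot's actual excavate code): pull string literals out of a JavaScript blob and keep the ones that look like URLs or paths.

```python
import re

# Illustrative sketch (not bbot's implementation): extract string
# literals from a JS blob and keep URL- or path-like candidates.
STRING_LITERAL = re.compile(r"""["']([^"'\n]+)["']""")
LINK_LIKE = re.compile(r"^(https?://|/)[\w./-]+$")

def extract_js_links(js_blob: str) -> list[str]:
    links = []
    for match in STRING_LITERAL.finditer(js_blob):
        candidate = match.group(1)
        if LINK_LIKE.match(candidate):
            links.append(candidate)
    return links

blob = 'fetch("/_next/static/chunks/webpack-8af07453075e2970.js");var x="hello world";'
print(extract_js_links(blob))  # ['/_next/static/chunks/webpack-8af07453075e2970.js']
```

A real implementation would need to handle template literals, string concatenation, and relative-path resolution against the script's base URL.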

Sh4d0wHunt3rX commented 6 months ago

This is my command:

bbot -t react.dev -m httpx -c web_spider_distance=3 web_spider_depth=3 web_spider_links_per_page=500 omit_event_types=[]

And bbot doesn't detect any of these JS files as links:

[screenshot]

For example, this link does not exist in the output file: https://react.dev/_next/static/chunks/webpack-8af07453075e2970.js

TheTechromancer commented 6 months ago

Added support for extracting URLs from <link> elements: https://github.com/blacklanternsecurity/bbot/pull/1132.

Sh4d0wHunt3rX commented 6 months ago

I'm adding some more examples here for future testing; I believe all of them are related to JS blobs.

openai.com: [screenshot]

shopify.com: [screenshot]

atlassian.com: [screenshot]

whatsapp.com: [screenshot]

ahrefs.com: [screenshot]

clickup.com: [screenshot]

TheTechromancer commented 6 months ago

@amiremami thanks for testing. Did bbot fail to extract these? It always finds full URLs regardless of whether they're embedded in js blobs, so it definitely should have gotten the atlassian one.

Sh4d0wHunt3rX commented 6 months ago

> @amiremami thanks for testing. Did bbot fail to extract these? It always finds full URLs regardless of whether they're embedded in js blobs, so it definitely should have gotten the atlassian one.

bbot -t https://www.atlassian.com/software -m httpx -c web_spider_distance=2 web_spider_depth=2 web_spider_links_per_page=500 omit_event_types=[]

[screenshot]

It appears like this dozens of times in the output file, but never as "url": "https://atl-global.atlassian.com/js/atl-global.min.js"

TheTechromancer commented 6 months ago

> bbot -t https://www.atlassian.com/software -m httpx -c web_spider_distance=2 web_spider_depth=2 web_spider_links_per_page=500 omit_event_types=[]

I think you're forgetting a config option ;)

[screenshot]

(The reason this config option exists is that almost everyone wants to search javascript files for secrets etc., but if a file doesn't contain anything interesting, they usually don't want to see it in the output.)
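The behavior described above can be sketched roughly as follows (illustrative only, not bbot's actual code; the extension list echoes the url_extension_httpx_only option used in the commands earlier in this thread): URLs with certain extensions are still fetched and searched, but are reported as URL_UNVERIFIED rather than full URL events.

```python
# Illustrative sketch of the filtering described above (not bbot's
# internals): JS files are searched for secrets but, by default,
# are not promoted to full URL events in the output.
URL_EXTENSION_HTTPX_ONLY = {"js"}  # setting this to [] disables the filter

def output_event_type(url: str) -> str:
    extension = url.rsplit(".", 1)[-1].lower()
    if extension in URL_EXTENSION_HTTPX_ONLY:
        return "URL_UNVERIFIED"  # visited and searched, hidden from URL output
    return "URL"

print(output_event_type("https://atl-global.atlassian.com/js/atl-global.min.js"))  # URL_UNVERIFIED
print(output_event_type("https://www.atlassian.com/software"))  # URL
```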

Sh4d0wHunt3rX commented 6 months ago

Thanks 🙏 I also used that config, but still the same :(

Sh4d0wHunt3rX commented 6 months ago

> This is my command:
>
> bbot -t react.dev -m httpx -c web_spider_distance=3 web_spider_depth=3 web_spider_links_per_page=500 omit_event_types=[]
>
> And bbot doesn't detect any of these JS files as links:
>
> [screenshot]
>
> For example, this link does not exist in the output file: https://react.dev/_next/static/chunks/webpack-8af07453075e2970.js

For this one, I just upgraded bbot to v1.1.7.2998rc, and this JS file only exists as URL_UNVERIFIED, but shouldn't it exist as URL too?

https://react.dev/_next/static/chunks/webpack-a1ff329830897a9a.js

My command: bbot -t react.dev -m httpx -c web_spider_distance=2 web_spider_depth=2 web_spider_links_per_page=500 omit_event_types=[] url_extension_httpx_only=[]

[screenshot]

TheTechromancer commented 6 months ago

@amiremami that specific file is 4 levels deep. The reason it's not showing up is because the spider is set to a depth of 2 (web_spider_depth=2).

If you enable --debug, it will tell you the reason:

2024-02-27 17:00:10,924 [DEBUG] bbot.modules.internal.excavate base.py:1175 Tagging URL_UNVERIFIED("https://react.dev/_next/static/chunks/webpack-ccf89d5e32b01f59.js", module=excavate, tags={'in-scope', 'extension-js', 'endpoint'}) as spider-danger because its spider depth or distance exceeds the scan's limits
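The tagging shown in that log line can be sketched like this (an illustrative approximation, not bbot's actual excavate code): a discovered URL is marked spider-danger once its path depth or link distance exceeds the configured limits.

```python
# Illustrative sketch of the spider-danger check described above
# (not bbot's actual implementation).
from urllib.parse import urlparse

def spider_danger(url: str, distance: int,
                  web_spider_depth: int, web_spider_distance: int) -> bool:
    # Depth = number of path segments in the URL; distance = how many
    # link hops away from the scan target the URL was discovered.
    depth = len([seg for seg in urlparse(url).path.split("/") if seg])
    return depth > web_spider_depth or distance > web_spider_distance

# The webpack chunk sits 4 path levels deep, so web_spider_depth=2 tags it.
url = "https://react.dev/_next/static/chunks/webpack-a1ff329830897a9a.js"
print(spider_danger(url, distance=1, web_spider_depth=2, web_spider_distance=2))  # True
```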

Sh4d0wHunt3rX commented 6 months ago

> @amiremami thanks for testing. Did bbot fail to extract these? It always finds full URLs regardless of whether they're embedded in js blobs, so it definitely should have gotten the atlassian one.

I still couldn't get the atlassian one, neither as URL nor as URL_UNVERIFIED. If this problem is different from the JS blob issue, please check, thanks a lot 🙏

Got this today: [screenshot]

TheTechromancer commented 6 months ago

@amiremami keep in mind that https://atl-global.atlassian.com/js/atl-global.min.js is on a different subdomain than www.atlassian.com, so it's not in scope. If you want to see it you will need to either:

1) increase your scope report distance to see the URL_UNVERIFIED (-c scope_report_distance=1)
2) whitelist all of atlassian.com to also produce a URL (-w atlassian.com)
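The scoping rule behind these two options can be sketched like so (illustrative only; bbot's real scope handling is more involved): a host is in scope if it matches or is a subdomain of a whitelisted domain, and out-of-scope events are only reported while their distance from scope stays within scope_report_distance.

```python
# Illustrative sketch of the scoping behavior described above
# (not bbot's actual scope code).
def in_scope(host: str, whitelist: set[str]) -> bool:
    # In scope if the host equals, or is a subdomain of, a whitelisted domain.
    return any(host == d or host.endswith("." + d) for d in whitelist)

def visible_in_output(host: str, whitelist: set[str],
                      distance: int, scope_report_distance: int) -> bool:
    # Out-of-scope events still appear (as URL_UNVERIFIED) while their
    # distance from scope is within scope_report_distance.
    return in_scope(host, whitelist) or distance <= scope_report_distance

# With only www.atlassian.com targeted, atl-global.atlassian.com is out of scope:
print(in_scope("atl-global.atlassian.com", {"www.atlassian.com"}))  # False
# Whitelisting all of atlassian.com (-w atlassian.com) brings it in scope:
print(in_scope("atl-global.atlassian.com", {"atlassian.com"}))  # True
# Alternatively, -c scope_report_distance=1 makes it visible as URL_UNVERIFIED:
print(visible_in_output("atl-global.atlassian.com", {"www.atlassian.com"}, 1, 1))  # True
```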

[screenshot]