blacklanternsecurity / bbot

A recursive internet scanner for hackers.
https://www.blacklanternsecurity.com/bbot/
GNU General Public License v3.0

httpx save responses #864

Closed xorond closed 6 months ago

xorond commented 8 months ago

Description

Which feature would you like to see added to BBOT? What are its use cases?

It would be really nice if there were a configuration option for the httpx module to save HTTP responses in the scan folder. httpx supports this with the -sr flag, so this could be a simple true/false config option for BBOT.

TheTechromancer commented 8 months ago

This is already possible by adding the following config option:

bbot -c omit_event_types=[]

By default, BBOT omits HTTP responses and unvisited URLs from the output. These are still distributed to modules, but you can tweak the omit_event_types config option to include them in the output as well.

xorond commented 8 months ago

Thank you for the help. Though I would like to save the HTTP responses into separate files as httpx can do when passing the -sr flag. It seems like this will only show the response in the output, am I correct?

TheTechromancer commented 8 months ago

It will also save it in the TXT, JSON, and CSV files in the scan output folder. We preserve HTTPX's JSON format.
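As a rough illustration of what reading those events back could look like, here is a minimal sketch that filters HTTP responses out of the JSON output. It assumes the file is newline-delimited JSON where each line is an event object with `type` and `data` keys; the function name `iter_http_responses` and the path you pass in are illustrative, not part of BBOT.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_http_responses(ndjson_path: Path) -> Iterator[dict]:
    """Yield the `data` payload of every HTTP_RESPONSE event in an NDJSON file.

    Assumption: one JSON event per line, each with "type" and "data" keys.
    """
    with ndjson_path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            event = json.loads(line)
            if event.get("type") == "HTTP_RESPONSE":
                yield event["data"]
```

Each yielded item should then be the httpx JSON record for one response, which you can process however you like.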

xorond commented 8 months ago

httpx normally makes a separate file for each page it visits. Is this also the case here?

xorond commented 8 months ago

What I imagine is something like the gowitness module, which saves the screenshots on disk, creating a separate file for each page. httpx supports a similar option, which would be useful to include as well.

TheTechromancer commented 8 months ago

The HTTP responses are saved the same way as all other events. Pulling them out individually is easy and you have a few options for this:

1. Parse/grep the JSON/TXT/CSV:

       bbot -ys -t example.com -m httpx -c omit_event_types=[] -om json | jq
       # or
       cat ~/.bbot/scans/demonic_jimmy/output.ndjson | jq

2. Use the Python API (this is the best option if you are already calling BBOT from a Python script):

       from bbot.scanner import Scanner

       scan = Scanner("example.com", modules=["httpx"], config={"omit_event_types": []})
       for event in scan.start():
           if event.type == "HTTP_RESPONSE":
               print(event)  # do stuff here

3. Finally, you could [write a BBOT module](https://www.blacklanternsecurity.com/bbot/contribution/#creating-a-module). This is a good option if you are extracting data from the responses that would be useful to other modules. (emails, URLs, and DNS names are already extracted by `excavate`).
```python
from bbot.modules.base import BaseModule

# put this in `bbot/modules/mymodule.py` wherever you installed bbot
# then you can run with `bbot -m mymodule`

class MyModule(BaseModule):
    watched_events = ["HTTP_RESPONSE"]

    async def handle_event(self, event):
        # do stuff here
        self.hugesuccess(event.data)
```
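Whichever option you pick, writing each response to its own file only takes a few lines. The following is a minimal sketch, not BBOT functionality: `save_response` is a hypothetical helper, and it assumes the response data is a dict with `url` and `body` fields (as in httpx's JSON output). Files are named after a hash of the URL so the names are safe on any filesystem.

```python
import hashlib
from pathlib import Path


def save_response(data: dict, out_dir: Path) -> Path:
    """Write one response body to its own file and return the file path.

    Assumption: `data` carries "url" and "body" keys.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    url = data.get("url") or ""
    # Hash the URL so the filename is filesystem-safe and stable per URL.
    name = hashlib.sha1(url.encode("utf-8")).hexdigest() + ".txt"
    path = out_dir / name
    path.write_text(data.get("body") or "", encoding="utf-8")
    return path
```

In the Python API loop or in a module's `handle_event`, this could be called as `save_response(event.data, Path("responses"))`, where the `responses` directory name is arbitrary.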
xorond commented 8 months ago

Thank you for the detailed response. But wouldn't it be simpler to just save the output from httpx? There's also a flag, -srd, to specify which directory the response output should be saved to. It's a very handy option.

TheTechromancer commented 8 months ago

I see your point. I definitely wouldn't be opposed to it. It would be a pretty easy tweak to the httpx module. Would you care to open a PR?

TheTechromancer commented 6 months ago

@xorond this has been added in https://github.com/blacklanternsecurity/bbot/pull/1015 and should be merged into dev soon.