honoki / bbrf-client

The Bug Bounty Reconnaissance Framework (BBRF) can help you coordinate your reconnaissance workflows across multiple devices
MIT License

[Issue] Inconsistency while using where source #53

Closed pdelteil closed 3 years ago

pdelteil commented 3 years ago

I was testing the use of this syntax to retrieve urls by source:

> bbrf urls -p PROGRAM where source is 'httpx'

It takes some time (more than 1 minute), but it works.

But, if I do the following:

bbrf use PROGRAM

bbrf urls where source is 'httpx' 

It retrieves all urls (from all programs with the source 'httpx')

I'm using v1.1.7 and the latest server update.

honoki commented 3 years ago

Thanks for flagging this. I will look into the inconsistency. For reference, can you let me know how many URLs are in your dataset so I can troubleshoot the slow response time?

pdelteil commented 3 years ago

Hi there @honoki,

Around 350k urls.

honoki commented 3 years ago

I believe the slow response time is mostly due to the time it takes to download the 350k URLs from the BBRF server in plain text. CouchDB does not support compression by default, so fetching huge JSON documents (as is the case for 350k URLs) takes a while. I have some ideas for improving the BBRF server Docker image with an NGINX reverse proxy that enables compression, to see how it fares, but nothing is in the pipeline yet.
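To get a rough feel for how much compression could help, the sketch below gzips a synthetic JSON payload shaped loosely like a CouchDB view response. The row layout and the example.com URLs are assumptions made for illustration, not real BBRF data, so the resulting ratio is only indicative.

```python
import gzip
import json


def build_payload(n_rows: int) -> bytes:
    """Build a synthetic view-style response (hypothetical row layout)."""
    rows = [
        {
            "id": f"https://sub{i}.example.com/path/{i}",
            "key": ["source", "httpx"],
            "value": "PROGRAM",
        }
        for i in range(n_rows)
    ]
    return json.dumps({"total_rows": n_rows, "rows": rows}).encode("utf-8")


def compression_ratio(raw: bytes) -> float:
    """Return compressed size divided by raw size (smaller is better)."""
    return len(gzip.compress(raw)) / len(raw)


if __name__ == "__main__":
    raw = build_payload(10_000)
    print(f"raw bytes: {len(raw)}")
    print(f"gzip ratio: {compression_ratio(raw):.2f}")
```

Real URL sets are less repetitive than this synthetic one, so the actual gain would likely be smaller.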

Can you verify the time it takes to download the view manually:

time curl $(jq -r .couchdb ~/.bbrf/config.json)'/_design/bbrf/_view/search_tags?key=\["source","httpx"\]' -i -u bbrf:password

If this is already really slow, at least I know there's not much use in looking at the client code. 😅
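When scripting this check rather than running curl by hand, the key can be JSON-encoded and percent-encoded instead of escaping the brackets manually. This is a minimal sketch using only the standard library. The helper name `view_url` and the example base URL are hypothetical; the design document and view names are taken from the command above.

```python
import json
from urllib.parse import quote


def view_url(couchdb_base: str, tag: str, value: str) -> str:
    """Build the search_tags view URL for a given tag/value pair."""
    # CouchDB expects the key as JSON; percent-encode it for the query string.
    key = quote(json.dumps([tag, value], separators=(",", ":")), safe="")
    return f"{couchdb_base.rstrip('/')}/_design/bbrf/_view/search_tags?key={key}"


if __name__ == "__main__":
    # Hypothetical base URL; in practice it would come from ~/.bbrf/config.json.
    print(view_url("https://bbrf.example.com/bbrf", "source", "httpx"))
```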

honoki commented 3 years ago

As for the described inconsistency: the fix appears to be a simple change to https://github.com/honoki/bbrf-client/blob/master/bbrf/bbrf.py#L835 which I've got lined up for 1.1.8
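The actual change in bbrf.py is not shown in this thread. The sketch below only illustrates the expected behaviour: a program set with `bbrf use` should scope the query the same way an explicit `-p` does. All function names and the `program` field on each row are hypothetical.

```python
from typing import Iterable, List, Optional


def resolve_program(flag_program: Optional[str], active_program: Optional[str]) -> Optional[str]:
    """Prefer an explicit -p value, else fall back to the program set via `bbrf use`."""
    return flag_program or active_program


def filter_by_program(rows: Iterable[dict], program: Optional[str]) -> List[dict]:
    """Keep only rows belonging to the given program; None means no scoping."""
    if program is None:
        return list(rows)
    return [row for row in rows if row.get("program") == program]


if __name__ == "__main__":
    # Hypothetical rows, as if returned by a `where source is 'httpx'` query.
    rows = [
        {"url": "https://a.example.com", "program": "A"},
        {"url": "https://b.example.com", "program": "B"},
    ]
    # No -p flag given, but program "A" was selected earlier with `bbrf use A`.
    print(filter_by_program(rows, resolve_program(None, "A")))
```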

pdelteil commented 3 years ago

This is the output of the command above (~326K URLs):

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 22.7M    0 22.7M    0     0  3892k      0 --:--:--  0:00:05 --:--:-- 3883k

real    0m6.034s
user    0m0.315s
sys     0m0.260s

honoki commented 3 years ago

Hi @pdelteil - OK, it looks like this may need some work on the client side. For the sake of clarity, can you create a new bug for the slow processing issue? I'm closing this one, as the inconsistency when using source should be resolved.
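A back-of-the-envelope check on the curl numbers above supports this. The sketch treats curl's `M` as MiB and takes the "more than 1 minute" from the original report as a 60-second lower bound; both are assumptions.

```python
def summarize(total_mib: float, rows: int, download_s: float, total_s: float) -> dict:
    """Derive average row size and the download's share of the total run time."""
    total_bytes = total_mib * 1024 * 1024
    return {
        "bytes_per_row": total_bytes / rows,
        "download_share": download_s / total_s,
    }


if __name__ == "__main__":
    # 22.7M transferred, ~326K URLs, 6.034 s wall time; 60 s assumed for the full command.
    stats = summarize(22.7, 326_000, 6.034, 60.0)
    print(f"average bytes per row: {stats['bytes_per_row']:.0f}")
    print(f"download share of total time: {stats['download_share']:.0%}")
```

Under those assumptions the download accounts for roughly a tenth of the run time, which leaves most of it to processing in the client.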

pdelteil commented 3 years ago

Sure, I will.

Thanks a lot!