blacklanternsecurity / bbot

A recursive internet scanner for hackers.
https://www.blacklanternsecurity.com/bbot/
GNU General Public License v3.0

BBOT 2.0 URL Excavation TODOs #1503

Open TheTechromancer opened 5 days ago

TheTechromancer commented 5 days ago

The following are TODOs for our URL excavation:

liquidsec commented 4 days ago

"Tests to make sure we're excavating query parameters"

This already exists; there are a number of tests with the prefix TestExcavateParameterExtraction that cover this.
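
For anyone unfamiliar with what those tests exercise, here is a minimal, hypothetical sketch (not bbot's actual test code) of the kind of assertion a query-parameter excavation test makes, using only the standard library; `extract_query_parameters` is an illustrative helper, not a real bbot function:

```python
# Illustration only, not bbot's test suite: shows what "excavating query
# parameters" from a URL means at its simplest.
from urllib.parse import urlparse, parse_qs


def extract_query_parameters(url: str) -> dict[str, list[str]]:
    # Hypothetical helper: pull parameter names and values out of a URL's query string.
    return parse_qs(urlparse(url).query)


def test_query_parameter_excavation():
    url = "https://example.com/search?q=bbot&page=2"
    params = extract_query_parameters(url)
    assert params == {"q": ["bbot"], "page": ["2"]}
```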

liquidsec commented 4 days ago

As for the first point, we've discussed this some offline, but I'll summarize a few points for consideration:

  1. There is very little overlap; really only one YARA rule crosses over between the two. This is because most parameters are extracted in a way that doesn't touch the actual URL at all, for example in forms, in jQuery calls, etc.
  2. Parameter extraction has a lot more complexity, and it also isn't on by default. This lets us skip that complexity when we aren't doing anything with WEB_PARAMETER.
  3. It is extremely likely we'd actually add overall complexity by trying to merge the functionality: (as-simple-as-possible URL extraction) + (as-simple-as-possible parameter extraction) < very complex combined extraction.
  4. The YARA rules are all compiled, so the overhead of adding one more rule is very small, even if it does something very similar in one or two cases. The compilation step minimizes this overhead (see the sketch after this list).
  5. Clear logical separation. Since URLs go to completely different event types than parameters, and have very different rules, separating their post-processing logic will make everything significantly more maintainable.
  6. Slowed URL processing. URLs are handled more frequently, and adding parameter logic there means every URL extraction is going to take longer.
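
To illustrate point 4, here is a sketch using the yara-python package. These are not bbot's actual excavate rules, just stand-ins: two rules (one for bare URLs, one for URLs carrying a query string) are compiled into a single ruleset, so the second rule adds one compile pass rather than a separate scan.

```python
# Sketch under assumptions: illustrative rules only, not bbot's excavate YARA rules.
import yara

RULE_SOURCE = r"""
rule url_extraction {
    strings:
        $url = /https?:\/\/[^\s"'<>]+/
    condition:
        $url
}

rule url_with_query_parameters {
    strings:
        $url_q = /https?:\/\/[^\s"'<>?]+\?[^\s"'<>]+=[^\s"'<>]*/
    condition:
        $url_q
}
"""

# Compilation happens once; both rules are evaluated in the same scan pass.
rules = yara.compile(source=RULE_SOURCE)

data = '<a href="https://example.com/page?id=5">link</a>'
for match in rules.match(data=data):
    print(match.rule)  # url_extraction, url_with_query_parameters
```

Because the compiled ruleset scans the input once for all rules, the marginal cost of the near-duplicate rule is a slightly larger compiled object, not a second pass over every HTTP response.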