SecurityRiskAdvisors / VECTR

VECTR is a tool that facilitates tracking of your red and blue team testing activities to measure detection and prevention capabilities across different attack scenarios

Add a sophistication level to testCases #187


martindube commented 2 years ago

Good day SRA!

I would like to share a feature request for VECTR, based on an idea we've recently started using outside of VECTR.

Context: VECTR is a great tool for prioritization and for explaining our security posture against threats in plain terms. As a manager, I use the results of purple team exercises to help the organization answer "what is our next best move?". For instance, should we buy a new control? Should we engineer detections? Are we good enough in a specific area? It is also great for creating views per defense layer.

Challenge: Because we work in a very technical field, results are hard to explain to a non-technical audience. When Red Teamers aren't seen as magicians, they are seen as paranoid :) I've been looking for a way to show that not all attack procedures require 0-days or developer skills.

Introducing: sophistication levels. It could be interesting to add an optional field to testCases: a number between 1 and 7, based on the STIX 2.1 Threat Actor Sophistication table.
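For reference, here's a minimal sketch of what that field could look like, with levels 1-7 mapped to the names from the STIX 2.1 threat-actor-sophistication-ov vocabulary (Python just for illustration; descriptions paraphrased from the spec):

```python
from enum import IntEnum

class Sophistication(IntEnum):
    """Levels 1-7, named after the STIX 2.1 threat-actor-sophistication-ov values."""
    NONE = 1          # runs prepackaged tools with no configuration or understanding
    MINIMAL = 2       # "script kiddie": uses existing, widely available tools as-is
    INTERMEDIATE = 3  # can configure or modify existing tools for the target
    ADVANCED = 4      # can develop their own tools and scripts
    EXPERT = 5        # can discover new vulnerabilities and write custom exploits
    INNOVATOR = 6     # creates new classes of attacks and exploits
    STRATEGIC = 7     # state-level resources, e.g. supply-chain-scale operations
```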

The real benefit comes when you combine gaps with sophistication levels. Let me give an example.

Example: Say we take T1003.001, "OS Credential Dumping - LSASS Memory". At the moment, there must be 20 attack procedures for this specific technique. In a purple team exercise, we might test all of them and find that 10 are blocked, 3 are detected and 7 are not detected (logged or not). Say we accept the risk of the 10 blocked and 3 detected procedures. Of the 7 not detected, where should we start? All procedures have the same impact, providing credentials to an attacker, so from a risk standpoint only likelihood remains for prioritization. We could start with the ones requiring the lowest level of sophistication to exploit. For example, the ones that require development and research time could be a lower priority than the ones that can be used by simply downloading a tool from GitHub. As of now, there is no easy way to see this nuance in VECTR.
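To make that concrete, here is a rough sketch of the sorting I have in mind (the procedure names, outcomes and levels below are made up for illustration, not real results):

```python
from dataclasses import dataclass

@dataclass
class Procedure:
    name: str
    outcome: str          # outcome, e.g. "Blocked", "Detected", "Not Detected"
    sophistication: int   # 1-7, per the STIX table above

# Hypothetical results for a handful of T1003.001 procedures:
results = [
    Procedure("Task Manager right-click dump of lsass.exe", "Not Detected", 1),
    Procedure("SharpKatz", "Not Detected", 3),
    Procedure("Custom dumper using direct syscalls", "Not Detected", 4),
    Procedure("Mimikatz sekurlsa::logonpasswords", "Blocked", 2),
]

# Impact is identical for all (credentials exposed), so rank the remaining
# gaps by how little skill an attacker needs: lowest sophistication first.
backlog = sorted(
    (p for p in results if p.outcome == "Not Detected"),
    key=lambda p: p.sophistication,
)
for p in backlog:
    print(f"Level {p.sophistication}: {p.name}")
```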

Of course, it makes sense only if we agree that most organizations need to defend against script kiddies before State Actors. Looking forward to counterexamples. :)

Fun fact: Level 1 procedures are usually harder to detect than Level 2-3 procedures. Going back to the example, you can dump LSASS by opening Task Manager and right-clicking lsass.exe with admin rights (not sophisticated). This used to be harder to detect than SharpKatz (a bit more sophisticated). Having such a field can also help convey that Threat Actors are rarely magicians.

Finally, it is worth noting that sophistication changes over time. As tools are developed, a procedure becomes more accessible (e.g. Mimikatz has simplified a complex attack). On the other hand, as defense layers evolve, Threat Actors require more sophisticated arsenals to succeed. I don't know yet how we could manage that in VECTR.

Short term, I know we can use tags, but I thought the idea was worth sharing.

thebleucheese commented 2 years ago

Thanks for the organized feedback! Great thoughts here.

There's a cog in the Red Team Details header on the Test Case panel that takes you to some additional Test Case fields. We have Attack Complexity (Low, Medium, High) in there and used to have a 'Likelihood' field as well. I don't think we can currently report on this data, but we can add that as a feature request. Do you think it needs more granularity than low/medium/high?

[screenshot: additional Test Case fields, including Attack Complexity]

On the detection side, we added more outcomes recently (Logged, Centrally Logged, Telemetry Only). Typically, we recommend using the more granular log distinctions to guide your Purple Team remediation. If something is Centrally Logged, you can more easily do detection engineering on it to write rules and alerts. Alternatively, if something is Telemetry Only, maybe your EDR or some other tool isn't shipping those logs to your log aggregation platform. Those may be the low-hanging fruit vs. tests that are Not Detected, which could require capturing a new log source.
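As a rough sketch of that triage logic (the effort ranking here is an illustration of the reasoning above, not a VECTR feature):

```python
# Illustrative only: rough remediation effort implied by each outcome,
# from "just write a rule" up to "capture a new log source".
EFFORT = {
    "Centrally Logged": 1,  # logs already in the SIEM: write/tune a detection
    "Telemetry Only": 2,    # a tool sees it; ship the telemetry to the SIEM
    "Logged": 3,            # logged locally only; add forwarding first (assumed order)
    "Not Detected": 4,      # may need an entirely new log source
}

gaps = ["Not Detected", "Telemetry Only", "Centrally Logged"]
for outcome in sorted(gaps, key=EFFORT.__getitem__):
    print(f"{outcome} (effort ~{EFFORT[outcome]})")
```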

This isn't the most informative signal for level of effort on detection engineering / remediation if you're in a scenario where you're dealing with a lot of noise in the environment due to a false-positive-prone common detection practice. That's tough to capture in a test case in a general way, however, because it could be environment- or tool-specific. If you have any thoughts on additional information that would help with that, we'd love to hear it. We typically tag level of effort for remediation during or after an exercise to prioritize.

martindube commented 2 years ago

Thanks for the quick feedback!

> Do you think it needs more granularity than low/medium/high?

I think that's a good start. There is also this calculator, which has 8 metrics just for likelihood: https://www.owasp-risk-rating.com/, but it's maybe too much.

Short term, we will use tags and metadata to experiment with new fields that maximize purple team work. I'll let you know of new ideas.
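In case it helps others, the workaround is just a naming convention on tags, something like this sketch (the convention and helper are hypothetical, not a VECTR feature):

```python
# Hypothetical convention while there is no dedicated field: tag each test
# case "sophistication:<1-7>" and parse it back out of the tag list when reporting.
def sophistication_from_tags(tags: list[str]) -> int | None:
    for tag in tags:
        key, sep, value = tag.partition(":")
        if key == "sophistication" and sep and value.isdigit():
            return int(value)
    return None

print(sophistication_from_tags(["t1003.001", "sophistication:2"]))  # -> 2
```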

Btw, the GraphQL API is great!