Display normal-operation threats in a different way

scp93ch commented 1 year ago

We should make some minor changes to the GUI to present the recently added normal-operation threats in a more understandable way:

[x] New icon to be used in threat lists for normal operation threats (in addition to the icons indicating primary or secondary threats).
[ ] Normal operation threats should not be highlighted in red if they are not blocked/mitigated (since normally they should not be).
[ ] Changing titles in some lists to ensure terminological consistency (exact specification to be determined).

scp93ch commented 1 year ago

This was previously issue 1174 in the internal tracker.

scp93ch commented 12 months ago

I think we would also want to be able to filter out the normal op threats from threat lists (e.g. the attack path list).

kenmeacham commented 11 months ago

I have a query about normal operation threats:

> New icon to be used in threat lists for normal operation threats (in addition to the icons indicating primary or secondary threats).

After checking several system models, from what I can tell, normal operation threats are always primary, i.e. I can't yet find any examples of secondary threats that are normal operations. Is this true? If so, we could simplify the display, as a normal operation threat would not also need to distinguish between primary (1) or secondary (2). In this case, we could display (N) as the icon.

What do you think? This is probably one for @mike1813

scp93ch commented 11 months ago

Even if there are normal ops threats that are secondary, it could still be sensible to use the same space in the UI to put an "N". The info that it is a normal-op is the most important thing to know.

mike1813 commented 11 months ago

> Even if there are normal ops threats that are secondary, it could still be sensible to use the same space in the UI to put an "N". The info that it is a normal-op is the most important thing to know.

I agree.

Both normal operational and adverse threats may be composed as primary threats (with a possible dependence on some external cause) or secondary threats (depending only on effects of other threats). The primary/secondary distinction is less important than the distinction between expected (normal operational) and unexpected (all other) threats.
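A minimal sketch of the icon rule discussed above, for illustration only. The `Threat` class and its field names are hypothetical and are not taken from the system-modeller code; the point is simply that the normal-op flag takes precedence over the primary/secondary distinction when choosing what to show in the single icon slot.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    # Hypothetical fields, named for illustration only.
    label: str
    is_secondary: bool = False
    is_normal_op: bool = False


def type_icon(threat: Threat) -> str:
    """Return the label shown in the threat-type slot of a threat list row."""
    # Normal-op wins: it is the most important thing to know about the threat.
    if threat.is_normal_op:
        return "N"
    # Otherwise fall back to the existing primary (1) / secondary (2) labels.
    return "2" if threat.is_secondary else "1"
```

With this rule, a threat that is both secondary and normal-op is still shown as "N".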

scp93ch commented 11 months ago

Now that we can filter the threat lists for normal-ops, @kenmeacham and I looked into what else to change in the displays, and it isn't clear.

The current network-domain model of course has a few different normal-ops threats, such as:

A host being "in service" is the normal state that you would expect, so ideally you would not put an alarming warning triangle next to that threat, or a tick if you have applied the "disable host" control.

Data being unencrypted is also of course its "normal" state, in the absence of having done anything, but this one feels more like an attack graph threat in that it should be alarming if unmanaged and a nice green tick if managed with a data encryption control.

The vulnerability discovery normal-op threat again feels more like the standard attack graph threats.

Basically, we're not sure what to do about the red warning triangle vs tick icon as the normal-ops threats we've looked at don't all feel the same.

Secondly, in the Threat Explorer, it was not obvious what to do:

We should probably indicate in the top part that it is a normal-op threat. Perhaps there is a standard sentence that can be added saying "This 'threat' is modelling the normal operation of the system".

It would be nice to hide the Direct Cause panel entirely if the cause is the DefaultTW, but that would tie the UI to the knowledgebase more closely than we have before.

@mike1813 can you comment please?

mike1813 commented 11 months ago

Wow! Such huge screen shots!

> @mike1813 can you comment please?

Two things occurred to me looking at these posts:

there are two different types of 'normal' in a normal-op threat, and
using DefaultTW as a threat cause is seriously unhelpful

See next two comments.

mike1813 commented 11 months ago

Regarding DefaultTW.

This was needed in non-secondary threats where no external cause is present, which includes many of the normal-op threats as well as some triggered threats where the real cause is the presence of certain controls. Sounds silly, but without a cause the threat could not be caused. So I added DefaultTW as a special TWA which is always given a zero TW level, and so provides the 'kick' needed to get the threat started (before applying any CSGs).

I believe this changed when I refactored the Risk Calculator and introduced mixed cause threats. Now, the way threat likelihood is calculated means that the default likelihood is actually the highest level in the likelihood scale.

The only problem is that the threat path algorithm may still require there be a cause for every threat with a likelihood (i.e., threats other than compliance threats or untriggered CSG side effect threats). I will need to check that, but if true it should be an easy fix. Stephen should also check if it would cause problems in the threat path algorithm used by the adaptor. I've raised a separate issue on that (issue #120).

Once we've fixed the threat path algorithms, or confirmed that they don't need fixing, we can remove DefaultTW, and any other irritating 'artificial' causes, from domain models.

mike1813 commented 11 months ago

Regarding different types of 'normal'.

The reason normal-op threats were introduced into the domain model was to capture situations where a threat could be addressed by disabling functionality. The CSG contained a disablement control, which also triggered a side effect threat to availability.

The problem was that in some cases, disablement of functionality could be countered by an attacker with sufficient privileges. But in our mathematical model, controls are not affected by threats (that's what makes them different from TWAs). The only solution was to move the disablement CSG to a separate 'normal op' threat: represent disablement as a threat (bringing something into service) which undermined a TWA (Out of Service) unless prevented by the disablement controls. The TWA was then used as a cause in 'real' threats that could be prevented by disablement. An attacker switching a function back on was then just a separate threat affecting the Out of Service TWA (but this time a 'real' threat, not a normal-op threat).

So far, so good. But using this approach led to two new issues:

The root cause finding algorithm would in many cases find only these normal-op threats. So if system-modeller predicts you have a high risk of (say) loss of confidentiality in some data, and you look for the root causes, it says 'Host 1 enters service'. Useless!
The number of normal-op threats involved in each attack path could be high, leading to very large numbers of indirect causation links, which were already by far the most numerous (up to 50% of system model data was made up of them).

To get around these issues, we introduced the 'normal-op' flag (initially as a kludge based on the threat class name, but later as a new property of threats). This is used to 'break' the threat paths at the first 'real' threat. Normal-op threats are only considered to be indirect causes of normal-op effects (like the 'In Service' behaviour, opposite of 'Out of Service'). This solves both problems, as it means normal-op threats don't need to be linked to all downstream behaviours, and the root causes of adverse effects will be real threats.
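As a rough illustration of 'breaking' the threat paths at the first 'real' threat, here is a simplified sketch. It is not the system-modeller algorithm: the graph representation, the function name and the threat names are all made up for the example, and it ignores likelihoods, TWAs and AND/OR combinations of causes. It assumes the starting threat is itself a 'real' threat.

```python
# Hypothetical example data: each threat maps to its direct-cause threats.
CAUSES = {
    "DataLeak": ["ExploitVuln"],
    "ExploitVuln": ["HostInService", "RemoteAccess"],
    "RemoteAccess": ["HostInService"],
    "HostInService": [],
}


def root_causes(threat, causes, normal_op):
    """Walk back from `threat`, never following normal-op threats.

    A threat with no 'real' (non-normal-op) direct causes is reported
    as a root cause, so the walk stops at the first real threat.
    """
    roots, seen, stack = set(), set(), [threat]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        real_parents = [c for c in causes.get(current, []) if c not in normal_op]
        if real_parents:
            stack.extend(real_parents)
        else:
            roots.add(current)
    return roots


# With the flag, the root cause is a real threat.
print(root_causes("DataLeak", CAUSES, normal_op={"HostInService"}))
# Without it, the search ends at the uninformative 'host enters service' threat.
print(root_causes("DataLeak", CAUSES, normal_op=set()))
```

The second call reproduces the 'Host 1 enters service' problem described above; the first shows how the flag avoids it.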

So then we (OK, I) got a bit too clever when using this in domain models.

I realised that the 'discovery of vulnerability' threats have the same two issues as normal-op threats. They are defined in terms of CVSS metrics, so any real vulnerability (something with a CVE) will involve several metrics and hence be 'caused' by several of the discovery threats. So when looking for root causes, if the attack starts with a vulnerability exploit, the root causes are a list of not very comprehensible 'discovery' threats. Moreover, since many different vulnerabilities have the same values for at least some of the CVSS metrics, each 'discovery' threat is at the head of multiple threat paths, and will have a huge number of indirect causation links from it to downstream threats and effects.

I realised I could use the same trick as with normal-op threats to work around this. By classifying vulnerability discovery threats as normal-op threats, they disappear from the list of root causes, along with a large number of indirect causation links.

@scp93ch : how would you like to handle this? I think we have three options:

(1) Pretend 'discovery of vulnerability' really is a normal-op? Pros: requires no additional work. Cons: may confuse users. To be honest I think the idea that 'discovery of vulnerability' is a threat may confuse users anyway.
(2) Introduce an extra threat property, something like 'isAdverseOp' which means 'treat this as a normal-op threat for threat path analysis purposes, but keep the warning triangle when it does appear in threat lists'. Pros: not a lot of extra work. Cons: not zero extra work.
(3) Introduce an extra threat property, but use it as an extra break point in threat paths, so we have 'normal-op' paths, 'abnormal-op' paths and 'attack paths', in that order, but with the possibility that some threat paths may involve only one or two of them. Pros: best chance of making system-modeller output make sense to users. Cons: would take significantly more work.

My instinct is to go for the first option, possibly the second, definitely not the last unless we find ourselves with nothing else to do.

kenmeacham commented 11 months ago

I expect @scp93ch will comment in due course. I'm finding this all getting over-complicated, so I'm sure the user will be confused.

I don't quite understand option (3).

Generally we shouldn't avoid extra work, if these threats are to be made clearer, so I'm favouring option (2) for now. As I understand it, normal-ops can be considered either "good" or "bad", so an extra "isAdverseOp" flag would indicate this for the UI. If this flag is true then display the red warning as usual, otherwise the green tick? i.e. the reverse of the usual.

Having this extra flag might suggest yet another filter on this, although this could be overkill.

One other problem is that the Control Strategy is similarly coloured red for something "bad". It is consistent with the warning triangle colour in the threats list at the moment, so we should try and keep it consistent for the CSG, yes?

mike1813 commented 11 months ago

> I'm finding this all getting over-complicated, so I'm sure the user will be confused.

The reason this is complicated is that neither system-modeller nor its original client API was ever designed to handle these sorts of things in a well-structured way. Too often, we've had to piggyback one feature on another to avoid a lot of restructuring. Then on top of that, we have had to make some compromises for reasons of computational performance.

The principles are fairly simple:

everything we call a 'threat' is a model of something with the potential to increase risks
some threats are unexpected, but some are not
some threats do more harm than good, others vice versa

What this means is that what we call a threat (in our mathematical model and in the software) doesn't always correspond to what ISO 27005 would call a threat. In ISO 27005, threats must be in some sense 'unexpected' (though not necessarily 'unanticipated'), and must cause harm.

What that means is that in an ideal world, our user interfaces (and possibly our APIs) would classify what we model as a threat into different categories based on whether they are generally harmful and/or unexpected. That way, users would see terminology and colours that seemed consistent with what they might expect. As it stands, we need to work quite hard to get even a half-decent alignment.

A normal-op threat is the first step in this direction, covering threats that are not unexpected but may or may not be harmful. We expect them to happen, so it makes no sense for them to be considered as root causes, because they won't be the first unexpected event in an attack path.

Deciding if a threat is on balance harmful or not is much harder, because it depends on the system in which the threat arises. The reason is that while every threat has the potential to increase the likelihood and hence the risk level of a consequence, how much the risk level increases depends on the impact of the consequence, which is system-specific. Blocking the threat will reduce those risks, but because the CSG could have side effects, it may increase the likelihood of other threats. The added risk from doing that depends on impact levels of their consequences which are also system-specific.

It is true that some domain model threat classes are less likely to be harmful than others when found in a typical system, and many of these are also normal-op threats (i.e., threats that are not unexpected). Unfortunately, the converse is not true: some normal-op threats, though not unexpected, are more often than not harmful.

Hence my suggestion that we might want to add a second flag marking some normal-op threats as 'adverse' (not unexpected but nevertheless usually harmful). Because strictly speaking that is system specific, it may sometimes make no sense, but it should be better than nothing.

If we want to do better than that, we should colour code according to the risk level, which should be easy for threats, not so much for CSGs as system-modeller doesn't assign a risk level to them.

If we want to do better still, colours should be based on how much the risk level changes when the threat is blocked compared to when it is not. If the risk level increases when the threat is blocked, the threat is on balance beneficial.
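To make that last idea concrete, here is a small sketch of classifying a threat by comparing the risk level with and without the threat blocked. The five-level scale, the function name and the category names are assumptions for illustration; how the two risk levels would actually be obtained from system-modeller is not addressed here.

```python
# Assumed ordinal risk scale, lowest to highest (illustrative only).
RISK_SCALE = ["Very Low", "Low", "Medium", "High", "Very High"]


def blocking_effect(risk_unblocked: str, risk_blocked: str) -> str:
    """Classify a threat by how the risk level changes when it is blocked."""
    delta = RISK_SCALE.index(risk_blocked) - RISK_SCALE.index(risk_unblocked)
    if delta > 0:
        # Blocking the threat raises the risk level, so the threat
        # itself is on balance beneficial.
        return "beneficial"
    if delta < 0:
        # Blocking the threat lowers the risk level.
        return "harmful"
    return "neutral"
```

A UI could then map these three categories to colours, rather than always using red for an unblocked threat.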

scp93ch commented 11 months ago

I'm pretty sure that the latest algorithm in the Python attack graph tool no longer cares (at the point of finding the attack graph) whether threats are normal-ops or adverse. I had to do this because in the Steel Mill example I ended up with a normal-op threat with adverse threats either side.

I think therefore, we should go for @mike1813's option (2): adding the "isAdverseOp" (or "isAdverseNormalOp" perhaps). We shouldn't expect that to happen right away though.

Regarding current practical changes in the UI:

Can I suggest that for the normal-op threats we just hide the warning triangle/tick mark for now until we can reliably put the right icon there?

In the threat explorer, can I suggest adding the note underneath the main title: "This 'threat' is expected to occur in normal operation."

We can live with the "DefaultTW" direct cause display for now, on the basis that down the line this will disappear from the domain model. We'd then need to hide the direct cause section or add an explanation in the case of displaying a threat with no cause.
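The display rules suggested here could be sketched roughly as follows. This is an illustration only: the `Threat` class, its field names and the icon names are hypothetical, and `is_adverse_op` stands in for the proposed "isAdverseOp" flag, which does not exist yet.

```python
from dataclasses import dataclass
from typing import Optional

NORMAL_OP_NOTE = "This 'threat' is expected to occur in normal operation."


@dataclass
class Threat:
    # Hypothetical fields, named for illustration only.
    label: str
    is_normal_op: bool = False
    is_adverse_op: bool = False  # stand-in for the proposed isAdverseOp flag
    is_managed: bool = False     # blocked or mitigated by a control strategy


def status_icon(threat: Threat) -> Optional[str]:
    """Icon for the threat list: None means show no triangle or tick."""
    if threat.is_normal_op and not threat.is_adverse_op:
        # Hide the icon until we can reliably choose the right one.
        return None
    return "tick" if threat.is_managed else "warning"


def explorer_note(threat: Threat) -> Optional[str]:
    """Note shown under the main title in the threat explorer, if any."""
    return NORMAL_OP_NOTE if threat.is_normal_op else None
```

Until the extra flag exists, `is_adverse_op` is always false, so every normal-op threat simply has its icon hidden.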

Regardless of the above, I agree with @kenmeacham's suggestion in the call this morning that we could usefully merge in what is already done.

kenmeacham commented 11 months ago

> Can I suggest that for the normal-op threats we just hide the warning triangle/tick mark for now until we can reliably put the right icon there?

Threats now appear as follows:

> In the threat explorer, can I suggest adding the note underneath the main title: "This 'threat' is expected to occur in normal operation."

For a normal operation threat (only):

> We can live with the "DefaultTW" direct cause display for now, on the basis that down the line this will disappear from the domain model. We'd then need to hide the direct cause section or add an explanation in the case of displaying a threat with no cause.

I have left this display as it is, for now.

> Regardless of the above, I agree with @kenmeacham's suggestion in the call this morning that we could usefully merge in what is already done.

I'll make a pull request based on these changes for now. Issue should remain open afterwards.

kenmeacham commented 11 months ago

Re-opening as only initial work has been merged in so far.

Spyderisk / system-modeller

Display normal-operation threats in a different way #107