Policy Duplication - Githubissues

BlythMeister commented 1 year ago

Current Behavior

Package matches on a "All" condition Policy Example

This reports 3 policy failures per package

Steps to Reproduce

1.Create policy which has multiple conditions 2.Have coponent that could match multiple 3.See multiple

Expected Behavior

Only 1 policy violation per policy

Dependency-Track Version

4.6.2

Dependency-Track Distribution

Container Image

Database Server

Microsoft SQL Server

Database Server Version

No response

Browser

N/A

Checklist

[X] I have read and understand the contributing guidelines
[X] I have checked the existing issues for whether this defect was already reported

mykter commented 1 year ago

I see this behaviour too. As I understand it, the fix for #2212 in #2333 means that ALL policies match correctly, but they still aren't reported correctly.

If you have a policy such as the one above, it will only trigger for an unlicensed component (excluding '15below/fifteenbelow'), but the UI (and underlying API /v1/violation/project) will report each separate condition as a violation. This would be appropriate for an ANY policy, but isn't correct for an ALL - the policy violation only happens when all of the conditions are met, so reporting that each individual condition match is a violation is false.

It happens here in PolicyEngine.java, where the behaviour is the same whether it is an ALL match or an ANY match. I think in an ALL match scenario, only a single policy violation should be added. Unfortunately the model doesn't lend itself to this - a PolicyViolation has a single type (operational/license/security) and a single condition associated with it.

fabian-zeindl-oebb commented 10 months ago

This issue should be mentioned in the README as it makes the "ALL" operator effectively useless and misleading.

Is there any other way to combined a policy check only for non-internal components?

fabian-zeindl-oebb commented 10 months ago

In my eyes the main issue here is that each policy violation has a specific type (LICENSE, OPERATIONAL or SECURITY) while the ALL operator is mainly used to produce a single violation of a single type like a license-risk filtered by componentor a security-risk filtered by component.

fupgang commented 3 weeks ago

Using the current version 4.11.7 this is still an issue. The analysis of @mykter shows the heart of the problem:

Unfortunately the model doesn't lend itself to this - a PolicyViolation has a single type (operational/license/security) and a single condition associated with it.

I would provide a PR but need some guidance which direction is preferred. I can think of the following alternatives:

a) Refactor the domain model

A PolicyViolation refers to a violated Policy. For ANY or ALL exactly one PolicyViolation is created. Additionally the PolicyViolation refers to a list of PolicyConditionViolations that refer to the respective triggered PolicyCondition. For ALL this list contains all PolicyConditions. For ANY the list contains at least one. The PolicyConditionViolation refers to the defective Component, the PolicyViolation does not anymore. The PolicyViolation still refers to the Project.

The type, for example OPERATIONAL, is currently derived from a PolicyCondition`s subject. If all PolicyConditions investigate the same subject the type is probably clear. It's not in other cases. So it's probably best to let the Policy's author decide and store it in a new field in the Policy. Since the PolicyViolation knows its Policy it does not need the type field anymore.

(The class PolicyConditionViolation already exists in the package policy but not in in the model, thus no corresponding database table yet exists.) This is of course a major change.

b) Emit a single policy violations (without changing the domain model)

Even when multiple PolicyConditions are triggered (for example when using the ALL operator) a single PolicyViolation is created. This is a change in the implementation of the PolicyEngine.

The PolicyViolation refers to a single PolicyCondition only. So this will always result in some information loss. It's hardly possible to always determine the intended violation's type. So this will be vague.

I strongly prefer a refactoring of the domain model. Any feedback (especially form the core maintainers) is welcome.

fabian-zeindl-oebb commented 3 weeks ago

I prefer variant a). Refactor the domain-model, the current implementation is not intuitive at all and prevents many use-cases from being implemented.

nscuro commented 3 weeks ago

I agree that refactoring the model would make the most sense long-term. I'd also love the refactoring to consider #2130 (cc @msymons).

Slightly related: In Hyades we implemented expression conditions, for which it's obviously impossible to pre-define a violation type. What we ended up doing to work around this, is to make violationType a field in PolicyCondition. For subject EXPRESSION, users would specify a custom type, for all other subjects there is a pre-defined mapping.

Still, I fully agree that there should always be at most one violation, but there still needs to be a mechanism to inform the user which condition matched, in particular for the policy operator ANY. We have TODO for this in the Hyades' engine implementation, during policy operator evaluation.

fupgang commented 3 weeks ago

The users can mix and match arbitrary policy conditions. Not all combinations might be useful. (When thinking of useful examples I often conclude they are best implemented with expression conditions.) But as soon as a policy is made up of conditions that map to different violation types, what is the resulting policy violation's type?

Perhaps the type should be hoisted to the policy (instead of policy condition). As long as there is a single condition or all conditions map to the same type, the UI could preset the type.

fabian-zeindl-oebb commented 3 weeks ago

A resulting policy violation type could be "MIXED". or "MULTIPLE"

nscuro commented 3 weeks ago

Perhaps the type should be hoisted to the policy (instead of policy condition). As long as there is a single condition or all conditions map to the same type, the UI could preset the type.

Yes, ideally the violation type would be on policy level. What we did in Hyades was just enough to allow us to proceed with the expression work, we didn't touch policy semantics yet.

In the end, you might have a policy with these conditions:

Component coordinates match: group: com.acme, name: *, version: *
- Current violation type: Operational
License group is not Copyleft
- Current violation type: License

Really the violation type should be License for the entire policy. Operational doesn't make any sense since that condition was merely used for filtering. I'm not sold trying to "auto-detect" the violation type is a worthwhile exercise. I think users should explicitly define the violation type upon policy creation.

fupgang commented 3 weeks ago

What I'm currently trying to refactor to is this:

grafik

Each policy has a type (e.g. operational) and a scope (component or project). These are all user defined, when creating the policy. The policy conditions must match the scope. For example, a condition with subject "coordinates" is evaluated with a component, thus applicable to a component scoped policy only.

The PolicyEvaluator interface defines a method List<PolicyCondition> evaluate(final Policy policy, final T target) with T extends PolicyEvaluationTarget. For example, the CoordinatesPolicyEvaluator investigates a Component (T). This is the case for all currently provided evaluators. For an expression evaluator T would be PolicyEvaluationTarget.

The returned conditions are the triggered ones and stored in the policy violation. So the information can be provided to the user later on. A single policy violation is created by the policy engine when evaluating a policy with a target and ANY or ALL conditions are triggered.

As this is a far reaching change, my workspace is not yet green. But I'm interested in feedback before putting more work on this.

What do you think?

fupgang commented 3 weeks ago

There is another issue that is not yet solved by the above refactoring. This problem holds for the current implementation, too. Is there a GitHub issue describing this already?

A policy violation refers to the current policy and current policy conditions. Whenever a policy or conditions are changed, this breaks the violation. The violation is not reevaluated. But to the user it looks like the updated policy is violated. The updated conditions are shown in the UI.

One could delete all violations whenever a policy or its conditions change. But this might discard valuable information, e.g. the analysis.

Instead, the policy violation should refer the policy and its conditions at the time of evaluation. When updating a policy, the user could decide to delete existing violations or not.

This could be implemented by storing a history of policies and conditions and let a violation refer to a specific version. Or the violation could store (perhaps embed) some snapshot of the information. The later could be part of this refactoring.

nscuro commented 3 weeks ago

What I'm currently trying to refactor to is this:

I think that makes sense.

The PolicyEvaluator interface defines a method List evaluate(final Policy policy, final T target) with T extends PolicyEvaluationTarget. For example, the CoordinatesPolicyEvaluator investigates a Component (T). This is the case for all currently provided evaluators. For an expression evaluator T would be PolicyEvaluationTarget.

For an implementation in v4.x, aligning the PolicyEvaluators roughly with the existing implementation is fine.

For expressions specifically, PolicyEvaluators won't be a thing anymore. Instead, they will be "emulated" through expressions. What this means in practice is that conditions are converted to expressions (see here, here, and here).

After that point, the engine doesn't have to care whether a condition was a pre-defined one (e.g. COORDINATES), or an expression to begin with. Which then allows for various efficiency optimizations, such as only loading fields that are actually accessed, and only doing so once, instead of over and over again for each condition.

A policy violation refers to the current policy and current policy conditions. Whenever a policy or conditions are changed, this breaks the violation. The violation is not reevaluated. But to the user it looks like the updated policy is violated. The updated conditions are shown in the UI.

Re-evaluation would happen in any of these cases:

Upon next SBOM upload
Upon manual reanalysis of the project
At least 1-2x a day during recurring analysis

This can result in a significant delay between the policy being modified, and users seeing that modification take effect.

We could add a mechanism that triggers re-evaluation for all affected projects whenever a policy is changed. However, that can be a quite expensive thing to do, in particular for large portfolios.

This could be implemented by storing a history of policies and conditions and let a violation refer to a specific version. Or the violation could store (perhaps embed) some snapshot of the information.

Intrigued by the history idea. But wondering how that might help in practice. Also considering that users will not only interact with this information via UI, but also via REST API directly.

If I'm fetching violations for a project, how will I know they refer to an old version of the policy? And how does that help me deal with the violations I'm receiving?

Not a fan of embedding policy information since it will amplify table size.

nscuro commented 3 weeks ago

Thanks for looking into this btw @fupgang, really appreciate it!

fupgang commented 2 weeks ago

This can result in a significant delay between the policy being modified, and users seeing that modification take effect.

Thanks, this was a helpful hint. The stale information in the UI made it harder for me to understand the way dependency track handles policy violations. Especially how the violations of an evaluation are reconciled with the persisted violations and their analyses form previous evaluations. I incorrectly assumed the stale information in the UI are permanently and therefore some form of history was needed.

fupgang commented 2 weeks ago

Still working on this one, but want to share some design notes:

Reconcile policy violations

After evaluation the policy violations are reconciled with existing persisted policy violations. New ones are created, omitted ones are deleted. But when are violations considered equal? My suggestion is:

private boolean testViolationsEqual(PolicyViolation v1, PolicyViolation v2) {
    return v1.getPolicy().getId() == v2.getPolicy().getId()
            && v1.getPolicy().getScope() == v2.getPolicy().getScope()
            && v1.getTarget().getId() == v2.getTarget().getId();
}

This does not consider the policy conditions anymore. A policy maybe violated once for any evaluated target (component or project). Whenever the policy's conditions are changed, the policy violation maybe recognized by different means, but it's still the same defect. So the analysis still holds.

(One should never turn a policy or its conditions upside down, so it states something completely different, but create a new one instead.)

However, the violation's triggered conditions should be updated if changed. For example, after adding a condition which triggers in the next evaluation, the violation should show this, too.

Notifications after editing policy conditions

There is a test case PolicyEngineTest.notificationTest that creates a policy with a single condition and expects a notification about the violation. Then another condition is added to the policy which is triggered in the later evaluation, too. So the policy violation is triggered by both conditions now. I think there should be no further notification about the violation in this case. This is in line with the upper section.

Deleting policy conditions

When deleting a policy condition from a policy, the reference to that condition is removed for all violations triggered by that condition. If this was the single condition that triggered the violation, the violation and its analysis are deleted, too. Otherwise, the violation and its analysis remain. The violation's triggered conditions will be updated by the next evaluation run.

DependencyTrack / dependency-track

Policy Duplication #2146

Current Behavior

Steps to Reproduce

Expected Behavior

Dependency-Track Version

Dependency-Track Distribution

Database Server

Database Server Version

Browser

Checklist

Reconcile policy violations

Notifications after editing policy conditions

Deleting policy conditions