ossf / osv-schema

Open Source Vulnerability schema.
https://ossf.github.io/osv-schema/
Apache License 2.0
176 stars 75 forks source link

Rationale of limit events for git repositories #217

Closed zacchiro closed 7 months ago

zacchiro commented 8 months ago

Hello, the semantics of limit events in the spec is very clear, but the logic behind it, and hence how to use it correctly in the real world, way less so. I haven't found it documented anywhere, so I'm asking here in the hope the answer will be useful to others as well.

Consider the canonical example in the specification, where limit is used instead of fixed in the following git history: git_graph

A security bug is introduced in commit X and resolved in commit Y. How could it be possible that commits D, E, and F (letting cherry picks aside) are not vulnerable? Any change made in those commits might also fix the vulnerability as a side effect, but the most common case I can imagine in the real world is that those commits are vulnerable.

Encouraging the use of limit over fixed for git range optimizes for the least plausible case, making spec implementations return false negatives, which seems undesirable (because dangerous).

For what is worth, this is not marginal in the OSV.dev data: more than half of git ranges there use limit events instead of fixed. Should they be fixed instead (aside from a few marginal cases), and should the spec strongly encourage favoring fixed over limit?

Thanks for this great project!

oliverchang commented 8 months ago

Thanks for the question!

limit came about due to feedback from some of our data providers. For instance, the https://github.com/cloudsecurityalliance/gsd-database folks wanted to issue separate OSV entries for each affected branch of a particular Linux vulnerability.

More generally, I do agree we should encourage more use of fixed. The challenge here comes from the fact that while false negatives are bad, users of vulnerability scanners are already grappling with floods of potential false positives in their reports for various reasons, and we don't want to make that particular problem worse either.

zacchiro commented 8 months ago

limit came about due to feedback from some of our data providers. For instance, the https://github.com/cloudsecurityalliance/gsd-database folks wanted to issue separate OSV entries for each affected branch of a particular Linux vulnerability.

More generally, I do agree we should encourage more use of fixed. The challenge here comes from the fact that while false negatives are bad, users of vulnerability scanners are already grappling with floods of potential false positives in their reports for various reasons, and we don't want to make that particular problem worse either.

I see, thanks for this information. If you have references to the specific cases of gsd-database where vulnerabilities where branch-specific, I'd love to look into them, because they seem really weird to me.

Great that you agree that fixed should be used more (than limit), as the data in the OSV.dev seems to be going in a different direction. If welcome, I can prepare and submit a (small) PR for the spec, with a recommendation that goes in this direction.

oliverchang commented 7 months ago

Thanks again @zacchiro for contributing https://github.com/ossf/osv-schema/pull/221. Would you consider this issue resolved?

zacchiro commented 7 months ago

Yeah, sounds good, thanks. Closing.