google / osv.dev

Open source vulnerability DB and triage service.
https://osv.dev
Apache License 2.0

Withdrawing OSV reports is too heavyweight #2177

Open davidben opened 2 months ago

davidben commented 2 months ago

Given that OSV reports from OSS-Fuzz are currently false-positive-laden (see https://github.com/google/osv.dev/issues/2176 and https://github.com/google/oss-fuzz/issues/11925), the withdrawal process needs to be much smoother. Manually making a PR and trying to keep to some YAML spec (I'm still not sure if I got the timestamps right) is too heavyweight.

andrewpollock commented 2 months ago

Hey @davidben

Could you describe what your current user journey looks like, and what an appropriately lightweight one could look like?

davidben commented 2 months ago

What happened here was:

  1. OSS-Fuzz had an MSan regression. Annoying, but it happens.
  2. We got a ton of false reports in OSS-Fuzz.
  3. I spent 30 minutes of my day triaging one of them to confirm it was indeed a false positive.
  4. There is no way in OSS-Fuzz to reassign to maintainers, so we followed our usual process of ignoring false positives.
  5. A week later, the regression is finally fixed and OSS-Fuzz closes everything.
  6. Later, I get a bunch of alerts internally about unfixed vulnerabilities from this "OSV" thing I'd never heard of or registered with. I have to figure out how to close those out as being false, without the robot reopening them again.
  7. I spend about an hour searching and pinging people to figure out where this came from and where to file a bug about things going wrong.
  8. I eventually find the OSV site and see that it is indeed falsely claiming a bunch of vulnerabilities in BoringSSL.
  9. I eventually get directed to an appropriate person, after having burned a ton of time, and learn that I need to manually edit some YAML files in a system I'd never heard of.
  10. I fork the GitHub repo and go to edit the files.
  11. Apparently one needs to manually make YAML timestamps and also manually fill in the last modified time. I honestly could not figure out how to do that effectively, which is why the seconds are zero in my timestamps.
  12. I put together https://github.com/google/oss-fuzz-vulns/pull/37 and ping a human to go land it to stop the OSV-induced panic.
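For the timestamp step, here is a minimal sketch, assuming the `modified` and `withdrawn` fields expect RFC 3339 timestamps in UTC ending in `Z`, as the OSV schema describes. The helper name `osv_timestamp` is hypothetical and not part of any OSV tooling.

```python
from datetime import datetime, timezone


def osv_timestamp(now=None):
    """Return an RFC 3339 UTC timestamp with second precision.

    Hypothetical helper: `now` defaults to the current time and must be
    timezone-aware if supplied.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # Normalize to UTC and format with a literal trailing "Z".
    return now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Prints the current time, including real (non-zero) seconds.
print(osv_timestamp())
```

The printed value could then be pasted into the record by hand, which would at least remove the guesswork about the format.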

As for what would be better, basically every step in this process went wrong.

  1. Better testing in oss-fuzz might have reduced the frequency of MSan regressions: https://github.com/google/oss-fuzz/issues/11941
  2. But we have to assume MSan regressions might come through, so oss-fuzz should have some way for project owners to triage things: https://github.com/google/oss-fuzz/issues/11939, https://github.com/google/oss-fuzz/issues/11925, and https://github.com/google/oss-fuzz/issues/6974
  3. But project owners may be busy and may just not know to do this. After all, for the entire lifetime of the oss-fuzz project, false positives were not only safe to ignore, but ignoring them was the only means of triage. It's OSV that made assumptions about OSS-Fuzz that weren't tenable. When an OSS-Fuzz issue is imported to OSV, there should be a comment on OSS-Fuzz so project owners know that false data was imported.
  4. When OSV gets it wrong, there needs to be a lightweight flow to fix it. This could be fixing the oss-fuzz data and then having it import automatically (and quickly!), or it could be a direct flow. Whatever the flow, it must not involve any of:
    • Pinging an OSV maintainer for human review
    • Forking a GitHub repo to make a PR in an unfamiliar repo
    • Manually writing YAML timestamps

evverx commented 2 months ago

> Apparently one needs to manually make YAML timestamps and also manually fill in the last modified time

Timestamps alone make this process quite off-putting: https://github.com/google/osv.dev/issues/857#issuecomment-1354503866. I'm still not sure how to get them right :-)

andrewpollock commented 2 months ago

Brilliant, thanks for all this detail, it's very helpful.

> 4. When OSV gets it wrong

So based on https://github.com/google/oss-fuzz-vulns/pull/37 being how to correct things, I think it's fair to say that this should read "When OSS-Fuzz gets it wrong"?

Based on your fourth point above, a wee bit of tooling to aid with manipulating that YAML sounds like it would go a long way to addressing the "lightweight" part of things, plus a modicum of documentation?
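As a rough illustration of what such tooling might do, the sketch below marks an already-parsed record as withdrawn. It assumes the record has been loaded into a plain dict by whatever YAML library is in use, and that `withdrawn` and `modified` take RFC 3339 UTC timestamps as in the OSV schema. The function name `withdraw` and the example ID are hypothetical.

```python
from datetime import datetime, timezone


def withdraw(record, when=None):
    """Return a copy of an OSV record dict marked as withdrawn.

    Hypothetical helper: sets both `withdrawn` and `modified` to the same
    UTC timestamp and leaves the input dict untouched.
    """
    if when is None:
        when = datetime.now(timezone.utc)
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    updated = dict(record)  # shallow copy; only top-level keys change
    updated["withdrawn"] = stamp
    updated["modified"] = stamp
    return updated


# Placeholder record; the ID and summary are made up for illustration.
example = {
    "id": "OSV-0000-0000",
    "summary": "placeholder",
    "modified": "2024-01-01T00:00:00Z",
}
print(withdraw(example))
```

Writing the result back to YAML is left out on purpose, since that depends on the library and on how the repository formats its files.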

I would imagine some sort of GitHub Action could allow PRs from users identifiable as affiliated with the project the PR is the subject of to be automatically merged, thus removing the human component...
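The core check behind such an action might look something like the sketch below. It assumes the PR author can be mapped to an email address, and that the project's OSS-Fuzz `project.yaml` has been parsed into a dict with its `primary_contact` and `auto_ccs` entries. The function name `is_affiliated` and the example addresses are hypothetical, and a real workflow would need a trustworthy way to establish the author's identity.

```python
def is_affiliated(author_email, project_config):
    """Return True if author_email matches a contact in the project config.

    Hypothetical check: compares case-insensitively against
    `primary_contact` and every entry in `auto_ccs`.
    """
    contacts = [project_config.get("primary_contact")]
    contacts.extend(project_config.get("auto_ccs") or [])
    known = {c.strip().lower() for c in contacts if c}
    return author_email.strip().lower() in known


# Made-up configuration for illustration only.
config = {
    "primary_contact": "maintainer@example.org",
    "auto_ccs": ["dev@example.org"],
}
print(is_affiliated("Dev@example.org", config))       # True
print(is_affiliated("stranger@example.com", config))  # False
```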

So, at a high level, perhaps:

I'm still on the fence about whether this issue belongs in the OSV.dev repo or the OSS-Fuzz repo...

evverx commented 2 months ago

> When OSS-Fuzz gets it wrong

I wouldn't say OSS-Fuzz gets it wrong. It works as expected in the sense that it reports issues fuzz targets hit (regardless of whether they have anything to do with security or not). I think the wrong part here is that everything gets automatically imported into the OSV database with no vetting.

> • better documentation
> • better signposting to that documentation from the OSS-Fuzz issues

That would have certainly helped in https://github.com/google/oss-fuzz/issues/7434.

> I would imagine some sort of GitHub Action could allow PRs from users identifiable as affiliated with the project the PR is the subject of to be automatically merged, thus removing the human component

It should help, but I'm not sure it can be (fully) automated. For example, in https://github.com/google/oss-fuzz/pull/11883 the OSS-Fuzz bot (which tries to do that) didn't recognize the maintainers. Beyond that, some maintainers aren't on GitHub, so they can't open PRs. Some maintainers don't have access to Monorail because they don't have Gmail accounts, and bug reports just get sent to their mailing lists.

I have to admit I don't know how to fix that. It seems to me that one option would be to turn off this import by default and let projects opt in. If they decide to do that, it should probably be safe to assume that they read the documentation, know what OSV is, and are ready to vet bug reports. Another option would be to separate the OSS-Fuzz feed from OSV, so that it wouldn't be automatically implied that the reports are vulnerabilities.