defensivedepth commented 2 years ago

Goal

Update the vulnerability processing for RPM packages to reduce the number of potential false positives.

How?

Review the data sets to understand how patch releases are being represented in CPE/CVE databases. This will be a key part in what we do.
If the patch information is available in the datasets, then find candidates (such as the ones mentioned in comments in this ticket) to spot check the approach to parsing the patch version from the release column for the apps.
- The release column in the rpm_packages osquery table seems to report more detailed versioning information compared to that of the version column. See the second comment in this thread for a more detailed report of this.
Research how other Linux distros handle the same packages used to spot check RPM, to understand whether we can extend this approach to other distros or need to continue fine-tuning.
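
As a rough illustration of the "parse the patch version from the release column" step, here is a minimal sketch. It assumes release strings shaped like the `30.el7` seen in this ticket (a numeric build part followed by an optional dist tag); the `Release` type, the regex, and `parse_release` are hypothetical names for illustration, not anything in Fleet or osquery, and real release strings may need more cases.

```python
import re
from typing import NamedTuple, Optional


class Release(NamedTuple):
    build: str            # numeric build/patch part, e.g. "30"
    dist: Optional[str]   # distribution tag, e.g. "el7", if present


# Assumed shape: digits (optionally dotted), then an optional ".<letters><digits>..." dist tag.
_RELEASE_RE = re.compile(r"^(?P<build>\d+(?:\.\d+)*)(?:\.(?P<dist>[a-z]+\d+[\w.]*))?")


def parse_release(release: str) -> Optional[Release]:
    """Split an RPM release string into its build number and dist tag."""
    match = _RELEASE_RE.match(release)
    if match is None:
        return None  # unknown shape: leave it to the caller to decide
    return Release(match.group("build"), match.group("dist"))


if __name__ == "__main__":
    print(parse_release("30.el7"))
```

Returning `None` for unrecognized shapes keeps the spot check honest: anything the pattern does not cover shows up explicitly instead of being silently mis-parsed.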

Rylon commented 2 years ago

We've encountered this same issue for patched packages too, which makes it impossible to see legitimate vulnerabilities. Any suggestions for improvements here are greatly appreciated :)

noahtalerman commented 2 years ago

@defensivedepth's original report from 2021-11-25 is below (moved out of the issue's description)

Per Slack convo (https://osquery.slack.com/archives/C01DXJL16D8/p1637674720157000):

I am testing the vulnerability processing functionality... Currently on FleetDM 4.5.1, not sure if much has changed with 4.6.1 related to these issues.

So the first screencap shows that my prod centos 7 server has 1763 vulnerabilities.

If we look into this further, from the 2nd screencap we can see there is a finding for authconfig 6.2.8. 3rd screencap shows that it was installed with the package authconfig-6.2.8-30.el7.src.rpm

The changelog for that package can be found here: https://centos.pkgs.org/7/centos-x86_64/authconfig-6.2.8-30.el7.x86_64.rpm.html, in which we see the referenced vuln was fixed in package 6.2.8-26 which means that this finding is a false positive.

This is a common occurrence for those 1763 vulnerabilities.
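
To make the reasoning above concrete, here is a minimal sketch of the comparison: the installed package is 6.2.8-30.el7 and the fix landed in 6.2.8-26, so the finding should not be reported. The comparator is a simplified approximation of RPM's version ordering (no epoch, `~` or `^` handling), and `compare_labels` and `is_fixed` are hypothetical helper names.

```python
import re
from itertools import zip_longest


def _segments(label):
    # Split into runs of digits or letters; separators such as "." are dropped.
    return re.findall(r"\d+|[A-Za-z]+", label)


def compare_labels(a, b):
    """Return -1, 0 or 1. Simplified: ignores epoch, '~' and '^'."""
    for x, y in zip_longest(_segments(a), _segments(b)):
        if x is None:
            return -1  # a ran out of segments first, so it sorts older
        if y is None:
            return 1
        if x.isdigit() and y.isdigit():
            xi, yi = int(x), int(y)
            if xi != yi:
                return -1 if xi < yi else 1
        elif x.isdigit() != y.isdigit():
            # A numeric segment sorts newer than an alphabetic one.
            return 1 if x.isdigit() else -1
        elif x != y:
            return -1 if x < y else 1
    return 0


def is_fixed(installed, fixed_in):
    """True if the installed (version, release) is at or after the fixed one."""
    (inst_ver, inst_rel), (fix_ver, fix_rel) = installed, fixed_in
    result = compare_labels(inst_ver, fix_ver)
    if result == 0:
        result = compare_labels(inst_rel, fix_rel)
    return result >= 0


if __name__ == "__main__":
    print(is_fixed(("6.2.8", "30.el7"), ("6.2.8", "26")))
```

Comparing on version alone would treat both packages as "6.2.8" and could not tell them apart, which is the source of the false positive described here.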

Noah's follow up is below

@defensivedepth thank you for reporting these false positives and including screenshots. This was helpful to reference when discussing a potential solution. Notes from this discussion:

Osquery allows Fleet to capture the release column for RPM packages. Fleet can use the information in this column when mapping to CPEs to reduce false positives.
This above fix would only address false positives for RPM packages. What about other false positive scenarios the Fleet team doesn't yet know about?
- Fleet can make it easy for users to report false positives so that the software to CPE mapping can be improved
- Fleet can allow users to mark software as a "false positive" in the Fleet UI so that users can hide these reported vulnerabilities and have an easier time using Fleet to surface vulnerabilities I do care about. This ability is covered by the following issue: #3152
- The ability to mark a detected vulnerability as a "false positive" is different than dismissing a vulnerability. Ideally, users report false positives to Fleet, like in this issue, so the Fleet team can improve the vulnerability to CPE mapping.

noahtalerman commented 2 years ago

@Rylon thank you for following up in this issue.

Are you also receiving false positives for RPM packages? Or are you seeing false positives for other types of packages/installed software?

Rylon commented 2 years ago

Hi @noahtalerman yeah, we saw it for sure with RPM packages where the version number was the same but the revision changed, and it still shows up as "vulnerable". I haven't checked with DEB packages yet.

noahtalerman commented 2 years ago

@Rylon thanks for the follow up!

I haven't checked with DEB packages yet.

Got it. Please feel free to let us know, in this issue, if you encounter similar false positives with DEB packages.

The Fleet team wants to continuously improve the vulnerability processing feature. Any false positives, or other issues, reported by Fleet users are very helpful in the effort towards a better and more accurate vulnerability processing solution.

defensivedepth commented 2 years ago

@noahtalerman Am I understanding the notes you posted correctly? Is the proposed solution to essentially crowdsource the flagging of FP, and update the definitions based off of that?

noahtalerman commented 2 years ago

Is the proposed solution to essentially crowdsource the flagging of FP, and update the definitions based off of that

@defensivedepth I think this is correct. What are your immediate thoughts/reaction to this?

To clarify, the proposed solution will involve both crowdsourcing the flagging of FP and research conducted by the Fleet team.

Crowdsourcing the flagging of false positives (FP)

Currently, it's difficult for the Fleet team to uncover many FP because the team's production deployment of Fleet is limited to employee workstations.

We think an effective way to improve the osquery -> CPE -> CVE mapping, by reducing FP, will be to ask users to reach out when they encounter FPs.

Thus, Fleet would like to make it easy for the user to report these findings so that the vulnerability processing feature can be improved.

Research conducted by the Fleet team

An additional part of the solution will include dedicated time for the Fleet team to research potential causes of FPs. This way, FPs can be predicted prior to users encountering them in Fleet.

Currently, this research is relatively difficult and time consuming because the Fleet team's production deployment of Fleet is limited to employee workstations.

defensivedepth commented 2 years ago

Couple thoughts:

The one prod server that I was using this with generated 1763 vulnerability findings. I am positive that the vast majority of those are FP. It's not feasible for me to research & confirm each of these findings and flag them as a True | False Positive.

There is also the issue of backports:

Backporting has a number of advantages for customers, but it can create confusion when it is not understood. Customers need to be aware that just looking at the version number of a package will not tell them if they are vulnerable or not. For example, stories in the press may include phrases such as "upgrade to Apache httpd 2.0.43 to fix the issue," which only takes into account the upstream version number. This can cause confusion as even after installing updated packages from a vendor, it is not likely customers will have the latest upstream version. They will instead have an older upstream version with backported patches applied.

https://access.redhat.com/security/updates/backporting

The OVAL project might be helpful: https://oval.cisecurity.org/

noahtalerman commented 2 years ago

It's not feasible for me to research & confirm each of these findings and flag them as a True | False Positive.

Ah, understood that this task wouldn't be feasible. I think I may have worded the solution below in a confusing way:

Fleet can make it easy for users to report false positives so that the software to CPE mapping can be improved

Fleet can allow users to mark software as a "false positive" in the Fleet UI so that users can hide these reported vulnerabilities and have an easier time using Fleet to surface vulnerabilities I do care about. This ability is covered by the following issue: Add ability to dismiss/mark a detected vulnerability #3152

@defensivedepth by "make it easy for users to report false positives" what I meant was "make it easy for users to file issues like Josh's when they come across a false positive in Fleet."

This issue is a great example of future issues we'd like to see filed when a user encounters a false positive.
The hypothesis is that if Fleet makes it easy to file issues like this, then the Fleet team can quickly resolve the source of the false positives. We intend to resolve your reported false positives in an upcoming release of Fleet.
Over time, as more issues are filed, the vulnerability detection gets stronger (fewer false positives).

The "Fleet can allow users to mark software as a "false positive" in the Fleet UI" is a separate solution.

Instead of waiting for the Fleet team to improve the vulnerability detection mapping, I can, as a user, mark a vulnerability as a false positive.
This way, I can lessen the noise and only see vulnerabilities that I do care about.

noahtalerman commented 2 years ago

@chiiph I'm passing this issue to you. Can you please update the "How?" section in this issue's description by modifying/replacing the proposed solution? Thank you :)

This way, the improvements specified in this issue can be estimated and prioritized in an upcoming release of Fleet.

defensivedepth commented 2 years ago

@noahtalerman Thanks for the clarification.

Looking forward to using this feature in the future!

Rylon commented 2 years ago

Sorry for the delay in my response. I wanted to echo @defensivedepth's comment here (https://github.com/fleetdm/fleet/issues/3081#issuecomment-983832009): this is something we've seen too, even after applying all available security updates via unattended-upgrades. The number of findings did seem alarmingly high, so perhaps we have the same issue?

chiiph commented 2 years ago

I've estimated this at a 5 for timeboxed research. We need to know more in order to properly break it down / estimate.

zwass commented 2 years ago

Updates from Tomas' research: https://docs.google.com/document/d/1xdoReEwkXAS8Gqh3JBGL0YHk8sjA7CX4qn_izvwDs6w/edit?usp=sharing (should be publicly viewable)

chiiph commented 2 years ago

We'll implement repo by repo instead of using pkgs.org as discussed in the document. We'll start with CentOS.

The scope of this ticket is to define the overall architecture and implement the first repo. However, given the unknowns that were discussed with the backend team, it's pointed at 13 and timeboxed at 1 week.

lucasmrod commented 2 years ago

1. Data generation

Let's assume we can generate something like an rpm.sqlite (separate or alongside cpe.sqlite) by parsing the CentOS repository. Initially, such a database would contain a single table with: "fixed CVEs for a given (name, version, release, arch)"

E.g. sample entry:

name=authconfig, version=6.2.8, release=30.el7, arch=x86_64 ==> fixedCVEs={CVE-2017-7488}
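
A minimal sketch of what that single table could look like, using Python's built-in sqlite3 module. The table name `rpm_fixed_cves`, the one-row-per-CVE layout, and the helper functions are assumptions for illustration only; the actual layout would depend on what the repository metadata provides.

```python
import sqlite3


def create_fixed_cves_db(path=":memory:"):
    """Create (or open) the hypothetical fixed-CVEs database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS rpm_fixed_cves (
            name      TEXT NOT NULL,
            version   TEXT NOT NULL,
            "release" TEXT NOT NULL,
            arch      TEXT NOT NULL,
            cve       TEXT NOT NULL,
            PRIMARY KEY (name, version, "release", arch, cve)
        )
        """
    )
    return conn


def add_fixed_cves(conn, name, version, release, arch, cves):
    """Record that the given package build fixes each CVE in `cves`."""
    conn.executemany(
        "INSERT OR IGNORE INTO rpm_fixed_cves VALUES (?, ?, ?, ?, ?)",
        [(name, version, release, arch, cve) for cve in cves],
    )
    conn.commit()


def fixed_cves(conn, name, version, release, arch):
    """Return the set of CVEs fixed for an exact (name, version, release, arch)."""
    rows = conn.execute(
        'SELECT cve FROM rpm_fixed_cves '
        'WHERE name = ? AND version = ? AND "release" = ? AND arch = ?',
        (name, version, release, arch),
    )
    return {row[0] for row in rows}


if __name__ == "__main__":
    conn = create_fixed_cves_db()
    add_fixed_cves(conn, "authconfig", "6.2.8", "30.el7", "x86_64", ["CVE-2017-7488"])
    print(fixed_cves(conn, "authconfig", "6.2.8", "30.el7", "x86_64"))
```

Storing one row per (package build, CVE) pair keeps lookups to a single indexed query; a single row with a serialized CVE list would be an equally valid choice.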

2. Fleet Fetching Extra Information for RPM Packages

Sample of a current software entry in the software table:

mysql> select * from software where name like "authconfig%" and source = "rpm_packages";
+------+------------+---------+--------------+-------------------+
| id   | name       | version | source       | bundle_identifier |
+------+------------+---------+--------------+-------------------+
| 3714 | authconfig | 6.2.8   | rpm_packages |                   |
+------+------------+---------+--------------+-------------------+

One option is to update the software table schema with new columns: release, arch and vendor, initially only set when source is equal to "rpm_packages".

Another less disruptive option is to have a separate table + ingestion for CentOS hosts that stores this extra data for the installed software.

3. Post-processing for RPMs

Post-processing: The following pseudo-code would be executed after all vulnerability processing is finished.

for each software "$soft" from "software" table where $soft.source="rpm_packages" AND $soft.vendor="CentOS":
    $cves := get entries from "software_cve" table for $soft
    if len($cves) == 0:
        continue
    $fixedCVEs := get fixedCVEs from "rpm.sqlite" for $soft
    for $cve in $cves:
        if $cve is in $fixedCVEs:
            remove entry from "software_cve" table // alternatively, set a new "fixed" column on software_cve to true
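
The pseudo-code above could be sketched in runnable form roughly as follows. This uses in-memory stand-ins for the "software" and "software_cve" tables and a caller-supplied lookup in place of rpm.sqlite; the `Software` type, `remove_fixed_cves`, and the placeholder CVE ID in the usage example are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List, NamedTuple, Set, Tuple


class Software(NamedTuple):
    id: int
    name: str
    version: str
    release: str
    arch: str
    source: str
    vendor: str


def remove_fixed_cves(
    software: Iterable[Software],
    software_cve: Dict[int, Set[str]],
    lookup_fixed: Callable[[Software], Set[str]],
) -> List[Tuple[int, str]]:
    """Drop CVEs already fixed by the installed build; return what was removed."""
    removed = []
    for soft in software:
        if soft.source != "rpm_packages" or soft.vendor != "CentOS":
            continue
        cves = software_cve.get(soft.id)
        if not cves:
            continue  # nothing detected for this package
        fixed = lookup_fixed(soft)
        for cve in sorted(cves & fixed):
            cves.discard(cve)  # alternatively, flag the entry as "fixed"
            removed.append((soft.id, cve))
    return removed


if __name__ == "__main__":
    soft = Software(3714, "authconfig", "6.2.8", "30.el7", "x86_64", "rpm_packages", "CentOS")
    # "CVE-PLACEHOLDER-1" is a made-up ID standing in for an unfixed finding.
    detected = {3714: {"CVE-2017-7488", "CVE-PLACEHOLDER-1"}}
    print(remove_fixed_cves([soft], detected, lambda s: {"CVE-2017-7488"}))
    print(detected)
```

Returning the removed pairs, rather than deleting silently, would make it easier to log or audit what the post-processing step suppressed.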

While we discuss the above proposal, I'll start to tinker with (1), parsing the RPM repository metadata.

chiiph commented 2 years ago

Let's assume we can generate something like an rpm.sqlite (separate or alongside cpe.sqlite) by parsing the CentOS repository. Initially, such a database would contain a single table with: "fixed CVEs for a given (name, version, release, arch)"

I would say, let's try to get as much as possible inside cpe.sqlite. So that we don't have to change that part at all.

I think the next possible blocker here is the resulting size. Could we run a test by looking at the total number of apps that have bugs in the CVE data stream overall, assuming all of those are Linux packages and all of them have... I don't know, 40 versions per package that fix CVEs, and seeing how much space that would take stored in sqlite? Not sure if this is a reasonable test, but we should have an idea of how big a worst-case scenario would be.
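
One rough way to run that test is to write a sample of synthetic rows to a temporary SQLite file, measure bytes per row, and extrapolate. In this sketch the table layout, the synthetic package names and CVE IDs, and `estimate_db_size` are all made up for illustration; the package count to plug in would have to come from the CVE data stream, and real rows and indexes would change the result.

```python
import os
import sqlite3
import tempfile


def estimate_db_size(num_packages, versions_per_package=40, sample_packages=200):
    """Extrapolate on-disk size in bytes from a small synthetic sample."""
    if num_packages <= 0 or versions_per_package <= 0:
        return 0
    sample_packages = min(sample_packages, num_packages)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.sqlite")
        conn = sqlite3.connect(path)
        conn.execute(
            'CREATE TABLE fixed (name TEXT, version TEXT, "release" TEXT, arch TEXT, cve TEXT)'
        )
        # Synthetic rows: one fixed CVE per (package, version) pair.
        rows = (
            (f"package-{p:06d}", "1.2.3", f"{v}.el7", "x86_64", f"CVE-0000-{v:05d}")
            for p in range(sample_packages)
            for v in range(versions_per_package)
        )
        conn.executemany("INSERT INTO fixed VALUES (?, ?, ?, ?, ?)", rows)
        conn.commit()
        conn.close()
        sample_bytes = os.path.getsize(path)
    sample_rows = sample_packages * versions_per_package
    bytes_per_row = sample_bytes / sample_rows
    return int(bytes_per_row * num_packages * versions_per_package)


if __name__ == "__main__":
    # 10,000 packages is an arbitrary placeholder, not a measured figure.
    print(estimate_db_size(10_000), "bytes (rough estimate)")
```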

One option is to update the software table schema with new columns: release, arch and vendor, initially only set when source is equal to "rpm_packages".

I think something like this is needed, because app1 in macOS is not the same as in Linux, and so on. I wouldn't only treat CentOS specially, I think we need to change everywhere, and then the case of CentOS is just the first one we are checking security patches on.

zwass commented 2 years ago

let's try to get as much as possible inside cpe.sqlite

Agreed. I'd prefer to see us using a single database file as much as possible until we find it gets to a size that becomes intractable.

The strategy of going through "after" CVEs have been identified and unmarking packages that have been resolved makes sense to me. Would we need to run this check on all past detected CVEs on each run of the workflow? If not, I think we need to find a way to run it at least once on all past detected CVEs after this is implemented to clean up the existing false positives.

fleetdm / fleet

Improve vulnerability processing to reduce false positives detected for RPM packages #3081
