arvos-dev / arvos

AI- and Risk-based Vulnerability Management for Trustworthy Open Source Adoption (ARVOS)
Apache License 2.0

Pre-filter on affected vulnerabilities to reduce noise #14

Open emilwareus opened 2 years ago

emilwareus commented 2 years ago

Somehow make ARVOS knowledgeable of

Questions:

cristiklein commented 2 years ago

Brainstorming. :smile:

For Java, ARVOS could have access to the pom.xml file from the source code. ARVOS would read and parse this file and "deactivate" vulnerable symbols in the database, based on version ranges.

arvos-poc --only-versions-from-pom=./pom.xml reads and parses the provided file and disables vulnerable symbols that are outside the version range.

Without that option, arvos-poc would issue a warning: "pom.xml not provided. Version filtering cannot be performed. This will increase the number of false positives."
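
To make the idea concrete, here is a minimal sketch of the pom.xml reading step, using only the Python standard library. The function names `read_pom_dependencies` and `dependencies_or_warn` are hypothetical and not part of arvos-poc. The sketch only reads top-level `<dependencies>` entries; it does not resolve `${property}` versions, parent POMs, or `<dependencyManagement>`.

```python
import warnings
import xml.etree.ElementTree as ET

# Maven POM files declare this default XML namespace.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}

SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>io.netty</groupId>
      <artifactId>netty-handler</artifactId>
      <version>4.1.43.Final</version>
    </dependency>
    <dependency>
      <groupId>com.jcraft</groupId>
      <artifactId>jzlib</artifactId>
      <version>1.1.2</version>
    </dependency>
  </dependencies>
</project>"""


def read_pom_dependencies(pom_text):
    """Return one dict per top-level <dependency> in the POM text.

    Hypothetical helper. A missing <version> is reported as None.
    """
    root = ET.fromstring(pom_text)
    deps = []
    for dep in root.findall("./m:dependencies/m:dependency", POM_NS):
        deps.append({
            "groupId": dep.findtext("m:groupId", namespaces=POM_NS),
            "artifactId": dep.findtext("m:artifactId", namespaces=POM_NS),
            "version": dep.findtext("m:version", namespaces=POM_NS),
        })
    return deps


def dependencies_or_warn(pom_text):
    """Return the parsed dependencies, or warn and return [] if no POM was given."""
    if pom_text is None:
        warnings.warn(
            "pom.xml not provided. Version filtering cannot be performed. "
            "This will increase the number of false positives."
        )
        return []
    return read_pom_dependencies(pom_text)


if __name__ == "__main__":
    for dep in dependencies_or_warn(SAMPLE_POM):
        print(dep)
```

Returning an empty list when no file is given keeps the rest of the pipeline unchanged: with nothing to filter on, every vulnerable symbol stays active.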

moule3053 commented 2 years ago

Based on the couple of sample apps I looked at, I couldn't find a one-to-one mapping from the artifactId and version in the pom.xml file to the symbol names and versions in the vulnerability dataset. In some cases, the versioning style is also different. Example from a pom.xml file:

{'groupId': 'io.netty', 'artifactId': 'netty-handler', 'version': '4.1.43.Final'}, {'groupId': 'com.jcraft', 'artifactId': 'jzlib', 'version': '1.1.2'}

compared to some entries in the vulnerability dataset:

"package_name": "io.netty:netty",
"package_manager": "maven",
"version_range": {
    "gt": "~",
    "gte": "~",
    "lt": "4.1.44",
    "lte": "~"
}
...
"package_name": "io.netty:netty",
"package_manager": "maven",
"version_range": {
    "gt": "~",
    "gte": "~",
    "lt": "4.1.59",
    "lte": "~"
}
...
"updated_at": "2022-02-25 09:47:42.300769",
"package_name": "io.netty:netty",
"package_manager": "maven",
"version_range": {
    "gt": "~",
    "gte": "~",
    "lt": "4.1.60",
    "lte": "~"
}

I will have to look at this a bit more.
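
One possible way to bridge the two formats is sketched below. It rests on two assumptions that the thread does not confirm: that "~" in `version_range` means "no bound on this side", and that a dataset entry such as `io.netty:netty` should match any artifact in the same group whose artifactId starts with `netty`. The helpers `numeric_key`, `in_range`, and `same_package` are hypothetical, and the prefix heuristic can produce false matches.

```python
UNBOUNDED = "~"  # assumption: "~" means this side of the range is open


def numeric_key(version):
    """Keep only the leading numeric components: '4.1.43.Final' -> (4, 1, 43)."""
    key = []
    for part in version.split("."):
        if not part.isdigit():
            break
        key.append(int(part))
    return tuple(key)


def in_range(version, version_range):
    """Check a version against the gt/gte/lt/lte bounds of a dataset entry."""
    v = numeric_key(version)
    checks = {
        "gt": lambda bound: v > bound,
        "gte": lambda bound: v >= bound,
        "lt": lambda bound: v < bound,
        "lte": lambda bound: v <= bound,
    }
    for op, check in checks.items():
        raw = version_range.get(op, UNBOUNDED)
        if raw != UNBOUNDED and not check(numeric_key(raw)):
            return False
    return True


def same_package(dependency, package_name):
    """Heuristic (assumption): equal groupId and artifactId prefix match."""
    group, _, artifact = package_name.partition(":")
    return (
        dependency["groupId"] == group
        and dependency["artifactId"].startswith(artifact)
    )


if __name__ == "__main__":
    dep = {"groupId": "io.netty", "artifactId": "netty-handler",
           "version": "4.1.43.Final"}
    entry = {
        "package_name": "io.netty:netty",
        "version_range": {"gt": "~", "gte": "~", "lt": "4.1.44", "lte": "~"},
    }
    affected = (same_package(dep, entry["package_name"])
                and in_range(dep["version"], entry["version_range"]))
    print(affected)
```

Comparing tuples of integers avoids the string-comparison trap where "4.1.9" would sort after "4.1.43". Note that tuples of different lengths do not compare as equal, so (4, 1) sorts before (4, 1, 0).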

emilwareus commented 2 years ago

@ProgHaj

Make sure we put the package names in the data dump. Also, please coordinate on version-string comparison; we have logic for that internally.

ayoubeddafali commented 2 years ago

Here is an attempt at the pre-filtering: https://github.com/arvos-dev/arvos/pull/21

There are still some intricacies when dealing with legacy versions, for instance 4.1.43.Final.

Here are some examples:

LegacyVersion(4.1.43.Final) > Version(4.1.42) --> return False
LegacyVersion(4.1.43.Final) > LegacyVersion(4.1.42.Final) --> return True
LegacyVersion(4.1.43.Final) < Version(4.2.0) --> return True
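
Assuming these results come from the Python `packaging` library, the first one is expected: there, a `LegacyVersion` always sorted below any PEP 440 `Version`, regardless of the numbers involved. A possible workaround, sketched here with a hypothetical `version_key` helper rather than the logic used in the PR, is to normalize Maven-style strings before comparing. Maven treats the qualifiers "final", "ga", and "release" as equivalent to a plain release, so they can be dropped; any other qualifier is rejected instead of guessed at.

```python
import re

# Qualifiers that Maven treats as equivalent to a plain release.
RELEASE_QUALIFIERS = {"final", "ga", "release"}


def version_key(version):
    """Turn '4.1.43.Final' into (4, 1, 43) for tuple comparison.

    Hypothetical helper: raises ValueError on qualifiers it does not
    know how to order (for example 'RC1' or 'SNAPSHOT').
    """
    numbers = []
    for part in re.split(r"[.\-]", version.strip()):
        if part.isdigit():
            numbers.append(int(part))
        elif part.lower() in RELEASE_QUALIFIERS:
            continue
        else:
            raise ValueError(f"unsupported qualifier {part!r} in {version!r}")
    # Drop trailing zeros so that 4.2 and 4.2.0 compare as equal.
    while numbers and numbers[-1] == 0:
        numbers.pop()
    return tuple(numbers)


if __name__ == "__main__":
    print(version_key("4.1.43.Final") > version_key("4.1.42"))
    print(version_key("4.1.43.Final") > version_key("4.1.42.Final"))
    print(version_key("4.1.43.Final") < version_key("4.2.0"))
```

With this normalization, all three comparisons from the examples above follow the numeric order. Failing loudly on unknown qualifiers is a deliberate choice for a vulnerability filter: silently misordering a pre-release could hide a real finding.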

Update: Due to DB changes, the previous PR was closed and reopened as #23.