aquasecurity / trivy

Find vulnerabilities, misconfigurations, secrets, SBOM in containers, Kubernetes, code repositories, clouds and more
https://aquasecurity.github.io/trivy
Apache License 2.0
22.87k stars 2.25k forks

Build Java database to eliminate dependency on external APIs #3427

Closed abelsromero closed 1 year ago

abelsromero commented 1 year ago

Related to @DmitriyLewen's post https://github.com/aquasecurity/trivy/issues/3421#issuecomment-1381514618: some of the JARs whose versions are not detected are well-known Java libraries such as the Spring Framework, and they seem to contain the needed information.

I can see the following in META-INF/MANIFEST.MF

Manifest-Version: 1.0
Implementation-Title: spring-web
Automatic-Module-Name: spring.web
Implementation-Version: 5.3.24
Created-By: 1.8.0_345 (Oracle Corporation)

Running in debug mode, I see many others that do not seem to be recognized. So what exactly is the information Trivy requires?

2023-01-13T09:51:00.326+0100    DEBUG   Unable to identify POM in offline mode  {"file": "spring-aop-5.3.24.jar"}
2023-01-13T09:51:00.336+0100    DEBUG   Unable to identify POM in offline mode  {"file": "spring-beans-5.3.24.jar"}
2023-01-13T09:51:00.359+0100    DEBUG   Unable to identify POM in offline mode  {"file": "spring-boot-2.7.7.jar"}
2023-01-13T09:51:00.370+0100    DEBUG   Unable to identify POM in offline mode  {"file": "spring-boot-actuator-2.7.7.jar"}
2023-01-13T09:51:00.378+0100    DEBUG   Unable to identify POM in offline mode  {"file": "spring-boot-actuator-autoconfigure-2.7.7.jar"}
2023-01-13T09:51:00.399+0100    DEBUG   Unable to identify POM in offline mode  {"file": "spring-boot-jarmode-layertools-2.7.7.jar"}
2023-01-13T09:51:00.399+0100    DEBUG   Unable to identify POM in offline mode  {"file": "spring-boot-autoconfigure-2.7.7.jar"}
2023-01-13T09:51:00.427+0100    DEBUG   Unable to identify POM in offline mode  {"file": "spring-context-5.3.24.jar"}
2023-01-13T09:51:00.448+0100    DEBUG   Unable to identify POM in offline mode  {"file": "spring-core-5.3.24.jar"}
2023-01-13T09:51:00.451+0100    DEBUG   Unable to identify POM in offline mode  {"file": "spring-expression-5.3.24.jar"}
2023-01-13T09:51:00.451+0100    DEBUG   Unable to identify POM in offline mode  {"file": "spring-jcl-5.3.24.jar"}
2023-01-13T09:51:00.457+0100    DEBUG   Unable to identify POM in offline mode  {"file": "spring-security-core-5.7.6.jar"}
2023-01-13T09:51:00.458+0100    DEBUG   Unable to identify POM in offline mode  {"file": "spring-security-crypto-5.7.6.jar"}
2023-01-13T09:51:00.459+0100    DEBUG   Unable to identify POM in offline mode  {"file": "spring-security-oauth2-core-5.7.6.jar"}
2023-01-13T09:51:00.460+0100    DEBUG   Unable to identify POM in offline mode  {"file": "spring-security-oauth2-jose-5.7.6.jar"}
2023-01-13T09:51:00.517+0100    DEBUG   Unable to identify POM in offline mode  {"file": "spring-web-5.3.24.jar"}
2023-01-13T09:51:00.530+0100    DEBUG   Unable to identify POM in offline mode  {"file": "spring-webflux-5.3.24.jar"}

Could this be used to improve artifact recognition and reduce remote calls?

DmitriyLewen commented 1 year ago

Trivy requires 3 fields: groupID, artifactID and version.

We define these fields from MANIFEST.MF here: https://github.com/aquasecurity/go-dep-parser/blob/9cd0336b884cbc6ac93493f1751a7c2d85ae7d13/pkg/java/jar/parse.go#L399-L446

> I can see the following in META-INF/MANIFEST.MF

There is no groupID in this MANIFEST.MF.
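To make the gap concrete, here is a minimal sketch that parses the manifest quoted above and tries to fill the three fields. It is not the go-dep-parser code linked above: the function names are hypothetical, and the attribute-to-field mapping (Implementation-Vendor-Id or Bundle-SymbolicName for groupID, Implementation-Title for artifactID, Implementation-Version for version) is an assumption based on the attributes discussed in this thread.

```python
# Hypothetical sketch, not Trivy's actual implementation.
MANIFEST = """\
Manifest-Version: 1.0
Implementation-Title: spring-web
Automatic-Module-Name: spring.web
Implementation-Version: 5.3.24
Created-By: 1.8.0_345 (Oracle Corporation)
"""


def parse_main_attributes(text):
    """Parse 'Name: value' lines of the main manifest section into a dict.

    Continuation lines (lines starting with a space) are ignored here.
    """
    attrs = {}
    for line in text.splitlines():
        if not line.strip():
            break  # a blank line ends the main section
        name, sep, value = line.partition(": ")
        if sep:
            attrs[name] = value
    return attrs


def identify(attrs):
    """Return (group_id, artifact_id, version); None marks a missing field."""
    # Assumed mapping; the real lookup order may differ.
    group_id = attrs.get("Implementation-Vendor-Id") or attrs.get("Bundle-SymbolicName")
    artifact_id = attrs.get("Implementation-Title")
    version = attrs.get("Implementation-Version")
    return group_id, artifact_id, version


attrs = parse_main_attributes(MANIFEST)
result = identify(attrs)
print(result)  # (None, 'spring-web', '5.3.24')
```

The artifactID and version can be read from the manifest, but nothing in it supplies a groupID, so the JAR cannot be fully identified from this file alone.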

abelsromero commented 1 year ago

It seems Implementation-Vendor-Id was deprecated in Java 8 https://docs.oracle.com/javase/8/docs/technotes/guides/jar/jar.html#Main_Attributes and is no longer mentioned in Java 11. The closest remaining attribute, Implementation-Vendor https://docs.oracle.com/en/java/javase/11/docs/specs/jar/jar.html#manifest-specification, is more descriptive.

knqyf263 commented 1 year ago

@abelsromero Thanks for the information! I remember even Implementation-Vendor is not standardized well, and it was hard to extract groupID and artifactID. But it's been a while since I looked into it. It is worth giving it another shot. @DmitriyLewen Would you take a look?

DmitriyLewen commented 1 year ago

@abelsromero Thanks for the info! @knqyf263 Of course, I will check this.

abelsromero commented 1 year ago

Spring is not going to use those attributes, and its build is done with Gradle, which more or less also drops the pom.properties.

On the bright side, I added checking SBOMs to my personal backlog, which may be an alternative for buildpack images.
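For context on the pom.properties point: JARs built with Maven usually carry a META-INF/maven/<groupId>/<artifactId>/pom.properties file with exactly the three coordinates, while Gradle-built JARs typically do not. The sketch below looks for that file inside a JAR; the helper names and the com.example/demo coordinates are hypothetical, and the two in-memory archives only stand in for real JARs.

```python
import io
import zipfile


def read_pom_properties(jar_bytes):
    """Return the key/value pairs of the first pom.properties found, or None."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for name in jar.namelist():
            if name.startswith("META-INF/maven/") and name.endswith("/pom.properties"):
                props = {}
                text = jar.read(name).decode("utf-8", errors="replace")
                for line in text.splitlines():
                    line = line.strip()
                    if line and not line.startswith("#"):  # skip comments
                        key, _, value = line.partition("=")
                        props[key.strip()] = value.strip()
                return props
    return None


def make_jar(files):
    """Build an in-memory archive from a {path: content} mapping (test helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        for name, content in files.items():
            jar.writestr(name, content)
    return buf.getvalue()


# Hypothetical Maven-style JAR with coordinates embedded.
maven_jar = make_jar({
    "META-INF/maven/com.example/demo/pom.properties":
        "#Generated by Maven\ngroupId=com.example\nartifactId=demo\nversion=1.0.0\n",
})
# Hypothetical Gradle-style JAR with only a manifest.
gradle_jar = make_jar({"META-INF/MANIFEST.MF": "Manifest-Version: 1.0\n"})

maven_props = read_pom_properties(maven_jar)
gradle_props = read_pom_properties(gradle_jar)
print(maven_props)
print(gradle_props)
```

When the file is absent, as in the second archive, the lookup returns None and a scanner has to fall back to the manifest or to an external source.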

DmitriyLewen commented 1 year ago

I downloaded some of the most popular Java projects from the Maven repository and checked their MANIFEST.MF files:

commons-lang3-3.12.0.jar:

Bundle-SymbolicName: org.apache.commons.lang3
Implementation-Vendor: The Apache Software Foundation

jackson-databind-2.14.1.jar:

Bundle-SymbolicName: com.fasterxml.jackson.core.jackson-databind
Implementation-Vendor-Id: com.fasterxml.jackson.core
Implementation-Vendor: FasterXML

junit-4.13.2.jar:

Implementation-Vendor: JUnit
Implementation-Vendor-Id: junit

logback-classic-1.4.5.jar:

Specification-Vendor: QOS.ch
Implementation-Vendor: QOS.ch
Bundle-SymbolicName: ch.qos.logback.classic

kotlin-stdlib-common-1.8.0.jar:

Implementation-Vendor: JetBrains

People seem to use the organization name (or something like it), but that is not the same as the GroupID.

We could use Implementation-Vendor when neither Implementation-Vendor-Id nor Bundle-SymbolicName exists. @knqyf263 wdyt? Does that make sense?
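The proposed fallback can be sketched against the five manifests listed above. This is only an illustration: the lookup order (Implementation-Vendor-Id, then Bundle-SymbolicName, then Implementation-Vendor) and the function name are assumptions, not the parser's actual code.

```python
# Attributes copied from the manifests listed above.
SAMPLES = {
    "commons-lang3-3.12.0.jar": {
        "Bundle-SymbolicName": "org.apache.commons.lang3",
        "Implementation-Vendor": "The Apache Software Foundation",
    },
    "jackson-databind-2.14.1.jar": {
        "Bundle-SymbolicName": "com.fasterxml.jackson.core.jackson-databind",
        "Implementation-Vendor-Id": "com.fasterxml.jackson.core",
        "Implementation-Vendor": "FasterXML",
    },
    "junit-4.13.2.jar": {
        "Implementation-Vendor": "JUnit",
        "Implementation-Vendor-Id": "junit",
    },
    "logback-classic-1.4.5.jar": {
        "Specification-Vendor": "QOS.ch",
        "Implementation-Vendor": "QOS.ch",
        "Bundle-SymbolicName": "ch.qos.logback.classic",
    },
    "kotlin-stdlib-common-1.8.0.jar": {
        "Implementation-Vendor": "JetBrains",
    },
}

# Assumed priority: Implementation-Vendor is consulted only as a last resort.
LOOKUP_ORDER = ("Implementation-Vendor-Id", "Bundle-SymbolicName", "Implementation-Vendor")


def guess_group_id(attrs):
    """Return (value, source_attribute) for the first attribute present."""
    for key in LOOKUP_ORDER:
        value = attrs.get(key)
        if value:
            return value, key
    return None, None


guesses = {jar: guess_group_id(attrs) for jar, attrs in SAMPLES.items()}
for jar, (value, key) in guesses.items():
    print(f"{jar}: {value!r} (from {key})")
```

In this sample only kotlin-stdlib-common reaches the new fallback, and the value it yields, JetBrains, is an organization name rather than the Maven groupID (org.jetbrains.kotlin), so the extra step adds a guess but not a reliable coordinate.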

knqyf263 commented 1 year ago

@DmitriyLewen Yes, it sounds good, although it may not help us so much.

abelsromero commented 1 year ago

I wonder if it would be possible to integrate the index (as seen in https://github.com/aquasecurity/trivy/issues/3421#issuecomment-1385902897) so users can download it, as they can with the trivy-db. Of course, I realize that may not be simple at all.

durcon commented 1 year ago

> @DmitriyLewen Yes, it sounds good, although it may not help us so much.

It will not help much: there is no standard entry, and apparently vendors change their custom entries often.

I looked into a newer Spring MANIFEST.MF:

Manifest-Version: 1.0
Implementation-Title: Core starter, including auto-configuration suppo
 rt, logging and YAML
Automatic-Module-Name: spring.boot.starter
Implementation-Version: 2.6.6
Built-By: Spring
Spring-Boot-Jar-Type: dependencies-starter
Build-Jdk-Spec: 1.8

The group ID is in the Automatic-Module-Name entry.

A better way is to cache the search results like Trivy could cache the database, see my comment https://github.com/aquasecurity/trivy/issues/3421#issuecomment-1383836520

DmitriyLewen commented 1 year ago

Are you sure it's a GroupID? For this project, GroupID == org.springframework.boot; spring.boot.starter is the artifactID.

durcon commented 1 year ago

> Are you sure it's a GroupID? For this project, GroupID == org.springframework.boot; spring.boot.starter is the artifactID.

You are right, sorry. However, it is only an example of custom entry names changing. Also, in this MANIFEST.MF there is no artifact ID in the Implementation-Title.
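One detail of the manifest above is that the Implementation-Title value is wrapped: the JAR specification limits manifest lines to 72 bytes, and a line starting with a single space continues the previous value. A minimal sketch of unfolding such lines, with a hypothetical function name, shows what the full title actually says:

```python
# The Spring Boot starter manifest quoted above, including the wrapped line.
MANIFEST = (
    "Manifest-Version: 1.0\n"
    "Implementation-Title: Core starter, including auto-configuration suppo\n"
    " rt, logging and YAML\n"
    "Automatic-Module-Name: spring.boot.starter\n"
    "Implementation-Version: 2.6.6\n"
    "Built-By: Spring\n"
    "Spring-Boot-Jar-Type: dependencies-starter\n"
    "Build-Jdk-Spec: 1.8\n"
)


def parse_manifest(text):
    """Parse the main section, joining continuation lines to the previous value."""
    attrs = {}
    last = None
    for line in text.splitlines():
        if not line:
            break  # a blank line ends the main section
        if line.startswith(" ") and last is not None:
            attrs[last] += line[1:]  # drop the single leading space
            continue
        name, sep, value = line.partition(": ")
        if sep:
            attrs[name] = value
            last = name
    return attrs


attrs = parse_manifest(MANIFEST)
title = attrs["Implementation-Title"]
print(title)  # Core starter, including auto-configuration support, logging and YAML
```

Once unfolded, the title is a free-text description, so it cannot be used as an artifact ID, and none of the other attributes in this manifest supplies the groupID org.springframework.boot.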

knqyf263 commented 1 year ago

It's out. https://github.com/aquasecurity/trivy/discussions/3518