anchore / syft

CLI tool and library for generating a Software Bill of Materials from container images and filesystems
Apache License 2.0

Maven versions still blank in syft output when using specific search context #3207

Closed rvesse closed 3 weeks ago

rvesse commented 1 month ago

What happened:

Using syft to generate an SBOM from a Maven pom.xml still fails to detect some dependency versions, despite the recent improvements from #2769.

In particular, this seems to be triggered when a dependency is declared with a version in <dependencyManagement> (often in a parent pom.xml) and then declared without a version in the child module's pom.xml, where that dependency is actually consumed.
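
To make the trigger concrete, here is a minimal Python sketch of the lookup a resolver has to perform. It is not syft's implementation: the two POM snippets are trimmed-down illustrations (the commons-lang3 coordinates are assumed for the example), and the helper names `managed_versions` and `resolve` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Maven POMs use this XML namespace; "m" is just a local prefix for lookups.
NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Illustrative parent POM: the version is pinned in <dependencyManagement>.
PARENT_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-lang3</artifactId>
        <version>3.17.0</version>
      </dependency>
    </dependencies>
  </dependencyManagement>
</project>"""

# Illustrative child POM: the dependency is consumed without a version.
CHILD_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-lang3</artifactId>
    </dependency>
  </dependencies>
</project>"""


def managed_versions(pom_xml):
    """Map (groupId, artifactId) -> version from <dependencyManagement>."""
    root = ET.fromstring(pom_xml)
    versions = {}
    path = "m:dependencyManagement/m:dependencies/m:dependency"
    for dep in root.findall(path, NS):
        key = (
            dep.findtext("m:groupId", namespaces=NS),
            dep.findtext("m:artifactId", namespaces=NS),
        )
        versions[key] = dep.findtext("m:version", namespaces=NS)
    return versions


def resolve(child_xml, parent_xml=None):
    """Return artifactId -> version for the child's direct dependencies.

    Without the parent POM there is nowhere to look up a missing version,
    so it stays blank.
    """
    managed = managed_versions(parent_xml) if parent_xml else {}
    root = ET.fromstring(child_xml)
    resolved = {}
    for dep in root.findall("m:dependencies/m:dependency", NS):
        group = dep.findtext("m:groupId", namespaces=NS)
        artifact = dep.findtext("m:artifactId", namespaces=NS)
        version = dep.findtext("m:version", namespaces=NS)
        resolved[artifact] = version or managed.get((group, artifact), "")
    return resolved
```

Calling `resolve(CHILD_POM)` leaves the commons-lang3 version empty, while `resolve(CHILD_POM, PARENT_POM)` fills it in from the parent. Real Maven resolution also handles property interpolation, imported BOMs, and multi-level parents, which this sketch ignores.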

For example using the repository https://github.com/telicent-oss/smart-caches-core

$ syft file:jaxrs-base-server/pom.xml 
 ✔ Indexed file system                                                   jaxrs-base-server
 ✔ Cataloged contents                                                             52adcddfdc0452dbe1fc094084d91218ae001c1ac23159753ef4ed6e16fd5c00
   ├── ✔ Packages                        [20 packages]  
   └── ✔ Executables                     [0 executables]  
NAME                               VERSION          TYPE           
commons-lang3                                       java-archive    
configurator                       0.22.1-SNAPSHOT  java-archive    
jackson-annotations                                 java-archive    
jakarta.inject-api                                  java-archive    
jakarta.servlet-api                                 java-archive    
jersey-bean-validation                              java-archive    
jersey-client                                       java-archive    
jersey-container-grizzly2-servlet                   java-archive    
jersey-hk2                                          java-archive    
jersey-media-json-jackson                           java-archive    
jul-to-slf4j                                        java-archive    
jwt-auth-common                    0.22.1-SNAPSHOT  java-archive    
jwt-servlet-auth-core                               java-archive    
jwt-servlet-auth-jaxrs3                             java-archive    
logback-classic                                     java-archive    
mockito-core                                        java-archive    
observability-core                 0.22.1-SNAPSHOT  java-archive    
rdf-abac-core                                       java-archive    
slf4j-api                                           java-archive    
testng                                              java-archive

What you expected to happen:

All the dependencies should have their versions correctly detected, since they are all declared in the <dependencyManagement> section of the top-level pom.xml in that repository.

Steps to reproduce the issue:

$ git clone https://github.com/telicent-oss/smart-caches-core.git
$ syft file:jaxrs-base-server/pom.xml

Environment:

wagoodman commented 1 month ago

It looks like the problem is that the command only gives syft access to the jaxrs-base-server/pom.xml file, whereas the pinned versions are in the parent pom at the root of the repo. If you scan the root of the repo, you'll see the versions are filled in:

$ git clone git@github.com:telicent-oss/smart-caches-core.git
$ syft ./smart-caches-core

NAME                                                                           VERSION          TYPE
Telicent-oss/shared-workflows/.github/workflows/parallel-maven.yml             main             github-action-workflow
airline                                                                        3.0.0            java-archive
cli-core                                                                       0.22.1-SNAPSHOT  java-archive            (+1 duplicate)
cli-probe-server                                                               0.22.1-SNAPSHOT  java-archive
commons-collections4                                                           4.4              java-archive            (+1 duplicate)
commons-lang3                                                                  3.17.0           java-archive            (+3 duplicates)
configurator                                                                   0.22.1-SNAPSHOT  java-archive            (+2 duplicates)
event-source-file                                                              0.22.1-SNAPSHOT  java-archive
event-source-kafka                                                             0.22.1-SNAPSHOT  java-archive            (+5 duplicates)
event-sources-core                                                             0.22.1-SNAPSHOT  java-archive            (+3 duplicates)
jackson-annotations                                                            2.17.2           java-archive
jackson-core                                                                   2.17.2           java-archive
jackson-databind                                                               2.17.2           java-archive
jackson-dataformat-yaml                                                        2.17.2           java-archive
jakarta.inject-api                                                             2.0.1            java-archive
jakarta.servlet-api                                                            6.1.0            java-archive
jaxrs-base-server                                                              0.22.1-SNAPSHOT  java-archive            (+2 duplicates)
jena-arq                                                                       5.1.0            java-archive
jena-rdfpatch                                                                  5.1.0            java-archive
jersey-bean-validation                                                         3.1.8            java-archive
jersey-client                                                                  3.1.8            java-archive
jersey-container-grizzly2-servlet                                              3.1.8            java-archive
jersey-hk2                                                                     3.1.8            java-archive
jersey-media-json-jackson                                                      3.1.8            java-archive
jul-to-slf4j                                                                   2.0.16           java-archive
jwt-auth-common                                                                0.22.1-SNAPSHOT  java-archive
jwt-servlet-auth-aws                                                           0.16.0           java-archive
jwt-servlet-auth-core                                                          0.16.0           java-archive
jwt-servlet-auth-jaxrs3                                                        0.16.0           java-archive
kafka                                                                          1.20.1           java-archive            (+3 duplicates)
kafka-clients                                                                  3.8.0            java-archive
live-reporter                                                                  0.22.1-SNAPSHOT  java-archive
logback-classic                                                                1.5.7            java-archive            (+4 duplicates)
lombok                                                                         1.18.34          java-archive
mockito-core                                                                   5.13.0           java-archive            (+3 duplicates)
observability-core                                                             0.22.1-SNAPSHOT  java-archive            (+3 duplicates)
opentelemetry-api                                                              1.41.0           java-archive
opentelemetry-javaagent                                                        1.33.6           java-archive
opentelemetry-sdk                                                              1.41.0           java-archive            (+1 duplicate)
opentelemetry-semconv                                                          1.27.0-alpha     java-archive
projector-driver                                                               0.22.1-SNAPSHOT  java-archive
projectors-core                                                                0.22.1-SNAPSHOT  java-archive            (+4 duplicates)
rdf-abac-core                                                                  0.71.7           java-archive
slf4j-api                                                                      2.0.16           java-archive            (+5 duplicates)
slf4j-test                                                                     3.0.1            java-archive            (+5 duplicates)
telicent-oss/shared-workflows/.github/workflows/docker-push-to-registries.yml  main             github-action-workflow
telicent-oss/shared-workflows/.github/workflows/jira-sync.yml                  main             github-action-workflow
testng                                                                         7.10.2           java-archive            (+12 duplicates)

If you're interested in just the jaxrs-base-server, then this is probably too many packages...

find ./smart-caches-core/**/pom.xml
./smart-caches-core/cli/cli-core/pom.xml
./smart-caches-core/cli/cli-debug/pom.xml
./smart-caches-core/cli/cli-probe-server/pom.xml
./smart-caches-core/cli/pom.xml
./smart-caches-core/configurator/pom.xml
./smart-caches-core/event-sources/event-source-file/pom.xml
./smart-caches-core/event-sources/event-source-kafka/pom.xml
./smart-caches-core/event-sources/event-sources-core/pom.xml
./smart-caches-core/event-sources/pom.xml
./smart-caches-core/jaxrs-base-server/pom.xml
./smart-caches-core/jwt-auth-common/pom.xml
./smart-caches-core/live-reporter/pom.xml
./smart-caches-core/observability-core/pom.xml
./smart-caches-core/pom.xml
./smart-caches-core/projector-driver/pom.xml
./smart-caches-core/projectors-core/pom.xml

There are two ways to deal with this:

Using the exclude flag:

syft . --exclude ./projector-driver --exclude ./event-sources --exclude ./cli --exclude ./live-reporter --exclude ./projectors-core --exclude ./jwt-auth-common --exclude ./.github --exclude ./configurator --exclude ./observability-core

This is not ideal, since it's brittle as your repo changes over time... but it does work.

There is probably a way to do this with a small bash script / find command:

$ find ./**/pom.xml -not -path './jaxrs-base-server/**' -not -path ./pom.xml | xargs -I {} dirname {}

./cli/cli-core
./cli/cli-debug
./cli/cli-probe-server
./cli
./configurator
./event-sources/event-source-file
./event-sources/event-source-kafka
./event-sources/event-sources-core
./event-sources
./jwt-auth-common
./live-reporter
./observability-core
./projector-driver
./projectors-core

This list could then be placed into a syft config for continual reference:

# exclude.yaml
exclude:
- ./cli/cli-core
- ./cli/cli-debug
- ./cli/cli-probe-server
- ./cli
- ./configurator
- ./event-sources/event-source-file
- ./event-sources/event-source-kafka
- ./event-sources/event-sources-core
- ./event-sources
- ./jwt-auth-common
- ./live-reporter
- ./observability-core
- ./projector-driver
- ./projectors-core

$ syft -c exclude.yaml smart-caches-core
 ✔ Indexed file system                                                                                                                                                                                                                                                                            smart-caches-core
 ✔ Cataloged contents                                                                                                                                                                                                                              754116ec341ecb9f73f90ba249a517e3e194c8c55d2b40d6d01de4261988d0a8
   ├── ✔ Packages                        [23 packages]
   └── ✔ Executables                     [0 executables]
[0000]  WARN no explicit name and version provided for directory source, deriving artifact ID from the given path (which is not ideal)
NAME                                                                           VERSION          TYPE
Telicent-oss/shared-workflows/.github/workflows/parallel-maven.yml             main             github-action-workflow
commons-lang3                                                                  3.17.0           java-archive
configurator                                                                   0.22.1-SNAPSHOT  java-archive
jackson-annotations                                                            2.17.2           java-archive
jakarta.inject-api                                                             2.0.1            java-archive
jakarta.servlet-api                                                            6.1.0            java-archive
jersey-bean-validation                                                         3.1.8            java-archive
jersey-client                                                                  3.1.8            java-archive
jersey-container-grizzly2-servlet                                              3.1.8            java-archive
jersey-hk2                                                                     3.1.8            java-archive
jersey-media-json-jackson                                                      3.1.8            java-archive
jul-to-slf4j                                                                   2.0.16           java-archive
jwt-auth-common                                                                0.22.1-SNAPSHOT  java-archive
jwt-servlet-auth-core                                                          0.16.0           java-archive
jwt-servlet-auth-jaxrs3                                                        0.16.0           java-archive
logback-classic                                                                1.5.7            java-archive
mockito-core                                                                   5.13.0           java-archive
observability-core                                                             0.22.1-SNAPSHOT  java-archive
rdf-abac-core                                                                  0.71.7           java-archive
slf4j-api                                                                      2.0.16           java-archive
telicent-oss/shared-workflows/.github/workflows/docker-push-to-registries.yml  main             github-action-workflow
telicent-oss/shared-workflows/.github/workflows/jira-sync.yml                  main             github-action-workflow
testng                                                                         7.10.2           java-archive
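
Because a hand-maintained exclude list goes stale as modules are added or renamed, the list could also be regenerated on demand. The following Python sketch mirrors the find command above: it collects every directory containing a pom.xml, skips the repo root (which holds the pinned versions) and the module of interest, and renders the result in the same `exclude:` layout as exclude.yaml. The function names `exclude_dirs` and `render_exclude_yaml` are hypothetical, and the sketch assumes syft accepts the relative `./` paths exactly as shown in the config above.

```python
from pathlib import Path


def exclude_dirs(repo_root, keep):
    """List module directories to exclude, relative to repo_root.

    Every directory with a pom.xml is excluded except the repo root
    (needed for the parent pom) and the `keep` module with its submodules.
    """
    root = Path(repo_root)
    keep_dir = root / keep
    dirs = []
    for pom in root.rglob("pom.xml"):
        module_dir = pom.parent
        if module_dir == root:
            continue  # the root pom carries the pinned versions
        if module_dir == keep_dir or keep_dir in module_dir.parents:
            continue  # the module we actually want scanned
        dirs.append("./" + module_dir.relative_to(root).as_posix())
    return sorted(dirs)


def render_exclude_yaml(dirs):
    """Render the directories as a syft-style `exclude:` YAML list."""
    lines = ["exclude:"] + [f"- {d}" for d in dirs]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example: write exclude.yaml for the jaxrs-base-server module.
    entries = exclude_dirs("smart-caches-core", "jaxrs-base-server")
    Path("exclude.yaml").write_text(render_exclude_yaml(entries))
```

Unlike the manual flag list, this does not exclude ./.github, so the github-action-workflow entries would still appear unless that path is added by hand.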

rvesse commented 1 month ago

@wagoodman Thanks for the initial analysis, have updated the issue title based on that

kzantow commented 1 month ago

I was writing this response concurrently with @wagoodman's, so apologies if there's duplicate info. As noted, cloning that repo and running syft on the entire directory (git clone https://github.com/telicent-oss/smart-caches-core, then syft smart-caches-core) with no other options results in the Java libraries having appropriate versions, because the parent context is available to Syft. However, scanning a single submodule pom file, as you indicated in the issue (syft file:jaxrs-base-server/pom.xml from the smart-caches-core directory), results in a number of missing properties because the context for the parent pom is missing.

One solution is to provide the necessary Maven context to Syft by running a mvn install of the parent, and allowing Syft to use the local .m2 cache with the environment variable SYFT_JAVA_USE_MAVEN_LOCAL_REPOSITORY=true. Example:

$ docker run --rm -it -v $(pwd)/smart-caches-core:/src maven:latest /bin/sh

# cd /src

# mvn install -Dgpg.skip=true -Dmaven.test.skip=true

# ls /root/.m2/repository/io/telicent/smart-caches/parent/0.22.1-SNAPSHOT
maven-metadata-local.xml  parent-0.22.1-SNAPSHOT-cyclonedx.json  parent-0.22.1-SNAPSHOT-cyclonedx.xml  parent-0.22.1-SNAPSHOT.pom  _remote.repositories

# <install syft>

# SYFT_JAVA_USE_MAVEN_LOCAL_REPOSITORY=true  syft file:jaxrs-base-server/pom.xml
 ✔ Indexed file system                                  /src/jaxrs-base-server
 ✔ Cataloged contents              52adcddfdc0452dbe1fc094084d91218ae001c1ac23  
   ├── ✔ Packages                        [20 packages]  
   └── ✔ Executables                     [0 executables]  
NAME                               VERSION          TYPE           
commons-lang3                      3.17.0           java-archive    
configurator                       0.22.1-SNAPSHOT  java-archive    
jackson-annotations                2.17.2           java-archive    
jakarta.inject-api                 2.0.1            java-archive    
jakarta.servlet-api                6.1.0            java-archive    
jersey-bean-validation             3.1.8            java-archive    
jersey-client                      3.1.8            java-archive    
jersey-container-grizzly2-servlet  3.1.8            java-archive    
jersey-hk2                         3.1.8            java-archive    
jersey-media-json-jackson          3.1.8            java-archive    
jul-to-slf4j                       2.0.16           java-archive    
jwt-auth-common                    0.22.1-SNAPSHOT  java-archive    
jwt-servlet-auth-core              0.16.0           java-archive    
jwt-servlet-auth-jaxrs3            0.16.0           java-archive    
logback-classic                    1.5.7            java-archive    
mockito-core                       5.13.0           java-archive    
observability-core                 0.22.1-SNAPSHOT  java-archive    
rdf-abac-core                      0.71.7           java-archive    
slf4j-api                          2.0.16           java-archive    
testng                             7.10.2           java-archive
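
Before running syft with SYFT_JAVA_USE_MAVEN_LOCAL_REPOSITORY=true, it can be worth confirming that the parent POM really landed in the local repository. This small Python sketch builds the expected path from the standard Maven local repository layout (groupId segments, then artifactId, then version). The helper name `local_pom_path` is hypothetical, and the io.telicent.smart-caches groupId is inferred from the ls path above.

```python
from pathlib import Path


def local_pom_path(group_id, artifact_id, version, repo=None):
    """Return where Maven's local repository stores a given POM.

    Layout: <repo>/<groupId with dots as dirs>/<artifactId>/<version>/
            <artifactId>-<version>.pom
    """
    repo = Path(repo) if repo else Path.home() / ".m2" / "repository"
    return repo.joinpath(
        *group_id.split("."),
        artifact_id,
        version,
        f"{artifact_id}-{version}.pom",
    )


if __name__ == "__main__":
    # Coordinates taken from the directory listing above.
    pom = local_pom_path("io.telicent.smart-caches", "parent", "0.22.1-SNAPSHOT")
    print(pom, "exists" if pom.exists() else "is missing; run mvn install first")
```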

willmurphyscode commented 3 weeks ago

Hi @rvesse - are you still facing this issue, or did the steps from @kzantow (previous comment to this one) help?

rvesse commented 3 weeks ago

Sorry, no, I haven't looked at this again; it got pushed down my queue by other stuff.

As it turned out, we didn't need to use syft at all for our use case, as we're already generating CycloneDX BOMs directly from Maven and could just pass those into grype.