anchore / syft

CLI tool and library for generating a Software Bill of Materials from container images and filesystems
Apache License 2.0

Some Java libraries are not being detected, or being mis-identified #3320

Open dbrugman opened 2 weeks ago

dbrugman commented 2 weeks ago

What happened: When scanning Docker images that contain many Java libraries (*.jar files), I noticed that some libraries were missing from the resulting SBOM, while others were present but listed under the wrong name.

What you expected to happen: I would expect all Java libraries to be detected and included in the SBOM with their correct names.

Steps to reproduce the issue: Create a Docker image using this Dockerfile:

FROM ubuntu:latest

# These will NOT be detected by Syft 1.14.0:
ADD https://repo1.maven.org/maven2/net/datafaker/datafaker/1.9.0/datafaker-1.9.0.jar /java/
ADD https://repo1.maven.org/maven2/javax/inject/javax.inject/1/javax.inject-1.jar /java/

# This one will be detected but with the WRONG NAME:
ADD https://repo1.maven.org/maven2/com/datastax/oss/java-driver-core-shaded/4.17.0/java-driver-core-shaded-4.17.0.jar /java/

# This one WILL be detected correctly by Syft 1.14.0:
ADD https://repo1.maven.org/maven2/com/google/guava/guava/33.3.1-jre/guava-33.3.1-jre.jar /java/

Create an image:

docker build -t testjava .

Create an SBOM and search for the presence of Java libraries:

syft scan testjava | grep java-archive

Only 2 out of the 4 libraries are detected:

core                 4.17.0                       java-archive
guava                33.3.1-jre                   java-archive

Note also that the java-driver-core-shaded library is reported under the wrong name, as just core.
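
To make the gap explicit, here is a minimal sketch that derives the expected artifact name and version from each Maven Central URL in the Dockerfile and compares them with the two rows of syft output quoted above. It assumes the standard repository layout (<group path>/<artifact>/<version>/<artifact>-<version>.jar); the function name expected_coordinates and the variable names are made up for illustration and are not part of Syft.

```python
from urllib.parse import urlparse

# The four JAR URLs from the Dockerfile in this report.
JAR_URLS = [
    "https://repo1.maven.org/maven2/net/datafaker/datafaker/1.9.0/datafaker-1.9.0.jar",
    "https://repo1.maven.org/maven2/javax/inject/javax.inject/1/javax.inject-1.jar",
    "https://repo1.maven.org/maven2/com/datastax/oss/java-driver-core-shaded/4.17.0/java-driver-core-shaded-4.17.0.jar",
    "https://repo1.maven.org/maven2/com/google/guava/guava/33.3.1-jre/guava-33.3.1-jre.jar",
]


def expected_coordinates(url):
    """Return (artifact, version) taken from the directory part of the URL.

    Assumes the usual Maven repository layout:
    <group path>/<artifact>/<version>/<artifact>-<version>.jar
    """
    parts = urlparse(url).path.strip("/").split("/")
    return parts[-3], parts[-2]


# (name, version) pairs copied from the syft output quoted in this report.
reported = {("core", "4.17.0"), ("guava", "33.3.1-jre")}

expected = {expected_coordinates(url) for url in JAR_URLS}
missing = sorted(expected - reported)      # expected but not reported
unexpected = sorted(reported - expected)   # reported under a name we did not expect

print("missing:", missing)
print("unexpected:", unexpected)
```

With the data from this report, three expected entries have no exact match (datafaker, javax.inject, java-driver-core-shaded), and the core row has no matching expected artifact.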

Anything else we need to know?:
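
In case it helps with triage, it may be worth checking which Maven metadata each JAR actually carries. The sketch below lists the coordinates found in any META-INF/maven/<groupId>/<artifactId>/pom.properties entries inside a JAR (a JAR is a zip archive). Whether the presence, absence, or number of these entries explains the results above is an assumption, not something confirmed in this thread. The function name maven_coordinates is hypothetical, and the parser only handles simple key=value lines rather than the full Java properties syntax.

```python
import sys
import zipfile


def maven_coordinates(jar_path):
    """Return one dict per META-INF/maven/.../pom.properties entry in the JAR."""
    found = []
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if not (name.startswith("META-INF/maven/") and name.endswith("/pom.properties")):
                continue
            props = {}
            text = jar.read(name).decode("utf-8", errors="replace")
            for line in text.splitlines():
                line = line.strip()
                # Skip blanks, comments, and anything that is not key=value.
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                props[key.strip()] = value.strip()
            found.append({
                "entry": name,
                "groupId": props.get("groupId"),
                "artifactId": props.get("artifactId"),
                "version": props.get("version"),
            })
    return found


if __name__ == "__main__":
    # Usage: python inspect_jar.py /java/some-library.jar [more.jar ...]
    for path in sys.argv[1:]:
        coords = maven_coordinates(path)
        print(path, "->", coords if coords else "no pom.properties found")
```

An empty result means the JAR has no embedded pom.properties. Several results in one JAR mean that it bundles metadata for more than one artifact.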

Environment:

wagoodman commented 1 week ago

An initial look shows that:

Thanks for reporting!