anchore / syft

CLI tool and library for generating a Software Bill of Materials from container images and filesystems
Apache License 2.0

empty name #3194

Closed idefixcert closed 2 weeks ago

idefixcert commented 1 month ago

What happened: Some of the components Syft reports on my system have an empty name, for example:

   {
      "bom-ref": "5c2ce977a3f2f724",
      "type": "library",
      "name": "",
      "version": "1.8",
      "licenses": [
        {
          "license": {
            "name": "GPL"
          }
        }
      ],
      "purl": "pkg:generic/@1.8",
      "properties": [
        {
          "name": "syft:package:foundBy",
          "value": "linux-kernel-cataloger"
        },

I looked into the code and saw that there is an IsValid function for packages (https://github.com/anchore/syft/blob/1aaa6440073db6b90673e4303c6ef5d359052f7e/syft/pkg/package.go#L83-L85), but not all of the catalogers respect it.

What you expected to happen:

I would expect that components (packages) that are not valid would not get exported.
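
To illustrate the expectation above, here is a minimal, self-contained sketch of dropping invalid packages before they are exported. The `Package` type, `isValid`, and `filterValid` are hypothetical stand-ins, not Syft's actual types or API. The validity rule (non-nil with a non-empty name) is an assumption modeled on the `IsValid` function linked above, and the sample packages are made up.

```go
package main

import "fmt"

// Package is a hypothetical stand-in for a cataloged package, reduced to the
// fields needed for this illustration. It is NOT syft's pkg.Package.
type Package struct {
	Name    string
	Version string
	FoundBy string
}

// isValid treats a package as valid only if it is non-nil and has a
// non-empty name (assumption for this sketch).
func isValid(p *Package) bool {
	return p != nil && p.Name != ""
}

// filterValid returns only the packages that pass isValid, preserving order.
func filterValid(pkgs []Package) []Package {
	valid := make([]Package, 0, len(pkgs))
	for i := range pkgs {
		if isValid(&pkgs[i]) {
			valid = append(valid, pkgs[i])
		}
	}
	return valid
}

func main() {
	// Illustrative data only: one nameless package, one named package.
	pkgs := []Package{
		{Name: "", Version: "1.8", FoundBy: "linux-kernel-cataloger"},
		{Name: "rake", Version: "13.0.6", FoundBy: "ruby-gemspec-cataloger"},
	}
	for _, p := range filterValid(pkgs) {
		fmt.Printf("%s@%s (%s)\n", p.Name, p.Version, p.FoundBy)
	}
}
```

A single filter like this, applied where cataloger results are collected, would avoid repeating the check in every parser, though whether that is the right place for it in Syft is a design question for the maintainers.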

Steps to reproduce the issue:

I ran Syft against a local filesystem.

Anything else we need to know?:

No.

Environment:

In my case, the following patch helped:

Index: syft/pkg/cataloger/ruby/parse_gemspec.go
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/syft/pkg/cataloger/ruby/parse_gemspec.go b/syft/pkg/cataloger/ruby/parse_gemspec.go
--- a/syft/pkg/cataloger/ruby/parse_gemspec.go  (revision 7c96a10cbea82e94c843112c8394abac7672b0dc)
+++ b/syft/pkg/cataloger/ruby/parse_gemspec.go  (date 1725491039246)
@@ -102,13 +102,13 @@
            return nil, nil, fmt.Errorf("unable to decode gem metadata: %w", err)
        }

-       pkgs = append(
-           pkgs,
-           newGemspecPackage(
-               metadata,
-               reader.Location,
-           ),
+       p := newGemspecPackage(
+           metadata,
+           reader.Location,
        )
+       if pkg.IsValid(&p) {
+           pkgs = append(pkgs, p)
+       }
    }

    return pkgs, nil, nil
Index: syft/pkg/cataloger/kernel/parse_linux_kernel_module_file.go
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/syft/pkg/cataloger/kernel/parse_linux_kernel_module_file.go b/syft/pkg/cataloger/kernel/parse_linux_kernel_module_file.go
--- a/syft/pkg/cataloger/kernel/parse_linux_kernel_module_file.go   (revision 7c96a10cbea82e94c843112c8394abac7672b0dc)
+++ b/syft/pkg/cataloger/kernel/parse_linux_kernel_module_file.go   (date 1725490779123)
@@ -30,12 +30,14 @@

    metadata.Path = reader.Location.RealPath

-   return []pkg.Package{
-       newLinuxKernelModulePackage(
-           *metadata,
-           reader.Location,
-       ),
-   }, nil, nil
+   p := newLinuxKernelModulePackage(
+       *metadata,
+       reader.Location,
+   )
+   if pkg.IsValid(&p) {
+       return []pkg.Package{p}, nil, nil
+   }
+   return []pkg.Package{}, nil, nil
 }

 func parseLinuxKernelModuleMetadata(r unionreader.UnionReader) (p *pkg.LinuxKernelModule, err error) {
Index: syft/pkg/cataloger/kernel/parse_linux_kernel_file.go
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/syft/pkg/cataloger/kernel/parse_linux_kernel_file.go b/syft/pkg/cataloger/kernel/parse_linux_kernel_file.go
--- a/syft/pkg/cataloger/kernel/parse_linux_kernel_file.go  (revision 7c96a10cbea82e94c843112c8394abac7672b0dc)
+++ b/syft/pkg/cataloger/kernel/parse_linux_kernel_file.go  (date 1725490728661)
@@ -35,12 +35,14 @@
        return nil, nil, nil
    }

-   return []pkg.Package{
-       newLinuxKernelPackage(
-           metadata,
-           reader.Location,
-       ),
-   }, nil, nil
+   p := newLinuxKernelPackage(
+       metadata,
+       reader.Location,
+   )
+   if pkg.IsValid(&p) {
+       return []pkg.Package{p}, nil, nil
+   }
+   return []pkg.Package{}, nil, nil
 }

 func parseLinuxKernelMetadata(magicType []string) (p pkg.LinuxKernel) {
Index: syft/pkg/cataloger/ruby/parse_gemfile_lock.go
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/syft/pkg/cataloger/ruby/parse_gemfile_lock.go b/syft/pkg/cataloger/ruby/parse_gemfile_lock.go
--- a/syft/pkg/cataloger/ruby/parse_gemfile_lock.go (revision 7c96a10cbea82e94c843112c8394abac7672b0dc)
+++ b/syft/pkg/cataloger/ruby/parse_gemfile_lock.go (date 1725490344297)
@@ -42,13 +42,14 @@
            if len(candidate) != 2 {
                continue
            }
-           pkgs = append(pkgs,
-               newGemfileLockPackage(
-                   candidate[0],
-                   strings.Trim(candidate[1], "()"),
-                   reader.Location.WithAnnotation(pkg.EvidenceAnnotationKey, pkg.PrimaryEvidenceAnnotation),
-               ),
+           p := newGemfileLockPackage(
+               candidate[0],
+               strings.Trim(candidate[1], "()"),
+               reader.Location.WithAnnotation(pkg.EvidenceAnnotationKey, pkg.PrimaryEvidenceAnnotation),
            )
+           if pkg.IsValid(&p) {
+               pkgs = append(pkgs, p)
+           }
        }
    }
    if err := scanner.Err(); err != nil {

idefixcert commented 1 month ago

I opened a pull request for it: https://github.com/anchore/syft/pull/3199

willmurphyscode commented 1 month ago

@idefixcert thanks for the issue and the PR!

We still have a couple of questions before we can understand the issue and review the PR:

  1. Is there a publicly available artifact that exhibits this problem? We'd like to understand how Syft makes a package that has no name - it could be that the bug is further upstream, and we need to improve the code where Syft tries to detect the name, rather than drop the malformed package before it's returned by the cataloger.
  2. Are you running Syft with default config?

The code I think might need to be fixed is https://github.com/anchore/syft/blob/fcd5ec951de6b3fc1f1aa2a36968356d2eb22170/syft/pkg/cataloger/kernel/parse_linux_kernel_module_file.go#L124-L125

Are you able to see what's going on there? Is it possible the kernel module specifies its name in a different field or something?
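
One way to check which fields a given module actually carries is to dump its metadata directly. The sketch below is a standalone diagnostic, not Syft code: it assumes an uncompressed `.ko` file and relies on the standard Linux layout in which module metadata is stored as NUL-separated `key=value` entries in the ELF `.modinfo` section. The `parseModinfo` and `dump` helpers are hypothetical names introduced for this example.

```go
package main

import (
	"bytes"
	"debug/elf"
	"fmt"
	"os"
	"sort"
	"strings"
)

// parseModinfo splits the raw bytes of a .modinfo section into key/value
// pairs. Entries are NUL-separated "key=value" strings; a key may repeat
// (for example "alias"), so values are collected in a slice.
func parseModinfo(data []byte) map[string][]string {
	fields := make(map[string][]string)
	for _, entry := range bytes.Split(data, []byte{0}) {
		key, value, ok := strings.Cut(string(entry), "=")
		if !ok || key == "" {
			continue // padding or malformed entry
		}
		fields[key] = append(fields[key], value)
	}
	return fields
}

// dump opens the ELF file at path and prints every .modinfo field it finds.
func dump(path string) error {
	f, err := elf.Open(path)
	if err != nil {
		return fmt.Errorf("open %s: %w", path, err)
	}
	defer f.Close()

	sec := f.Section(".modinfo")
	if sec == nil {
		return fmt.Errorf("%s has no .modinfo section", path)
	}
	data, err := sec.Data()
	if err != nil {
		return fmt.Errorf("read .modinfo: %w", err)
	}

	fields := parseModinfo(data)
	if len(fields["name"]) == 0 {
		fmt.Println("note: no \"name\" field present")
	}
	keys := make([]string, 0, len(fields))
	for k := range fields {
		keys = append(keys, k)
	}
	sort.Strings(keys) // stable output across runs
	for _, k := range keys {
		fmt.Printf("%s = %q\n", k, fields[k])
	}
	return nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: modinfo-dump <path-to-uncompressed-.ko>")
		return
	}
	if err := dump(os.Args[1]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Running this against the module behind one of the nameless components should show whether a `name` entry exists at all or whether only other keys are present.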

rhartman93 commented 1 month ago

I'm not sure if I'm in the exact same boat, but I was inspecting an SBOM that Syft generated for an image I built, and I see several instances of this for rubygems.

        {
            "bom-ref": "4dabbdca5e182531",
            "type": "library",
            "name": "",
            "purl": "pkg:gem/",
            "properties": [
                {
                    "name": "syft:package:foundBy",
                    "value": "ruby-gemspec-cataloger"
                },
                {
                    "name": "syft:package:language",
                    "value": "ruby"
                },
                {
                    "name": "syft:package:type",
                    "value": "gem"
                },
                {
                    "name": "syft:package:metadataType",
                    "value": "ruby-gemspec"
                },
                {
                    "name": "syft:location:0:path",
                    "value": "/root/.cache/gem/specs/index.rubygems.org%443/quick/Marshal.4.8/chef-utils-18.5.0.gemspec"
                }
            ]
        },
        {
            "bom-ref": "b8e9734ad545ac63",
            "type": "library",
            "name": "",
            "purl": "pkg:gem/",
            "properties": [
                {
                    "name": "syft:package:foundBy",
                    "value": "ruby-gemspec-cataloger"
                },
                {
                    "name": "syft:package:language",
                    "value": "ruby"
                },
                {
                    "name": "syft:package:type",
                    "value": "gem"
                },
                {
                    "name": "syft:package:metadataType",
                    "value": "ruby-gemspec"
                },
                {
                    "name": "syft:location:0:path",
                    "value": "/root/.cache/gem/specs/index.rubygems.org%443/quick/Marshal.4.8/concurrent-ruby-1.3.4.gemspec"
                }
            ]
        },
        {
            "bom-ref": "678cc9015e228b05",
            "type": "library",
            "name": "",
            "purl": "pkg:gem/",
            "properties": [
                {
                    "name": "syft:package:foundBy",
                    "value": "ruby-gemspec-cataloger"
...

I can push the image somewhere public if it would be helpful to inspect, and/or share the full SBOM. I notice that in my case each gem has the same (presumably incomplete) purl, so I'm not 100% sure this is the same issue as the one that opened this thread.
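
For anyone who wants to check their own SBOM for this, here is a small sketch that lists components with an empty name in a CycloneDX JSON file. It only decodes the fields visible in the excerpts in this thread (`bom-ref`, `name`, `purl`, `properties`) and only looks at the top-level `components` array; the type and function names are hypothetical and chosen for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// property and component model only the CycloneDX fields used here.
type property struct {
	Name  string `json:"name"`
	Value string `json:"value"`
}

type component struct {
	BomRef     string     `json:"bom-ref"`
	Name       string     `json:"name"`
	Purl       string     `json:"purl"`
	Properties []property `json:"properties"`
}

type bom struct {
	Components []component `json:"components"`
}

// prop returns the value of the first property with the given name, or "".
func (c component) prop(name string) string {
	for _, p := range c.Properties {
		if p.Name == name {
			return p.Value
		}
	}
	return ""
}

// findUnnamed returns all top-level components whose name is empty.
func findUnnamed(data []byte) ([]component, error) {
	var doc bom
	if err := json.Unmarshal(data, &doc); err != nil {
		return nil, fmt.Errorf("decode SBOM: %w", err)
	}
	var unnamed []component
	for _, c := range doc.Components {
		if c.Name == "" {
			unnamed = append(unnamed, c)
		}
	}
	return unnamed, nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: unnamed <cyclonedx.json>")
		return
	}
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	unnamed, err := findUnnamed(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, c := range unnamed {
		fmt.Printf("%s purl=%q foundBy=%q path=%q\n",
			c.BomRef, c.Purl,
			c.prop("syft:package:foundBy"),
			c.prop("syft:location:0:path"))
	}
}
```

Grouping the output by the `foundBy` value makes it easier to tell whether the nameless entries come from one cataloger or several.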

willmurphyscode commented 4 weeks ago

This might be addressed by https://github.com/anchore/syft/pull/3257 when that is released.

willmurphyscode commented 2 weeks ago

We believe this was fixed by https://github.com/anchore/syft/pull/3257, released in Syft 1.14.0. If we're wrong, please let us know!