anchore / syft

CLI tool and library for generating a Software Bill of Materials from container images and filesystems
Apache License 2.0

When generating SBOM for a directory, no download location values can be found #2085

Open Auston-Zhang opened 1 year ago

Auston-Zhang commented 1 year ago

What happened:

When generating an SBOM for a directory, all download location values are "NOASSERTION", even when the ecosystem is JavaScript/npm.

What you expected to happen:

We should see download location values (URL) in the generated SBOM file.

Take the package @pkgjs/parseargs as an example: it has a repository URL, so in the SBOM file we should see git@github.com:pkgjs/parseargs.git as the download location.

"repository": { "type": "git", "url": "git@github.com:pkgjs/parseargs.git" }

Steps to reproduce the issue:

  1. clone a repo https://github.com/airbnb/javascript
  2. run 'npm install' in the folder of the cloned repo
  3. run syft, in my case the command is 'syft C:\test-syft\javascript -o spdx-json > test-after-npm-install.json'
  4. open the generated SBOM file (attached) with an editor/IDE, search ' "downloadLocation": " ', see the count
  5. search ' "downloadLocation": "NOASSERTION" '; the count is the same as the previous count, which means all the download location values are "NOASSERTION"
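
The manual counting in steps 4 and 5 can also be scripted. Below is a minimal sketch in Go, assuming the SBOM is SPDX JSON with a top-level "packages" array whose entries carry a "downloadLocation" string. The names spdxDocument and countDownloadLocations are made up for this example and are not part of Syft; the sample document is inlined so the program runs on its own, and in practice the bytes would come from the generated file.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// spdxDocument models only the fields this check needs from an SPDX JSON file.
// (Hypothetical helper type for illustration, not a Syft type.)
type spdxDocument struct {
	Packages []struct {
		Name             string `json:"name"`
		DownloadLocation string `json:"downloadLocation"`
	} `json:"packages"`
}

// countDownloadLocations returns the number of packages in the document and
// how many of them have downloadLocation set to "NOASSERTION".
func countDownloadLocations(data []byte) (total, noAssertion int, err error) {
	var doc spdxDocument
	if err = json.Unmarshal(data, &doc); err != nil {
		return 0, 0, err
	}
	for _, p := range doc.Packages {
		total++
		if p.DownloadLocation == "NOASSERTION" {
			noAssertion++
		}
	}
	return total, noAssertion, nil
}

func main() {
	// Inline sample standing in for the generated SBOM file.
	sample := []byte(`{"packages":[
		{"name":"a","downloadLocation":"NOASSERTION"},
		{"name":"b","downloadLocation":"https://example.com/b-1.0.0.tgz"}
	]}`)

	total, noAssertion, err := countDownloadLocations(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d of %d packages have downloadLocation NOASSERTION\n", noAssertion, total)
}
```

If the two counts are equal for the real file, every package in the SBOM has "NOASSERTION" as its download location, which is what the steps above show.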

Anything else we need to know?:

I'm not sure whether to call this a bug or expected behavior. If it is expected behavior, could you please let me know? Thanks!

GitHub does not support uploading JSON files, so I'm sharing it via a Google Drive link: https://drive.google.com/file/d/1kUxQFMoihrpOXvwxt5di6Mv7BXnWSLcN/view?usp=sharing

Environment:

Auston-Zhang commented 1 year ago

To give more context: after checking the code, I think Syft works as follows. (This applies only to the JavaScript ecosystem; I haven't looked into other ecosystems yet, and they would presumably need a lot of work too.)

  1. if scanning a directory, the cataloger will check package-lock.json
  2. if scanning an image, the cataloger will check package.json

I'm not sure whether this is a quick fix, but see https://github.com/anchore/syft/blob/007b034ee38063fd5b41c82741e7561448dc817d/syft/formats/common/spdxhelpers/download_location.go#L17

In that source file, I tried adding the snippet below,

case pkg.NpmPackageLockJSONMetadata:
  return NoneIfEmpty(metadata.Resolved)

so that the switch looks like this:

if hasMetadata(p) {
    switch metadata := p.Metadata.(type) {
    case pkg.ApkMetadata:
        return NoneIfEmpty(metadata.URL)
    case pkg.NpmPackageJSONMetadata:
        return NoneIfEmpty(metadata.URL)
    // new code added
    case pkg.NpmPackageLockJSONMetadata:
        return NoneIfEmpty(metadata.Resolved)
    }
}

With this change, the SBOM generated for a directory appears to have downloadLocation values.
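
To make the proposed change easier to try in isolation, here is a self-contained sketch in Go. Everything in it is a stand-in: the Package and metadata structs only mimic the field names used in the snippet (URL, Resolved) and are not Syft's real pkg types. The "NONE" result for an empty value and the "NOASSERTION" fallback are assumptions about how NoneIfEmpty and the surrounding function behave, not something confirmed in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// Stand-in metadata types; the real ones live in Syft's pkg package.
type ApkMetadata struct{ URL string }
type NpmPackageJSONMetadata struct{ URL string }
type NpmPackageLockJSONMetadata struct{ Resolved string }

// Package is a stand-in that keeps only the Metadata field.
type Package struct{ Metadata interface{} }

// noneIfEmpty is assumed to map an empty value to the SPDX keyword "NONE".
func noneIfEmpty(value string) string {
	if strings.TrimSpace(value) == "" {
		return "NONE"
	}
	return value
}

// hasMetadata is assumed to be a simple nil check.
func hasMetadata(p Package) bool { return p.Metadata != nil }

// downloadLocation mirrors the switch from the comment above, including the
// proposed case for package-lock.json metadata.
func downloadLocation(p Package) string {
	if hasMetadata(p) {
		switch metadata := p.Metadata.(type) {
		case ApkMetadata:
			return noneIfEmpty(metadata.URL)
		case NpmPackageJSONMetadata:
			return noneIfEmpty(metadata.URL)
		case NpmPackageLockJSONMetadata: // proposed addition
			return noneIfEmpty(metadata.Resolved)
		}
	}
	// Assumed fallback when no known metadata type matches.
	return "NOASSERTION"
}

func main() {
	// Illustrative "resolved" value; a real one comes from package-lock.json.
	locked := Package{Metadata: NpmPackageLockJSONMetadata{
		Resolved: "https://example.com/parseargs-1.0.0.tgz",
	}}
	fmt.Println(downloadLocation(locked))
	fmt.Println(downloadLocation(Package{}))
}
```

Without the added case, a package carrying NpmPackageLockJSONMetadata falls through the switch to the fallback, which would match the all-"NOASSERTION" result seen when scanning a directory.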