anchore / syft

CLI tool and library for generating a Software Bill of Materials from container images and filesystems
Apache License 2.0

"none" under file selection in configuration doesn't work as expected #2989

Open tomersein opened 4 days ago

tomersein commented 4 days ago

What happened: When I use "none", I still get a "files" entry in the final JSON.

What you expected to happen: When I use "none", the "files" entry should be removed.

Steps to reproduce the issue: Use this config.yaml:

file:
  metadata:
    # select which files should be captured by the file-metadata cataloger and included in the SBOM.
    # Options include:
    #  - "all": capture all files from the search space
    #  - "owned-by-package": capture only files owned by packages
    #  - "none", "": do not capture any files
    # SYFT_FILE_METADATA_SELECTION env var
    selection: "none"

Then scan an image or a directory.

Anything else we need to know?: I did a quick check, and the issue is in this function: func toFile(s sbom.SBOM) []model.File. I think that in the "none" case it should either not enter this function or skip any entry whose fields (metadata, digests, etc.) are all empty.

Environment:

kzantow commented 3 days ago

I can confirm there seems to be something unexpected happening here:

SYFT_FILE_METADATA_SELECTION=none syft alpine:latest -o json

It results in a files section with no metadata or any other information such as digests:

  "files": [
    {
      "id": "a74cadfe8cda7a82",
      "location": {
        "path": "/bin/busybox",
        "layerID": "sha256:02f2bcb26af5ea6d185dcf509dc795746d907ae10c53918b6944ac85447a0c72"
      }
    },
   ...

For what it's worth, I think it might make sense for this flag to prevent metadata from being captured rather than to prevent files from being captured. Perhaps we should also think about introducing a new configuration option for the entire file section that disables all file data collection, e.g.:

file:
  # enable file cataloging
  enabled: true
  - or -
  selection: ...

  metadata:
    # select which files should be captured by the file-metadata cataloger and included in the SBOM. 
    # Options include:
    #  - "all": capture all files from the search space
    #  - "owned-by-package": capture only files owned by packages
    #  - "none", "": do not capture any files (env: SYFT_FILE_METADATA_SELECTION)
    selection: 'owned-by-package'

    # the file digest algorithms to use when cataloging files (options: "md5", "sha1", "sha224", "sha256", "sha384", "sha512") (env: SYFT_FILE_METADATA_DIGESTS)
    digests: 
      - 'sha1'
      - 'sha256'    
   ...
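To make the proposed split concrete, here is a minimal sketch of how the two settings might interact. Everything in it is an assumption for illustration: FileConfig, Plan, and Resolve are hypothetical names, not part of syft, and the top-level switch is modeled as a plain boolean.

```go
package main

import "fmt"

// FileConfig is a hypothetical model of the proposed "file" section.
type FileConfig struct {
	Enabled           bool   // proposed switch for all file data collection
	MetadataSelection string // "all", "owned-by-package", "none", or ""
}

// Plan describes what a scan would collect under a given config.
type Plan struct {
	ListFiles       bool // emit a "files" section at all
	CaptureMetadata bool // attach metadata to the listed files
}

// Resolve maps the config to a collection plan (assumed semantics).
func Resolve(c FileConfig) Plan {
	if !c.Enabled {
		// File cataloging disabled: no "files" section.
		return Plan{}
	}
	switch c.MetadataSelection {
	case "none", "":
		// Files are still listed, but without metadata.
		return Plan{ListFiles: true}
	default:
		return Plan{ListFiles: true, CaptureMetadata: true}
	}
}

func main() {
	fmt.Printf("%+v\n", Resolve(FileConfig{Enabled: true, MetadataSelection: "none"}))
	fmt.Printf("%+v\n", Resolve(FileConfig{Enabled: false, MetadataSelection: "all"}))
}
```

Under these assumed semantics, "none" would keep its current observable behavior (a files list without metadata), while the new switch would be the way to drop the "files" entry altogether.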