anchore / syft

CLI tool and library for generating a Software Bill of Materials from container images and filesystems
Apache License 2.0

Detect linux distro when not scanning root #3253

Open chovanecadam opened 3 weeks ago

chovanecadam commented 3 weeks ago

What would you like to be added:

Quoting @ariel-miculas from #3145:

I would love if I could do:

syft scan -o syft-json dir:/etc --file-scan-root=/

and get the distro identification because the files in /etc/ are prefixed with /etc/, so syft would find /etc/os-release (assuming /etc/os-release is a regular file and not a symlink).

Why is this needed:

Make syft recognize Linux distribution when scanning a directory. Currently this is not possible (see https://github.com/anchore/syft/issues/3145#issuecomment-2307611963).

wagoodman commented 3 weeks ago

On the surface this seems like a duplicate of #3213, since this is ultimately about being able to tailor the search context within some larger reference context.

However...

Make syft recognize Linux distribution when scanning a directory

is a more specific ask, and taking a look at the example CLI, it doesn't really square up with #3213:

syft scan -o syft-json dir:/etc --file-scan-root=/

Here we're already scanning /etc, so we should be able to find the os-release file. However, I was surprised to see this didn't work:

$ syft /etc -o json | jq '.distro'
 ✔ Indexed file system      
 ✔ Cataloged contents     
   ├── ✔ Packages                        [0 packages]
   └── ✔ Executables                     [0 executables]

{}

$ syft / -o json | jq '.distro'
 ✔ Indexed file system  
 ✔ Cataloged contents 
   ├── ✔ Packages                        [418 packages]
   ├── ✔ File digests                    [6,920 files]
   ├── ✔ File metadata                   [6,920 locations]
   └── ✔ Executables                     [570 executables]

{
  "prettyName": "Fedora Linux 40 (Container Image)",
  "name": "Fedora Linux",
  "id": "fedora",
  "version": "40 (Container Image)",
  "versionID": "40",
  "variant": "Container Image",
  "variantID": "container",
  "homeURL": "https://fedoraproject.org/",
  "supportURL": "https://ask.fedoraproject.org/",
  "bugReportURL": "https://bugzilla.redhat.com/",
  "cpeName": "cpe:/o:fedoraproject:fedora:40",
  "supportEnd": "2025-05-13"
}

This is caused by the search paths defined here: https://github.com/anchore/syft/blob/963ea594c8ae4e294d07148e9f17d1149fe43dfb/syft/linux/identify_release.go#L28-L54

We should probably change these references to **/os-release (and the same for other patterns) so that we can match even when given /etc to scan.
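To illustrate why a fixed path misses while a glob would hit, here is a minimal standalone sketch. It is not syft's resolver code: `relativeToRoot`, `findExact`, and `findGlob` are hypothetical helpers, the file list is made up, and it assumes that a directory scan exposes paths relative to the scan root (so scanning /etc sees /os-release rather than /etc/os-release). The glob helper only understands patterns of the form `**/<suffix>`.

```go
package main

import (
	"fmt"
	"strings"
)

// relativeToRoot returns the paths under root, rewritten relative to it.
// Assumption: this mimics how a directory scan presents files to matchers.
func relativeToRoot(root string, absPaths []string) []string {
	prefix := strings.TrimSuffix(root, "/") // "/" becomes "", "/etc" stays "/etc"
	var out []string
	for _, p := range absPaths {
		if prefix == "" || strings.HasPrefix(p, prefix+"/") {
			out = append(out, strings.TrimPrefix(p, prefix))
		}
	}
	return out
}

// findExact reports whether want appears verbatim among paths.
func findExact(paths []string, want string) bool {
	for _, p := range paths {
		if p == want {
			return true
		}
	}
	return false
}

// findGlob handles only "**/<suffix>" patterns: it reports whether any path
// ends with "/<suffix>". A real implementation would use a full glob library.
func findGlob(paths []string, pattern string) bool {
	suffix := "/" + strings.TrimPrefix(pattern, "**/")
	for _, p := range paths {
		if strings.HasSuffix(p, suffix) {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical file system contents, for illustration only.
	files := []string{"/etc/os-release", "/etc/hostname", "/usr/bin/env"}

	for _, root := range []string{"/", "/etc"} {
		seen := relativeToRoot(root, files)
		fmt.Printf("root=%s exact=%v glob=%v\n",
			root,
			findExact(seen, "/etc/os-release"),
			findGlob(seen, "**/os-release"))
	}
}
```

Under these assumptions, the exact lookup succeeds only when the root is / and fails for /etc, while the glob matches in both cases. A broader glob can also match more than one file, so an actual fix would likely need some preference order among candidates.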

If you're looking to scan /somewhere/else and still reference /etc/ to get the distro info, that's what #3213 is about.

I'm going to narrowly interpret this issue based on the example, implying we should fix these env search paths to globs and some surrounding cleanup. Shout out if this interpretation + #3213 still does not cover your use cases.