anchore / syft

CLI tool and library for generating a Software Bill of Materials from container images and filesystems
Apache License 2.0
6.21k stars 572 forks source link

Excluded paths are still scanned and cause syft to crash. #3258

Open reure1 opened 1 month ago

reure1 commented 1 month ago

What happened: I excluded paths --exclude ./boot --exclude ./web I had verbose set to 2 When ran you seed WARN unable to access path="/web" lstat /web no such file or directory This is really a permission issue., but it should have never checked.

The bigger issues was the error: unable to get file resolver: unable to create directory resolver: unable to index filesystem path="/boot/efi/EFI/readhat/grub.cfg": lstat /boot/dfi/DFIL permission denied syft exits. This makes the tool useless.

What you expected to happen: The paths excluded would not be scanned.

Steps to reproduce the issue: Use a user that does not have permission to read /boot

Anything else we need to know?:

Environment:

kzantow commented 1 month ago

Hi @reure1 -- the directories you are excluding are at the root of the filesystem but you are using relative paths ./boot and ./web, not absolute paths. If you are scanning an image, you need to use absolute paths. If you are scanning a directory or something else, could you provide some steps to reproduce the issue?

reure1 commented 1 month ago

The tool does not allow absolute paths. I am not using a container, so I'm scanning the whole system, minus some directories. Excluding file paths. You can see in the debug output that every directory I told it to exclude the indexer still indexes, but does not scan. The indexer (resolver) error on the /boot/efi/EFI/redhat/grub.cfg file when it should be excluded. The command I'm running is:

syft scan dir:/ --verbose=2 --exclude=./web --exclude=./boot --exclude=./var/lib/yum --exclude ./dev --exclude./sys -o cyclonedx=sbom.cyclonedx.xml

So what would be better command to scan the system?

kzantow commented 1 month ago

If you are running a directory scan, as you noted, exclude paths need to be relative to the scan root.

I don't have a RHEL instance to test this on at the moment, but running on the latest Fedora, after receiving permission errors such for paths such as /boot, Syft continues to process the remainder of the filesystem and results in an SBOM. And using excludes works as expected for me when specifying them just as you've indicated, e.g.: syft / --exclude ./boot. It does take a little while to run -- it was about 10 minutes in a virtual machine for me.

I should note, I used Syft 1.12.2, maybe give that a try? If that doesn't seem to solve the issue, are there other steps or a publicly available container or ISO that exhibits the problem?

reure1 commented 1 month ago

@kzantow, when you ran your scan did you use -vv or --verbose=2? When you look at the logs you will see directories that you excluded that you don't have permissions to go into. That is where syft is dying. I believe that is part of the indexer/resolver and not the scanner. Again I exclude "--exclude ./boot" the faital is:

unable to get file resolver: unable to create directory resolver: unable to index filesystem path="/boot/efi/EFI/redhat/grub.cfg": lstat /boot/efi/EFI: permission denied"

The fatal error should have never happened because the indexer/resolver should have never gone down that path.
Create a directory at the root level and set the directory where your user does not have permission, Tell syft to exclude the directory, --exclude ./mydir. Turn on verbose -vv or --verbose=2. Run syft and look in the logs and you will see the indexer/resolver trying to access that excluded directory.

reure1 commented 1 month ago

I ran Syft 1.12.2 this morning and the indexer/resolver ignores --exclude also. Syft crashes at the same place. Why it crashes instead of just printing the warning, I don't know. /web is another directory that I excluded and should be skipped.

[002] WARN unable to access path="/web": lstat /web: no such file or directory
kzantow commented 1 month ago

It seems like there are 2 distinct issues here:

  1. For you, Syft is crashing and not continuing to scan when this error occurs
  2. You are seeing paths that you are expected to be ignored

In regards to the first issue: Syft crashing, to me this is the crux of your problems but I am unable to reproduce. I would very much like to ensure this does not happen, but I'll need some way to reproduce instead of speculating on a fix.

In regards to the second issue -- seeing references to /boot. Here are excerpts from the logs when I've run this:

without --exclude

[0000]  WARN unable to access path="/boot/efi": open /boot/efi: permission denied
[0000]  WARN unable to access path="/boot/grub2": open /boot/grub2: permission denied
[0000]  WARN unable to access path="/boot/loader/entries": open /boot/loader/entries: permission denied
[0052]  WARN unable to access path="/var/lib/libvirt/boot": open /var/lib/libvirt/boot: permission denied
[0052]  WARN unable to access path="/boot/grub2/grub.cfg": lstat /boot/grub2/grub.cfg: permission denied
[0058] DEBUG found path duplicate of usr/lib/firmware/nvidia/tu102/gsp/bootloader-535.113.01.bin.xz
[0058] DEBUG found path duplicate of usr/bin/bootctl
[0204] DEBUG file digests cataloger skipping "boot/initramfs-6.8.5-301.fc40.x86_64.img": digests-cataloger unable to observe contents of boot/initramfs-6.8.5-301.fc40.x86_64.img: open /boot/initramfs-6.8.5-301.fc40.x86_64.img: permission denied

with --exclude ./boot

[0088]  WARN unable to access path="/var/lib/libvirt/boot": open /var/lib/libvirt/boot: permission denied
[0088]  WARN unable to access path="/boot/grub2/grub.cfg": lstat /boot/grub2/grub.cfg: permission denied
[0095] DEBUG found path duplicate of usr/lib/firmware/nvidia/tu102/gsp/bootloader-535.113.01.bin.xz
[0095] DEBUG found path duplicate of usr/bin/bootctl

You can see a distinct difference, where syft has excluded many paths as expected, but there still is a reference to /boot/grub2/grub.cfg -- why didn't this get excluded? I'm fairly certain that there are symlinks pointing to files in these directories. Syft is correctly excluding these directories when traversing the filesystem, but when encountering links which point to locations within these excludes, these are not being excluded. I believe Syft is doing the right thing but printing a less-than-useful error message, which loses context of the access path, instead only reporting the resolved absolute path. However, I think this would be a non-issue if Syft was not crashing for you, so I'd really like to get to the bottom of why Syft is crashing before thinking about changing how excludes work.

reure1 commented 1 month ago

@kzantow, thank you for your response. I mostly agree with you. I feel that if syft didn't try to access files it was told to ignore symlink or not, the crash would not happen. I am using a very locked down version of Redhat and I can't share any logs.

kzantow commented 1 month ago

There are probably better ways of finding these, but I was able to identify where the links are on my VM using:

$ for f in $(find / -type l 2>/dev/null); do l="$(readlink $f | grep /boot/grub2)"; if [[ "" != "$l" ]] then echo "$f -> $l"; fi; done
/etc/grub2.cfg -> ../boot/grub2/grub.cfg
/etc/grub2-efi.cfg -> ../boot/grub2/grub.cfg

Update: I tried Rocky and Alma minimal ISO virtual machines, and both of these succeeded without crashing; I don't think we can make a lot more progress without a way to reproduce the issue.

reure1 commented 1 month ago

I will see if we can come up with something. We may just have to use a different SBOM tool.

reure1 commented 1 month ago

@kzantow btw there is a symlink for /etc/grub2.cfg and ../boot/grub2.cfg What are your permissions for /boot and /boot/grub2? On this system: /boot is 555 /boot/grub2 is 700 Both owned by root

kzantow commented 1 month ago

I just did a useradd joe and ran the scan as a non-root user with the default permissions/groups/etc.. The system has the same permissions you've indicated: /boot is 555 (dr-xr-xr-x) and /boot/grub2 is 700 (drwx------).

image

... and the user joe gets a typical permission error accessing these paths:

image
reure1 commented 1 month ago

I started looking at the code. This will give me a chance to learn a little Go. lstat is only on one place. So I will start there.

syft/internal/fileresolver/directory_indexer.go:218
reure1 commented 1 month ago

I had to work on something else for a few days. I'm glad to see someone else had the issue and could easily reproduce it.

reure1 commented 1 month ago

@kzantow, I came up with a quick fix. In syft/internal/fileresolver/directory_indexer.go I changed line 91 from:

return fmt.Errorf("unable to index filesystem path=%q: %w, currentPath, err)

to:

log.WithFields("path", currentPath).Warn("unable to index filesystem path")
continue
willmurphyscode commented 1 month ago

@reure1 that's a good idea! It would be great if we could report this as part of our new "known unknowns" feature which was added by https://github.com/anchore/syft/pull/2998/.

What we would like is for the directory_indexer to be able to use unknowns.Appendf to add to a slice of errors files that it didn't have permissions for, similar to what catalogers do:

https://github.com/anchore/syft/blob/5d165e0230403821efd0608500b7cdd1e66316c5/syft/pkg/cataloger/golang/scan_binary.go#L48-L51

However, there might be additional wiring needed to make this work from inside a resolver (above link is to a cataloger, not a resolver).

If you'd like to pick this up, just let us know! We're happy to help.