anchore / syft

CLI tool and library for generating a Software Bill of Materials from container images and filesystems
Apache License 2.0

Recognition of files in a folder works inconsistently between Linux distributions. #2808

Open jhojczak opened 3 months ago

jhojczak commented 3 months ago

What happened: Syft does not recognize binary files on archlinux that are recognized on rockylinux even though the contents of the folder are identical.

I have prepared a script that reproduces this behavior.

The script uses 'incus' to start two VMs with different Linux distributions (rockylinux and archlinux) and runs syft from a container inside each VM to scan the folder. The folder contains the unpacked docker-ce rpm package. I decided to unpack the rpm before scanning because the purl/cpe that syft generates from the packed package does not allow finding the CVEs assigned to docker, which in most databases are assigned to the moby project, to the github/docker/docker repository, or to the purl pkg:rpm/docker.

What you expected to happen: Syft should produce the same report from folders containing the same files on both Linux distributions.

Steps to reproduce the issue:

#!/bin/bash
set -x
VM_ARCH=arch-tmp
VM_ROCKY=rocky-tmp
incus profile create vm-profile
cat <<EOF >vm-profile.yaml
config:
  security.secureboot: "false"
description: ""
devices:
  agent:
    source: agent:config
    type: disk
  eth0:
    nictype: bridged
    parent: incusbr0
    type: nic
  root:
    path: /
    pool: default
    type: disk
EOF
incus profile edit vm-profile <vm-profile.yaml
incus launch images:archlinux ${VM_ARCH} --vm -p vm-profile
incus launch images:rockylinux/8/cloud ${VM_ROCKY} --vm -p vm-profile
sleep 15

incus exec ${VM_ARCH} -- pacman --noconfirm -Sy docker tree
incus exec ${VM_ARCH} -- systemctl start docker

incus exec ${VM_ROCKY} -- dnf install -y 'dnf-command(config-manager)' curl tree
incus exec ${VM_ROCKY} -- dnf config-manager --add-repo 'https://download.docker.com/linux/centos/docker-ce.repo'
incus exec ${VM_ROCKY} -- dnf install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin
incus exec ${VM_ROCKY} -- systemctl start docker
incus exec ${VM_ROCKY} -- curl -Lo /tmp/docker-ce-23.0.2-1.el8.x86_64.rpm https://download.docker.com/linux/centos/8/x86_64/stable/Packages/docker-ce-23.0.2-1.el8.x86_64.rpm
incus exec ${VM_ROCKY} -- mkdir /tmp/rpm
incus exec --cwd /tmp/rpm ${VM_ROCKY} -- bash -c "rpm2cpio /tmp/docker-ce-23.0.2-1.el8.x86_64.rpm | cpio -idmv"
incus file pull --recursive ${VM_ROCKY}/tmp/rpm ./
incus file push --recursive ./rpm ${VM_ARCH}/tmp
echo "=== Rockylinux ==="
incus exec ${VM_ROCKY} -- tree /tmp/rpm
incus exec ${VM_ROCKY} -- docker run --rm -v /tmp/rpm:/target anchore/syft:v1.2.0 scan dir:/target
echo "=== Archlinux ==="
incus exec ${VM_ARCH} -- tree /tmp/rpm
incus exec ${VM_ARCH} -- docker run --rm -v /tmp/rpm:/target anchore/syft:v1.2.0 scan dir:/target

Anything else we need to know?: To run the script, you must have 'incus' or lxd installed with the ability to create virtual machines. In the case of lxd, replace the 'incus' command with lxc in the script.

Environment:

wagoodman commented 3 months ago

I wasn't able to reproduce what you are seeing. Specifically, doing the equivalent of your script yields the same SBOM for me:

❯ tree
.
├── arch-mount
│   ├── md5sum
│   └── sbom.json
└── rocky-mount
    ├── md5sum
    └── sbom.json

3 directories, 4 files

❯ diff arch-mount/md5sum rocky-mount/md5sum

❯ diff arch-mount/sbom.json rocky-mount/sbom.json

I've attached both (identical) SBOMs to this comment below.

arch-sbom.json rocky-sbom.json
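The md5sum files compared above could be produced in several ways. The following is a minimal sketch, assuming GNU coreutils and findutils are available; the helper names `dir_manifest` and `same_contents` are hypothetical and are not part of syft:

```shell
# Hypothetical helper (not part of syft): print a checksum line for every
# regular file under the directory given as $1. Paths are relative to that
# directory and the output is sorted by path, so manifests created on
# different hosts can be compared directly.
dir_manifest() {
    ( cd "$1" && find . -type f -exec md5sum {} + | LC_ALL=C sort -k 2 )
}

# Hypothetical helper: succeed only when both directories contain the same
# relative file paths with byte-identical contents.
same_contents() {
    a=$(dir_manifest "$1") || return 2
    b=$(dir_manifest "$2") || return 2
    [ "$a" = "$b" ]
}
```

For example, running `dir_manifest /tmp/rpm > md5sum` on each machine and diffing the two resulting files shows whether the scanned folders really hold the same bytes.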

I didn't use your script exactly, since the essential steps you appeared to be getting across were the following (using docker instead):

# from rocky
curl -Lo /tmp/docker-ce-23.0.2-1.el8.x86_64.rpm https://download.docker.com/linux/centos/8/x86_64/stable/Packages/docker-ce-23.0.2-1.el8.x86_64.rpm
mkdir /tmp/rpm
# extract inside /tmp/rpm so that cpio unpacks into that directory
cd /tmp/rpm
rpm2cpio /tmp/docker-ce-23.0.2-1.el8.x86_64.rpm | cpio -idmv
# ... now there is a populated /tmp/rpm dir
syft dir:/tmp/rpm -o json > /volumemount/rocky.sbom.json

# from the host
# copy the unpacked dir out of the rocky container and into the alpine container
docker cp rockyctrid:/tmp/rpm ./rpm
docker cp ./rpm alpinectrid:/tmp

# from alpine
syft dir:/tmp/rpm -o json > /volumemount/alpine.sbom.json

The differences between your script and what I did were:

One of these differences might be a sensitive factor, so I can try and repeat this again and report back.
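Checksums only cover file contents, while metadata such as permission bits can change when files are copied between hosts. One way to rule that out is to compare metadata as well. This is a minimal sketch, assuming GNU `stat` (the `-c` option); the helper name `meta_manifest` is hypothetical and not part of syft:

```shell
# Hypothetical helper (not part of syft): print the octal mode, file type,
# and relative path of every entry under the directory given as $1, sorted
# so that the output from two hosts can be diffed.
# Assumes GNU stat, which provides the -c format option.
meta_manifest() {
    ( cd "$1" && find . -exec stat -c '%a %F %n' {} + | LC_ALL=C sort )
}
```

Running `meta_manifest /tmp/rpm > meta.txt` on each machine and diffing the two files would show whether modes or file types differ even when the checksums match.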