microsoft / component-detection

Scans your project to determine what components you use
MIT License
439 stars, 91 forks

Component detection does not scan Linux file system if the image is not a docker image. #691

Open romahamu opened 1 year ago

romahamu commented 1 year ago

Hello,

We were trying to create an SBOM for our production image using sbom-tool, but we have hit a blocker.

Our image is not created using Docker; it's a VM image created using Packer.

sbom-tool uses the component-detection tool to get the dependencies, and for Linux the only way seems to be passing a Docker image to the tool.

We tried running the component-detection tool directly on the VM from which we create the image, but it does not capture any Linux packages installed on the image.

The Linux scanner logs "No instructions received to scan docker images." and then returns.

Syft, which the component-detection tool uses for its Linux scanner, does support scanning a file system.

What is the workaround to detect Linux packages from the filesystem? We are currently blocked on this and unable to meet our SBOM requirement.

AB#2088307

melotic commented 1 year ago

Unfortunately, this is by design. Component Detection does not support scanning Linux file systems, only docker containers at the moment.

We can investigate using Syft to also scan the current Linux file system behind a CLI flag, but this isn't currently on our roadmap. It is also a bit tricky: Syft picks up dependencies similar to those Component Detection finds, but does not provide a graph output.

As a workaround, can you just use Syft to generate your SBOM?
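A minimal sketch of that workaround in Python, assuming the `syft` binary is on `PATH` and that its native JSON output lists packages under a top-level `artifacts` array with `name`, `version`, and `type` fields. The helper names (`build_syft_command`, `parse_packages`, `scan_filesystem`) are hypothetical and not part of Syft or Component Detection.

```python
import json
import subprocess


def build_syft_command(root="/"):
    # "dir:" is Syft's directory source scheme; "-o json" selects
    # Syft's native JSON output. The default root is an assumption.
    return ["syft", f"dir:{root}", "-o", "json"]


def parse_packages(syft_json):
    # Assumes packages appear under the top-level "artifacts" array.
    doc = json.loads(syft_json)
    return [
        (a.get("name"), a.get("version"), a.get("type"))
        for a in doc.get("artifacts", [])
    ]


def scan_filesystem(root="/"):
    # Runs Syft against a directory tree and returns (name, version, type)
    # tuples. Raises CalledProcessError if Syft exits with a non-zero code.
    result = subprocess.run(
        build_syft_command(root), capture_output=True, text=True, check=True
    )
    return parse_packages(result.stdout)
```

Calling `scan_filesystem("/")` on the VM would list the packages Syft finds there. This yields Syft's own package list only, so it does not produce a cgmanifest or a dependency graph.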

romahamu commented 1 year ago

Unfortunately, for the same reason, the SBOM generated by Syft does not conform to the schema for Microsoft's SBOM requirements. Also, it seems the component-detection tool parses the output produced by Syft to generate the cgmanifest.

melotic commented 1 year ago

We discussed in our Community Meeting that in the future we might forgo a dependency on Syft and manually scan container images ourselves. The issue with using Syft is that its output is in a different format than the one CD uses internally, and it can be nondeterministic depending on the scan location.
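One part of the format problem can be illustrated with a small Python sketch: deduplicating and sorting package records so that two scans reporting the same packages in a different order serialize identically. This is an assumption about how a consumer might post-process results, not how Component Detection works, and it does not help when a different scan location makes Syft find a different set of packages. The function names and the record shape (`name`, `version`, `type`) are hypothetical.

```python
import json


def normalize_packages(artifacts):
    # Collapse duplicates and impose a stable order on (name, version, type).
    unique = {(a["name"], a["version"], a.get("type", "")) for a in artifacts}
    return sorted(unique)


def canonical_json(artifacts):
    # Serialize the normalized records with fixed key order and separators
    # so that equal package sets produce byte-identical output.
    rows = [
        {"name": n, "version": v, "type": t}
        for n, v, t in normalize_packages(artifacts)
    ]
    return json.dumps(rows, sort_keys=True, separators=(",", ":"))
```

Comparing `canonical_json` output from two scans then shows whether the package sets truly differ or only the ordering did.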