Open deitch opened 1 year ago
@deitch do you have a good sample of the data you're trying to add?
The reason I ask this is syft has a way of adding packages to an SBOM being generated from other SBOM files: https://github.com/anchore/syft/blob/main/syft/pkg/cataloger/sbom/cataloger.go
If you include an SBOM within the directory or image in the format of: https://github.com/anchore/syft/blob/17aa8287e6265e5ea65c8946bd790c2cbf444172/syft/pkg/cataloger/sbom/cataloger.go#L14-L30
Then syft will pick it up and add the packages into the generated SBOM.
However, this doesn't cover things like docker information which is what it seemed your question was about.
Apologies for the indirect answer, but if you want to add data to the SBOM generated then this is the best way for now.
Happy to take other suggestions or hear if this meets your use case!
No apologies, this is pretty good.
So are you saying, if I add an SBoM in one of the above formats - which looks like it covers syft, syft-json, spdx, spdx-json, cyclonedx, cyclonedx-json - it should include it?
Two questions:
do you have a good sample of the data you're trying to add?
Just what I listed above, and it isn't in anything I can show you yet:
we enforce all docker builds with --network=none, so that the only way to get things in off the network is via ADD. This, in turn, lets us scan Dockerfiles and capture all ADD commands.
I wrote a simple scanner that can be given one or more Dockerfiles, it looks for ADD commands and spits those out as packages. I think I did it to support spdx, spdx-json and maybe one or two others. These Dockerfiles make up lots of OCI images which are then unpacked and bundled together. syft does a pretty good job scanning the final output, but it sometimes misses things that have been ADDed. It is ~20 Dockerfiles, so I want to:
I would be happy to throw my tool out if syft could replace it. That would mean:
ADD commands
But even if it did, these two ideas enhance Syft's usability:
Hi @deitch, we're looking at some older issues here, and we're wondering which packages Syft is not picking up in the above example? We want to start looking at ways to improve Syft so that users don't have to bring their own data, and it can discover these packages without manual intervention. Sorry for the long delay between replies! If you want to pick this thread up again let us know, and we can discuss. Otherwise let me know if it's OK to close this issue. Thanks!
Hi @tgerla . Yeah, I remember this one. The solution by @spiffcs worked reasonably well.
The example I gave above pretty much covered it. It is one of two cases:
let me know if it's OK to close this issue
I have two questions:
For the latter, here is an example. I want to create an SBoM for an OCI image by scanning it with syft. I also want to include some additional data via an SBoM I already have. I cannot quite "include" my SBoM in the image, as it is composed of a bunch of layers of tgz. Theoretically, I could expand it all out and then add it and scan the dir, but who wants that awful user experience?
Something like:
syft packages oci:myimage/foo:bar --append path/to/sbom.json --append path/to/other/sbom.json
Or this might even work, if slightly less convenient:
syft packages oci:myimage/foo:bar path/to/sbom.json path/to/other/sbom.json
In the first, I am saying, "scan that image, but also append the following sboms". In the second, I am saying, "use your usual scanning goodness, but scan that image, and this directory/file, and that directory/file."
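Until something like the proposed --append exists, the first behavior can be approximated as a post-processing step. Below is a minimal sketch, assuming both the scan output and the extra SBoMs are SPDX JSON documents with a top-level "packages" array. The function name append_packages and the choice to de-duplicate on name plus versionInfo are illustrative assumptions, not anything syft does. It does not merge relationships or resolve SPDXID collisions between documents.

```python
import json
import sys


def append_packages(base, extras):
    """Return a copy of `base` with packages from each extra document appended.

    Packages are de-duplicated on (name, versionInfo); the first one seen wins.
    This is a simplification chosen for the sketch, not an SPDX rule.
    """
    merged = dict(base)  # shallow copy, so `base` itself is not modified
    packages = list(base.get("packages", []))
    seen = {(p.get("name"), p.get("versionInfo")) for p in packages}
    for extra in extras:
        for pkg in extra.get("packages", []):
            key = (pkg.get("name"), pkg.get("versionInfo"))
            if key in seen:
                continue
            seen.add(key)
            packages.append(pkg)
    merged["packages"] = packages
    return merged


def load(path):
    """Read one JSON document from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__" and len(sys.argv) > 1:
    # First argument: SPDX JSON from the scan; the rest: SBoMs to append.
    base_doc = load(sys.argv[1])
    extra_docs = [load(p) for p in sys.argv[2:]]
    json.dump(append_packages(base_doc, extra_docs), sys.stdout, indent=2)
```

Run it with the scan output first and the additional SBoMs after it, redirecting stdout to the merged file. The second behavior in the comment above (scanning extra paths with the normal catalogers) cannot be emulated this way, since it needs syft itself to do the scanning.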
Hey @deitch, thanks again for the details. I think your "--append" idea is a good one, and we understand how it might be useful rather than having to embed separate SBOMs on your images to be scanned. I'll go ahead and move this into the feature backlog.
This is somewhat related to:
What would you like to be added:
How can I tell syft, "add this data to the SBoM you generate"?
Scenario: I have data that I know syft will not find. Perhaps it is obscure, our own inputs, etc. I still want to use syft to scan everything, but I also want the resultant SBoM to include the additional data.
As an example, in one project, we enforce all docker builds with --network=none, so that the only way to get things in off the network is via ADD. This, in turn, lets us scan Dockerfiles and capture all ADD commands.
What would be the right way to augment the complete syft output with this data? I could see one of the following (but open to others):
Why is this needed:
syft never will capture 100% of everything; having a mechanism for telling syft, "add this data" would expand its capabilities.
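To make the scenario above concrete, here is a rough sketch of the kind of Dockerfile scan described: find ADD instructions that pull from a URL and emit them as SPDX-style package entries. This is not the actual tool mentioned in this issue. The function names, the SPDXID scheme, and the decision to only report http/https sources are assumptions for illustration, and the JSON-array form of ADD is not handled.

```python
import json
import re
import sys
from pathlib import Path

# An ADD instruction at the start of a logical line, case-insensitive.
ADD_RE = re.compile(r"^\s*ADD\s+(.*)$", re.IGNORECASE)


def logical_lines(text):
    """Join backslash-continued lines so each instruction is one string."""
    buf = ""
    for raw in text.splitlines():
        line = raw.rstrip()
        if line.endswith("\\"):
            buf += line[:-1] + " "
            continue
        yield buf + line
        buf = ""
    if buf:
        yield buf


def add_sources(dockerfile_text):
    """Return the remote (http/https) sources named by ADD instructions."""
    sources = []
    for line in logical_lines(dockerfile_text):
        match = ADD_RE.match(line)
        if not match:
            continue
        # Drop flags such as --chown=...; the last remaining argument is
        # the destination, everything before it is a source.
        args = [a for a in match.group(1).split() if not a.startswith("--")]
        for src in args[:-1]:
            if src.startswith(("http://", "https://")):
                sources.append(src)
    return sources


def to_spdx_packages(urls):
    """Build minimal SPDX-style package entries, one per URL."""
    packages = []
    for i, url in enumerate(urls):
        packages.append({
            "SPDXID": f"SPDXRef-Package-add-{i}",  # hypothetical ID scheme
            "name": url.rstrip("/").rsplit("/", 1)[-1],
            "downloadLocation": url,
            "filesAnalyzed": False,
        })
    return packages


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: pass one or more Dockerfile paths as arguments.
    found = []
    for path in sys.argv[1:]:
        found.extend(add_sources(Path(path).read_text(encoding="utf-8")))
    json.dump({"packages": to_spdx_packages(found)}, sys.stdout, indent=2)
```

The output is only a "packages" fragment, not a complete SPDX document; a real document would also need the document-level fields (spdxVersion, creationInfo, and so on) before syft or other tools could read it.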