vmware-archive / containers-with-sboms

A demo project for building containers with corresponding Software Bill of Materials

Question! #1

Open vsoch opened 2 years ago

vsoch commented 2 years ago

hey @nishakm! I just ran the example (without vagrant) on my cluster and (yay!) tern no longer appears to require sudo, at least for this example. So to summarize the process (there's a rough sketch after the list), it was:

  1. generate or dump container contents into a directory
  2. run tern to generate json outputs
  3. upload the container and sbom to a registry with oras
  4. sign with cosign
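
For concreteness, a rough shell sketch of those four steps (image name, registry, and key path are placeholders, and exact flags vary across tool versions):

    #!/bin/sh
    set -e

    # 1. dump the container filesystem into a directory
    #    (one generic way, via podman; any builder with an exportable rootfs works)
    mkdir -p rootfs
    ctr=$(podman create myimage:latest)
    podman export "$ctr" | tar -x -C rootfs

    # 2. run tern to generate an SPDX JSON SBOM for the image
    tern report -f spdxjson -i myimage:latest -o sbom.json

    # 3. push the SBOM next to the image as an ORAS artifact
    #    (the image itself goes up with your usual push tooling)
    oras push localhost:5000/myimage-sbom:latest sbom.json:application/json

    # 4. sign the pushed artifact with cosign
    cosign sign --key cosign.key localhost:5000/myimage-sbom:latest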

So my questions are:

  1. Is it required to dump contents into a directory? I'm guessing this is a common "format" across container images (e.g., Singularity images can be dumped into a directory too). But of course it's an extra step.
  2. How would this work for registries that don't support oras/sbom?
  3. Is there any discussion about packaging the sbom with the image (or does that not make sense?)

I'm trying to figure out how to fit this into my workflows - e.g., I could make a GitHub workflow that builds the container, dumps it, and pushes to GitHub Packages, and this would work for docker or singularity, but I'm not sure what happens after that.

vsoch commented 2 years ago

Oh oops one more question! So I see:

{"SPDXID": "SPDXRef-DOCUMENT", "spdxVersion": "SPDX-2.2", "creationInfo": {"created": "2021-11-19T19:09:53Z", "creators": "Tool: tern-9751f1b017276232ea60ebb9e8c694187986f99c", "licenseListVersion": "3.8"}, "name": "Tern SPDX JSON SBoM", "dataLicense": "CC0-1.0", "comment": "This document was generated by the Tern Project: https://github.com/tern-tools/tern", "documentNamespace": "https://spdx.org/spdxdocs/tern-report-2021-11-19T19:09:53Z-3d8bd63b-54be-45fb-ad7b-41c5e5536f98", "documentDescribes": [], "packages": [], "relationships": []}

and packages - how are packages populated? E.g., if we have a container with spack (an HPC package manager), how do I get packages added there? If this is something tern handles, should I look and PR there?

nishakm commented 2 years ago

Hey @vsoch! Sorry to get back to you so late 😅

> 1. generate or dump container contents into a directory
> 2. run tern to generate json outputs
> 3. upload the container and sbom to a registry with oras
> 4. sign with cosign
>
> So my questions are:
>
> 1. Is it required to dump contents into a directory? I'm guessing this is a common "format" across container images (e.g., Singularity images can be dumped into a directory too). But of course it's an extra step.

Generally, container builders have some kind of mount point where they dump the filesystem used by the storage driver. I am not sure how Singularity containers are spun up, but if you are using buildah, the filesystem directory should be available.

> 2. How would this work for registries that don't support oras/sbom?

The SBOM can be stored anywhere. You can use crane to push an sbom to a registry too, or you can upload it to GitHub. What's nice about ORAS Artifacts is that you can group multiple sboms together and transport them to different registries.
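
As a rough illustration of that portability (registry names are placeholders, and the exact oras subcommands and flags depend on your CLI version):

    # copy an already-pushed SBOM artifact from one registry to another
    oras cp localhost:5000/myimage-sbom:latest ghcr.io/myorg/myimage-sbom:latest

    # or fetch it back to disk to inspect locally
    oras pull ghcr.io/myorg/myimage-sbom:latest -o .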

> 3. Is there any discussion about packaging the sbom with the image (or does that not make sense?)

You could. Some tools add the SBOM file as a separate layer in the container. I personally don't see this helping with SBOM reuse for specific components, and I would rather encourage folks to share SBOMs than generate them every single time.

> how are packages populated? E.g., if we have a container with spack (an HPC package manager), how do I get packages added there? If this is something tern handles, should I look and PR there?

It would be awesome if you can file an issue and submit a PR at tern-tools/tern :). We have some documentation on how to enable inventorying: https://github.com/tern-tools/tern/blob/main/docs/adding-to-command-library.md

We can use the issue to communicate if you have any questions or trouble.

vsoch commented 2 years ago

Ah, it's totally okay! That was right before the holiday, and actually I should have just dived into sboms and explored on my own before asking so many questions.

> Generally, container builders have some kind of mount point where they dump the filesystem used by the storage driver. I am not sure how Singularity containers are spun up, but if you are using buildah, the filesystem directory should be available.

Oh yes, that totally makes sense! And Singularity would be very similar - the only trick I ever figured out to get around that is to do that entire process in memory, and that obviously only worked for smaller containers. I made a silly library in graduate school to do exactly that and build what I called "container trees": https://singularityhub.github.io/container-tree/examples/index.html. So even though Singularity is a single SIF, it's exactly the same - you have to sploot everything somewhere first. For some reason I was thinking that maybe there was a fiendish trick to avoid that.

> The SBOM can be stored anywhere. You can use crane to push an sbom to a registry too, or you can upload it to GitHub. What's nice about ORAS Artifacts is that you can group multiple sboms together and transport them to different registries.

I totally get this now - it's just a json file, and one that we happen to be able to store in a registry (but it certainly does not need to be!). I'm probably going to wait to see if/how the spack community responds, but I could so easily generate them to go alongside packages at https://spack.github.io/packages. I think I'd want more discussion around the details I chose, and how to increment the versions (and indeed when I contribute here I'll bring in more eyes to look this over).

> It would be awesome if you can file an issue and submit a PR at tern-tools/tern :). We have some documentation on how to enable inventorying: https://github.com/tern-tools/tern/blob/main/docs/adding-to-command-library.md

Ooooh I'd love to! This kind of work is seriously my favorite thing. I'm looking quickly at the command library, and it seems to support adding base concepts (e.g., a package manager in ubuntu) and then snippets (commands to run given a known command). In most containers with spack, spack is not guaranteed to be on the path, so I'd want some logic that looks like this (a shell sketch follows the list):

  1. Look for the spack executable or SPACK_ROOT; exit 0 if not found.
  2. Given found spack, run a custom script that uses spack python to derive metadata for packages (the output could be saved really anywhere).
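
A minimal shell sketch of that logic (derive-metadata.py is a hypothetical name for the custom script):

    #!/bin/sh
    # look for spack on the PATH first, then fall back to SPACK_ROOT
    SPACK_BIN="$(command -v spack || true)"
    if [ -z "$SPACK_BIN" ] && [ -n "$SPACK_ROOT" ] && [ -x "$SPACK_ROOT/bin/spack" ]; then
        SPACK_BIN="$SPACK_ROOT/bin/spack"
    fi

    # no spack here: nothing to inventory
    [ -z "$SPACK_BIN" ] && exit 0

    # run the custom metadata script under spack's bundled python
    "$SPACK_BIN" python derive-metadata.py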

What would be the best way, given the current structure, to go about that? Could tern have a directory of supported wrappers / scripts to run instead of a base package manager like apt to derive the packages?

nishakm commented 2 years ago

> In most containers with spack, spack is not guaranteed to be on the path, so I'd want some logic that looks like this:

> Look for the spack executable or SPACK_ROOT; exit 0 if not found.

We have logic that looks for the executable in paths defined in base.yml, e.g. in https://github.com/tern-tools/tern/tree/main/tern/analyze/default/command_lib:

spack:
  pkg_format: 'spack'
  os_guess:
    - 'None'
  path:
    - 'usr/local/spack'
    - 'bin/spack'

Tern will automatically look for the binary "spack" in usr/local/spack and bin/spack.

> Given found spack, run a custom script that uses spack python to derive metadata for packages (the output could be saved really anywhere).

Sending the output to stdout is the easiest way to do this, e.g.:

  names:
    invoke:
      1:  
        container:
          - "spack list | cut -d ' ' -f 1"
    delimiter: "\n"

or

  names:
    invoke:
      1:  
        host:
          - "script_available_on_fs.sh"
    delimiter: "\n"

Currently there isn't a way to call a script that isn't part of the filesystem, but it's possible to include that as a feature if there isn't any other way to collect metadata.
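
For illustration, a hypothetical script_available_on_fs.sh for the spack case could just print newline-delimited package names (assuming the installed spack supports the --format flag on spack find):

    #!/bin/sh
    # print one installed package name per line, matching the "\n" delimiter
    spack find --format '{name}'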