crc-org / crc

CRC is a tool to help you run containers. It manages a local OpenShift 4.x cluster, Microshift or a Podman VM optimized for testing and development purposes
https://crc.dev
Apache License 2.0

[Epic] Decouple bundles from released crc binary #3206

Open gbraad opened 2 years ago

gbraad commented 2 years ago
praveenkumar commented 2 years ago

Thinking about it the other day: what we could do is have a single downstream crc release per major OpenShift release, and for z-stream bundles just publish the bundles. For example, if we release crc-2.4.0 downstream with the 4.10.14 bundle, then until the 4.11.x release happens we can keep shipping 4.10.x bundles without breaking anything around crc-2.4.0. We do need to adjust our preflight checks so that a release only checks the major.minor version, not the patch one. We also need a standard bundle download path which always provides the latest bundle.
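
A minimal sketch of the relaxed preflight check described above, assuming bundle versions are plain dotted strings such as 4.10.14. The function names are hypothetical and are not taken from the crc code base.

```python
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a dotted version such as '4.10.14'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def bundle_matches_release(expected: str, found: str) -> bool:
    """True when both versions share major.minor; the patch level is ignored."""
    return major_minor(expected) == major_minor(found)


print(bundle_matches_release("4.10.14", "4.10.18"))  # True: same 4.10 stream
print(bundle_matches_release("4.10.14", "4.11.0"))   # False: different minor
```

Comparing integers rather than strings avoids treating 4.9 as newer than 4.10.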

cfergeau commented 2 years ago

The main difficulty with using multiple bundles with a single crc release is how to verify the bundle 'authenticity'. Currently we hardcode the bundle sha256sum in the binary, and verify that. Since the binary is signed, we can trust the sha256sum; if the bundle we download has the correct sha256sum, then we can trust the downloaded bundle.
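
To make the current scheme concrete, here is a rough sketch of checking a downloaded bundle against a checksum shipped inside the binary. It only illustrates the idea; the function names are hypothetical and this is not the code crc uses.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash the file in chunks so a multi-gigabyte bundle is never loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(path: str, expected_sha256: str) -> bool:
    """Compare the computed digest with the value hardcoded at build time."""
    return sha256_of_file(path) == expected_sha256.lower()
```

The trust chain is: signed binary, then the embedded checksum, then the bundle. A bundle published after the binary was built has no embedded checksum, which is the gap discussed in the rest of this comment.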

This will have to be done differently if we want to be able to use bundles which aren't released yet when the crc binary is released. One option could be big warnings to the user that we can't verify the bundle, they use it at their own risk, ... Another option is to add some cryptographic signatures, which adds some complication of its own. It's also possible to sign container images, maybe that could be a way of releasing these bundles.

adrianriobo commented 2 years ago

Just thinking out loud... create the shasumfile and cipher the content with a crc private key, then download the shasumfile, decipher it with the public key (to ensure authenticity), then calculate the shasum of the bundle and compare it with the deciphered value?

On the other hand... what about dealing with the 30-day period for certs? Should we automate rebuilding versions and uploading them every 30 days?

cfergeau commented 2 years ago

Yeah, it is the key management which can be painful, and figuring out the best way of getting a file signed/uploaded.

praveenkumar commented 2 years ago

On the other hand... what about dealing with the 30-day period for certs? Should we automate rebuilding versions and uploading them every 30 days?

That would mean a specific crc release would be stuck on a specific z-stream, but the issue remains the same: as soon as you rebuild the same version of the bundle, the shasum still changes.

adrianriobo commented 2 years ago

That would mean a specific crc release would be stuck on a specific z-stream, but the issue remains the same: as soon as you rebuild the same version of the bundle, the shasum still changes.

Not sure how this is related... the 30-day bundle recreation is meant to avoid the cert renewal on start...

No problem with a new shasum... as the process stays the same and it works...

bundle generation (server internal) side

build bundle -> calculate shasum -> cipher the shasum (with the private key of the crc-team key pair) -> upload bundle and ciphered shasumfile

crc (binary) side

download shasumfile -> decipher content (with the public key of the crc-team key pair) -> if content can be deciphered (crc-team is the originator) -> download bundle -> calculate shasum -> compare shasums
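
The crc-side flow could look roughly like the sketch below. It assumes the shasumfile uses the usual `sha256sum` line format ("digest, whitespace, file name"), and it leaves the signature check as a caller-supplied function, since the thread has not settled on a signing mechanism. All names here are hypothetical.

```python
import hashlib
from typing import Callable


def trusted_digest(shasum_file: bytes, signature: bytes, bundle_name: str,
                   verify: Callable[[bytes, bytes], bool]) -> str:
    """Return the expected sha256 for bundle_name, only if the signature is valid."""
    # Step 1: refuse to read the checksum file unless it comes from the crc team.
    if not verify(shasum_file, signature):
        raise ValueError("shasumfile signature could not be verified")
    # Step 2: look up the bundle in the (now trusted) checksum file.
    for line in shasum_file.decode().splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[1].strip().lstrip("*") == bundle_name:
            return parts[0].lower()
    raise KeyError(bundle_name)


def bundle_matches(bundle_bytes: bytes, expected_digest: str) -> bool:
    """Step 3: hash the downloaded bundle and compare with the trusted digest."""
    return hashlib.sha256(bundle_bytes).hexdigest() == expected_digest
```

In practice `verify` would wrap whichever signature scheme is chosen, using the crc-team public key shipped with the binary.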

praveenkumar commented 2 years ago

@adrianriobo Then, I misunderstood your comment :(

cfergeau commented 2 years ago

We are still evaluating whether to go with signed bundles/checksum file, or just limit ourselves to https security.

gbraad commented 2 years ago

We should also consider the behaviour when an older binary is downloaded. Should we allow the download of an old bundle, or refer to the latest?

gbraad commented 2 years ago

Simplistic approach would be to point to a metadata file:

                             mirrors.redhat.com
CRC 2.04                     -> crc/2.04/metadata.json

                                bundle location -> 
                                2nd bundle ->

2.05                         -> crc/2.05/metadata.json
praveenkumar commented 2 years ago

I am experimenting with a container registry: we create a container image which only has the bundle as a separate layer and push it to quay.io under the crcont org. Then we can tag those bundles with the specific crc release that supports them.

$ cat Containerfile 
FROM scratch
COPY crc_libvirt_4.10.14_amd64.crcbundle /

$ podman build -t quay.io/praveenkumar/crc_libvirt_amd64:<bundle_version> -f Containerfile 
$ podman tag quay.io/praveenkumar/crc_libvirt_amd64:<bundle_version> quay.io/praveenkumar/crc_libvirt_amd64:<crc_version>
$ podman push quay.io/praveenkumar/crc_libvirt_amd64:<bundle_version>
$ podman push quay.io/praveenkumar/crc_libvirt_amd64:<crc_version>

On the crc side, we need to implement pulling an image from a registry using the containers/image package, and extracting the layer to get the crc bundle and then consume it.

Following is what I tried with skopeo

$ skopeo copy  docker://quay.io/praveenkumar/crc_libvirt_amd64:<crc_version> dir:/home/prkumar/bundle_data/

$ ls -l bundle_data/
total 3299560
-rw-r--r--. 1 prkumar prkumar 3378735102 Jun  9 07:17 1bd0b372932c4bbce5c61cf75a40f0fdb78827a95baefd749c1300150570a45b
-rw-r--r--. 1 prkumar prkumar        493 Jun  9 07:17 7856ae0ce9a943d0bb38e1bba24c4b7dc2e887421d2a80c4116a453e0fcbfbfe
-rw-r--r--. 1 prkumar prkumar        506 Jun  9 07:17 manifest.json
-rw-r--r--. 1 prkumar prkumar         33 Jun  9 07:16 version

$ jq . bundle_data/manifest.json 
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:7856ae0ce9a943d0bb38e1bba24c4b7dc2e887421d2a80c4116a453e0fcbfbfe",
    "size": 493
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:1bd0b372932c4bbce5c61cf75a40f0fdb78827a95baefd749c1300150570a45b",
      "size": 3378735102
    }
  ],
  "annotations": {
    "org.opencontainers.image.base.digest": "",
    "org.opencontainers.image.base.name": ""
  }
}
$ tar -xvf bundle_data/1bd0b372932c4bbce5c61cf75a40f0fdb78827a95baefd749c1300150570a45b 
crc_libvirt_4.10.14_amd64.crcbundle
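
The manual steps above (read manifest.json, find the layer blob, untar it) can be sketched in a few lines. This assumes the on-disk layout shown in the `ls -l` output, where each blob is stored under its hex digest; `extract_bundle` is a hypothetical helper, not part of crc or skopeo.

```python
import json
import os
import shutil
import tarfile
from typing import List


def extract_bundle(image_dir: str, dest_dir: str) -> List[str]:
    """Copy every *.crcbundle file found in the image layers into dest_dir."""
    with open(os.path.join(image_dir, "manifest.json")) as f:
        manifest = json.load(f)
    extracted = []
    for layer in manifest["layers"]:
        # "sha256:<hex>" -> "<hex>", which is the blob's file name in the directory
        hexdigest = layer["digest"].partition(":")[2]
        blob = os.path.join(image_dir, hexdigest)
        with tarfile.open(blob, "r:*") as tar:  # "r:*" accepts gzip or plain tar
            for member in tar.getmembers():
                if member.isfile() and member.name.endswith(".crcbundle"):
                    # Use only the base name so a crafted path cannot escape dest_dir
                    target = os.path.join(dest_dir, os.path.basename(member.name))
                    with tar.extractfile(member) as src, open(target, "wb") as out:
                        shutil.copyfileobj(src, out)
                    extracted.append(target)
    return extracted
```

For example, `extract_bundle("bundle_data", ".")` would correspond to the `tar -xvf` call above.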

We have to figure out how to warn the user if a new bundle is available for a specific release (maybe when the cert has expired, then along with the cert warning we also warn the user to pull a new version of the bundle by deleting the old one from the cache?)

We also need to change the preflight logic around the bundle and use just . regex to download the latest bundle if it is not available in the cache folder.

cfergeau commented 2 years ago

$ podman push quay.io/praveenkumar/crc_libvirt_amd64:<bundle_version>

We don't need the arch suffix, we can use multiarch manifests. I'd go with libvirt-bundle or such, crc is already part of the org name.

We also need to change the preflight logic around the bundle and use just . regex to download the latest bundle if it is not available in the cache folder.

This depends on the outcome of this discussion https://github.com/code-ready/crc/issues/3206#issuecomment-1150700777

cfergeau commented 2 years ago

Ideally we'd have a way to get these bundles signed with a Red Hat key? https://cloud.redhat.com/blog/signing-and-verifying-container-images Or do we just go with https security (ie check we connect to the expected server), but decide we don't need to check bundle integrity (ie no check that the downloaded data really comes from us)

praveenkumar commented 2 years ago

Ideally we'd have a way to get these bundles signed with a Red Hat key? https://cloud.redhat.com/blog/signing-and-verifying-container-images Or do we just go with https security (ie check we connect to the expected server), but decide we don't need to check bundle integrity (ie no check that the downloaded data really comes from us)

Yesterday during the meeting we all agreed to go with https security initially (whether that is to the mirror or to a registry). If the plan is to sign the bundles with a Red Hat key then we have to read further about the process.

cfergeau commented 2 years ago

Yesterday during the meeting we all agreed to go with https security initially

To be honest, I left the meeting confused about what we wanted ^^

praveenkumar commented 2 years ago

I was thinking of using cosign ( https://docs.sigstore.dev/cosign/overview ) to have the image signed and pushed to quay (that part works), but as of now containers/image doesn't have a way to verify it. There is a draft PR https://github.com/containers/image/pull/1364 but it is still not merged. With that, I think it would be much easier to have a secure supply chain for bundles.

cfergeau commented 2 years ago

https://cloud.redhat.com/blog/signing-and-verifying-container-images explains how to push images together with their signature. coreos also has code to verify these signatures.

praveenkumar commented 2 years ago

Another thing I am kind of stuck on right now is getting just the tag info that a release version points to, in order to figure out which bundle version we are going to fetch/pull, so that we can use this info to build the directory structure for getting the bundle metadata info.

My plan is to tag the image with the bundle version and also with the crc release version, as mentioned earlier (https://github.com/code-ready/crc/issues/3206#issuecomment-1150846396).

$ podman build -t quay.io/praveenkumar/crc_libvirt_amd64:4.10.18 -f Containerfile 
$ podman tag quay.io/praveenkumar/crc_libvirt_amd64:4.10.18 quay.io/praveenkumar/crc_libvirt_amd64:2.5.0
$ podman push quay.io/praveenkumar/crc_libvirt_amd64:4.10.18
$ podman push quay.io/praveenkumar/crc_libvirt_amd64:2.5.0

Here 4.10.18 and 2.5.0 both point to the same image, and in future when we have a 4.10.19 bundle we tag it as 2.5.0 again, so that on the crc side we can just use the crc release tag 2.5.0 to pull the bundles and not the bundle tag. But we need a way to figure out which other tags point to the same image that 2.5.0 points to, so we can get the bundle version info dynamically or as part of metadata.

Looking at skopeo, I can get all the tags a container image has on a specific registry, but it doesn't tell which tags point to the same image hash, so we might need to wire up something ourselves. https://docs.docker.com/registry/spec/api/#listing-image-tags is what the registry API provides, so we can query each and every tag, sort them by the returned digest and then find it. I am putting this here in case someone has a better idea to handle it (if possible, without the bundle versions).

$ skopeo inspect docker://quay.io/praveenkumar/sample:2.4
{
    "Name": "quay.io/praveenkumar/sample",
    "Digest": "sha256:124b84e904576c7a06929e78d8bb8e0f7f8b0c8a1590a48f45e81fbea572928f",
    "RepoTags": [
        "v1",
        "2.4",
        "v2"
    ], ...
praveenkumar commented 2 years ago

https://gist.github.com/praveenkumar/f3e192cc1af339add8fb3a2e6761eb5d has a working snippet which prints out, for a specific release, the latest bundle version available. It adds another layer of network dependency, since now we have to fetch the tag data from quay and parse it. What would be the action if the user is not connected to the internet but already has a bundle (maybe not the latest one)?

cfergeau commented 2 years ago

My understanding is that the tagging would allow us to know "bundle version xx was tested with crc version yy". Because of the cert renewal issue, users won't want to be limited to bundle version xx when they use crc version yy, but they'll want to use a bundle which is less than a month old. From crc developers' perspective, we are not only supporting the latest crc version, so once a new crc version is released, what works/does not work with an older version is not a priority for us. As long as it's possible to override the bundle version being used, this should be workable

Since this tagging seems to be adding some complexity to solve a corner case, I'd start with something simpler and not try to do this at all while we figure out the exact use case/what we want to support.

praveenkumar commented 2 years ago

My understanding is that the tagging would allow us to know "bundle version xx was tested with crc version yy". Because of the cert renewal issue, users won't want to be limited to bundle version xx when they use crc version yy, but they'll want to use a bundle which is less than a month old. From crc developers' perspective, we are not only supporting the latest crc version, so once a new crc version is released, what works/does not work with an older version is not a priority for us. As long as it's possible to override the bundle version being used, this should be workable

We want to solve something like the following, to have less frequent releases downstream.

crc-version          bundle-version
2.5.0                   4.10.11
                        4.10.12 (we tested it with 2.5.0 and it works and also 4.10.11 have cert expired)
                        4.10.13 ....

Now if there is an existing user of the 2.5.0 version, then they already have the 4.10.11 bundle, but its cert has now expired and 4.10.13 is available (how do we tell the user to download the latest bundle: through a warning while carrying on using the 4.10.11 bundle, or by automatically downloading the 4.10.13 bundle?). I do agree that as soon as we release crc 2.6.0, whatever the latest bundle is would work with it, but we still need a mapping between those bundles and crc releases.

Since this tagging seems to be adding some complexity to solve a corner case, I'd start with something simpler and not try to do this at all while we figure out the exact use case/what we want to support.

By something simpler, you mean just having the same capability we have today, but downloading/fetching from quay instead of the mirror? Because even then, just using bundle versions as tags, we do need to fetch the manifest to verify whether the existing bundle is available in the cache folder or not.

cfergeau commented 2 years ago

By something simpler, you mean just having the same capability we have today, but downloading/fetching from quay instead of the mirror?

Yes, current crc releases hardcode the bundle version they expect (pkg/crc/version.bundleVersion), they could use this same version to download from quay instead of from mirror.openshift.com

Because even then, just using bundle versions as tags, we do need to fetch the manifest to verify whether the existing bundle is available in the cache folder or not.

Future releases could check whether crc_libvirt_$bundleVersion is unpacked or not in cache folder. Or it could use the latest 4.10.x version available in the cache folder as long as it's newer than $bundleVersion. Or it could use the version specified on the commandline, downloading it from quay if it's not there. Or it could fetch the latest 4.10.x version available on quay, either automatically or when instructed to do so (something that needs to be decided).
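
One of the options above, using the latest 4.10.x bundle already in the cache as long as it is not older than $bundleVersion, could be sketched as follows. The helper names are hypothetical, and the cache is modelled as a plain list of version strings rather than real directory names.

```python
from typing import List, Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """'4.10.13' -> (4, 10, 13), so comparisons are numeric rather than textual."""
    return tuple(int(part) for part in version.split("."))


def pick_cached_bundle(cached: List[str], bundle_version: str) -> Optional[str]:
    """Newest cached bundle in the same major.minor stream that is at least bundle_version."""
    wanted = parse_version(bundle_version)
    candidates = [
        v for v in map(parse_version, cached)
        if v[:2] == wanted[:2] and v >= wanted
    ]
    if not candidates:
        return None  # nothing suitable in the cache: fall back to downloading
    return ".".join(str(part) for part in max(candidates))


print(pick_cached_bundle(["4.10.11", "4.10.13", "4.11.0"], "4.10.12"))  # 4.10.13
print(pick_cached_bundle(["4.10.11"], "4.10.12"))                       # None
```

A `None` result is where the remaining options (download the hardcoded version, or fetch the latest from quay) would kick in.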

cfergeau commented 2 years ago

Now if there is an existing user of the 2.5.0 version, then they already have the 4.10.11 bundle, but its cert has now expired and 4.10.13 is available (how do we tell the user to download the latest bundle: through a warning while carrying on using the 4.10.11 bundle, or by automatically downloading the 4.10.13 bundle?). I do agree that as soon as we release crc 2.6.0, whatever the latest bundle is would work with it, but we still need a mapping between those bundles and crc releases.

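I would initially just go with the possibility to use

* the 4.10.x bundle that crc version yy was built/tested against
* the latest 4.10.x bundle available on quay regardless of which release it was tested against
* a user-specified 4.10.x bundle available on quay, which allows to workaround possible issues with 'latest 4.10.x bundle' when it was not tested with crc version yy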

The scheme you describe is useful if we expect bundles to no longer work with crc version yy when we release crc version zz with a new bundle. I'd rather be optimistic, and assume we'll manage to keep bundles compatible for a whole OpenShift 4.10.* release, in which case we don't need to implement any sophisticated tagging/tag checking/...

praveenkumar commented 2 years ago

I am going to create tasks from this epic. The first task would be to not change anything in the bundle version logic, but just download/pull the bundle from quay.io instead of from mirror.openshift.com.

cfergeau commented 1 year ago

I would initially just go with the possibility to use

* the 4.10.x bundle that crc version yy was built/tested against

Latest crc releases can fetch bundles from quay if needed, with gpg signature verification to replace the sha256sum verification we used to have

* the latest 4.10.x bundle available on quay regardless of which release it was tested against

This is what is still being discussed here and there in this issue https://github.com/code-ready/crc/issues/3206#issuecomment-1150700777 https://github.com/code-ready/crc/issues/3206#issuecomment-1154839207 ...

* a user-specified 4.10.x bundle available on quay, which allows to workaround possible issues with 'latest 4.10.x bundle' when it was not tested with crc version yy

This is https://github.com/code-ready/crc/issues/3277

cfergeau commented 1 year ago

Looking at skopeo, I can get all the tags a container image has on a specific registry, but it doesn't tell which tags point to the same image hash, so we might need to wire up something ourselves. https://docs.docker.com/registry/spec/api/#listing-image-tags is what the registry API provides, so we can query each and every tag, sort them by the returned digest and then find it. I am putting this here in case someone has a better idea to handle it (if possible, without the bundle versions).

quay.io has its own API which provides more details:

$ curl https://quay.io/api/v1/repository/crcont/podman-bundle/tag/ | jq .
{
  "tags": [
    {
      "name": "2.7.1",
      "reversion": false,
      "start_ts": 1660622994,
      "manifest_digest": "sha256:95f37c68aff4ecc371e718f6a1eb962ede0c9f93ebba715180484ea06180886d",
      "is_manifest_list": true,
      "size": null,
      "last_modified": "Tue, 16 Aug 2022 04:09:54 -0000"
    },
    {
      "name": "2.7.0",
      "reversion": false,
      "start_ts": 1659947449,
      "manifest_digest": "sha256:95f37c68aff4ecc371e718f6a1eb962ede0c9f93ebba715180484ea06180886d",
      "is_manifest_list": true,
      "size": null,
      "last_modified": "Mon, 08 Aug 2022 08:30:49 -0000"
    },
...
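
With a response shaped like the one above, finding which tags point to the same image is a matter of grouping by `manifest_digest`. A small sketch, using a trimmed copy of that response as input; the helper names are hypothetical and only the `name` and `manifest_digest` fields shown in the output are relied on.

```python
from collections import defaultdict
from typing import Dict, List

# Trimmed copy of the quay.io response shown above.
response = {
    "tags": [
        {"name": "2.7.1",
         "manifest_digest": "sha256:95f37c68aff4ecc371e718f6a1eb962ede0c9f93ebba715180484ea06180886d"},
        {"name": "2.7.0",
         "manifest_digest": "sha256:95f37c68aff4ecc371e718f6a1eb962ede0c9f93ebba715180484ea06180886d"},
    ]
}


def tags_by_digest(api_response: dict) -> Dict[str, List[str]]:
    """Map each manifest digest to the list of tag names pointing at it."""
    groups = defaultdict(list)
    for tag in api_response["tags"]:
        groups[tag["manifest_digest"]].append(tag["name"])
    return dict(groups)


def aliases_of(api_response: dict, tag_name: str) -> List[str]:
    """Other tags that resolve to the same manifest as tag_name."""
    for names in tags_by_digest(api_response).values():
        if tag_name in names:
            return sorted(n for n in names if n != tag_name)
    return []


print(aliases_of(response, "2.7.1"))  # ['2.7.0']
```

This needs a single request to the tag listing, instead of querying every tag separately through the registry API.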