NearNodeFlash / NearNodeFlash.github.io

View this document https://nearnodeflash.github.io/
Apache License 2.0

Need way to know release versions #106

Open bdevcich opened 9 months ago

bdevcich commented 9 months ago

Currently, our software is released via nnf-deploy, but there is no straightforward way to understand what version(s) are deployed on the system.

Right now, the only way to tell which versions are deployed is to inspect the image tags on each pod.

We need a way to easily expose this information to an admin. One initial thought is to stuff it into the SystemConfiguration resource, but a new resource dedicated to versioning (e.g. SystemVersion) may be better.
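To make the idea concrete, a dedicated resource could look something like the sketch below. This is purely illustrative: no SystemVersion CRD exists today, and the API group, field names, and values are all assumptions.

```yaml
# Hypothetical sketch only -- no SystemVersion CRD exists today.
# The API group "nnf.example.com" and all field names are placeholders.
apiVersion: nnf.example.com/v1alpha1
kind: SystemVersion
metadata:
  name: default
status:
  # The nnf-deploy release that was deployed, plus each submodule's tag.
  nnfDeploy: v0.0.4
  components:
    dws: v0.0.12
    lustre-csi-driver: v0.0.6
    nnf-sos: v0.0.5
```

An admin could then answer "what is deployed?" with a single `kubectl get` instead of walking every pod's image tags.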

bdevcich commented 9 months ago

Yesterday, when we looked at the image tags for dws on an LLNL system, the tag was a hash rather than the git tag (e.g. 0.0.9) that we would expect.

In our case, when we deploy releases we use nnf-deploy, which has all the other components as submodules. Those submodules point at each individual component's git tag. Those tags are then fed into our Makefiles, which embed them as the image tags (i.e. versions) on the containers.

➜ nnf-deploy git:(v0.0.4) git submodule
 423255e4357dc7109e967ca6e69c9a1a47edc635 dws (v0.0.12)
 6c57f75a013677ef4b599527e928bd6a2e5daeda lustre-csi-driver (v0.0.6)
 83b70fca6e0db1799e82bde5afe17c6d3a47e9fb lustre-fs-operator (v0.0.5)
 6e79eba3e92c49e3b99eb705e0494d0df0ca306b nnf-dm (v0.0.5)
 7b831836b877f1e7e9c9cfe93feb8754f1f31d30 nnf-integration-test (v0.0.2)
 2af650d314ba8d58d6c33da850e921a770195d59 nnf-sos (v0.0.5)
➜ kubectl get pod -n nnf-system nnf-controller-manager-9b8444b84-4jlr9 -oyaml | yq '.spec.containers[].image'
ghcr.io/nearnodeflash/nnf-sos:0.0.5
gcr.io/kubebuilder/kube-rbac-proxy:v0.13.0
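Until versions live somewhere better, they can be scraped from image references like the ones above. A minimal sketch, assuming the input is a list of image references (one per line) such as a `kubectl ... -o jsonpath` query over pod containers would produce; the parsing is plain POSIX parameter expansion:

```shell
# image_versions: read image references (one per line) on stdin and print
# "component version" pairs. In practice the input could come from e.g.:
#   kubectl get pods -n nnf-system \
#     -o jsonpath='{range .items[*].spec.containers[*]}{.image}{"\n"}{end}'
image_versions() {
  while IFS= read -r img; do
    repo=${img%:*}     # strip ":tag" suffix -> registry/path/name
    name=${repo##*/}   # strip leading registry path -> bare component name
    tag=${img##*:}     # keep only the tag
    printf '%s %s\n' "$name" "$tag"
  done
}
```

For example, feeding it `ghcr.io/nearnodeflash/nnf-sos:0.0.5` prints `nnf-sos 0.0.5`. This still only reports whatever tag the image was built with, which is exactly why hash tags on released versions are a problem.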

@behlendorf can you describe your process for building and deploying the containers on your systems? We need to understand that before we can figure out a solution. We want to make sure we avoid using hashes as the image tags for released versions.

Once we understand your build process, we can ensure the tags make it into the components themselves, and then place those versions somewhere easier to retrieve than by querying pods.

bdevcich commented 9 months ago

@roehrich-hpe

behlendorf commented 9 months ago

@bdevcich we are using the nnf-deploy utility to build and deploy the containers. Here's what the process looks like when we re-deploy the latest release.

➜ export NNF_VERSION=v0.0.4

➜ git clone --recurse-submodules git@github.com:NearNodeFlash/nnf-deploy nnf-deploy-$NNF_VERSION
➜ cd nnf-deploy-$NNF_VERSION
➜ git checkout $NNF_VERSION
➜ git submodule update

➜ git submodule status
423255e4357dc7109e967ca6e69c9a1a47edc635 dws (v0.0.12)
6c57f75a013677ef4b599527e928bd6a2e5daeda lustre-csi-driver (v0.0.6)
83b70fca6e0db1799e82bde5afe17c6d3a47e9fb lustre-fs-operator (v0.0.5)
6e79eba3e92c49e3b99eb705e0494d0df0ca306b nnf-dm (v0.0.5)
7b831836b877f1e7e9c9cfe93feb8754f1f31d30 nnf-integration-test (v0.0.2)
2af650d314ba8d58d6c33da850e921a770195d59 nnf-sos (v0.0.5)

➜ go build
➜ ./nnf-deploy deploy
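One way to sanity-check a deployment like the above is to diff the tags recorded in the submodules against the tags actually running. A hedged sketch, not an nnf-deploy feature: the function only does text processing, so its two input files (names here are placeholders) could be captured from `git submodule status` and from a pod-image listing like the `kubectl`/`yq` query earlier in this thread:

```shell
# compare_versions: report components whose deployed image tag does not
# match the tag recorded in nnf-deploy's submodules. Arguments are plain
# text files (hypothetical names):
#   $1 - output of `git submodule status`, e.g. " <sha> dws (v0.0.12)"
#   $2 - "name tag" pairs for running images, e.g. "nnf-sos 0.0.5"
compare_versions() {
  # Normalize submodule status to "name version", stripping parens and a
  # leading "v" (image tags may omit it, e.g. nnf-sos:0.0.5 for tag v0.0.5).
  awk '{gsub(/[()]/, "", $3); sub(/^v/, "", $3); print $2, $3}' "$1" |
  while read -r name want; do
    have=$(awk -v n="$name" '$1 == n {sub(/^v/, "", $2); print $2}' "$2")
    [ -z "$have" ] && continue   # component has no corresponding pod image
    [ "$have" = "$want" ] || \
      printf '%s: expected %s, running %s\n' "$name" "$want" "$have"
  done
}
```

Silent output would mean every component that runs as a pod matches its submodule tag; this breaks down, of course, when images are tagged with hashes instead of release tags, which is the core issue here.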

In practice we may also still need to do some minor customization (recreating the Lustre filesystems and profiles, etc.). Some of that has been improved, but we haven't gone through the process many times yet with more recent code.

We're definitely interested in better deployment tools, particularly ones that will help us with customization. Each of our deployments will be slightly different and we need to be able to manage that. @roehrich-hpe may have some ideas here.