Open mapellidario opened 3 months ago
I believe that we agreed that @novicecpp will do the K8s part together with @aspiringmind-code
I am still undecided whether it is better to call it crabserver-canary or crabserver-qa
NOTE: it will be important to be able to tell (easily) the canary pod from the others in our monitoring, e.g. quickly tell where HTTP errors come from
Looks promising. https://monit-grafana.cern.ch/goto/3JXSKT3Ig?orgId=11
EDIT: new image that contains the pod name.
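Once the pod name is visible in the monitoring data, telling the canary from the other pods can be done on the name alone. Below is a minimal sketch, not the dashboard's actual query. It assumes the pods follow the usual Deployment naming scheme `<deployment>-<replicaset hash>-<pod suffix>`, and the `(pod_name, http_status)` record format and the sample pod names are hypothetical.

```python
from collections import Counter

# Assumption: the canary Deployment is named "crabserver-canary".
CANARY_DEPLOYMENT = "crabserver-canary"


def deployment_of(pod_name):
    """Return the Deployment name, assuming '<deployment>-<hash>-<suffix>'."""
    parts = pod_name.rsplit("-", 2)
    if len(parts) != 3:
        # Name does not match the assumed scheme: return it unchanged.
        return pod_name
    return parts[0]


def is_canary(pod_name):
    """True if the pod belongs to the canary Deployment."""
    return deployment_of(pod_name) == CANARY_DEPLOYMENT


def count_server_errors(records):
    """Count HTTP 5xx responses per Deployment.

    records: iterable of (pod_name, http_status) tuples (hypothetical format).
    """
    counts = Counter()
    for pod_name, status in records:
        if 500 <= status <= 599:
            counts[deployment_of(pod_name)] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        ("crabserver-5b8c7d9f4-abcde", 200),
        ("crabserver-5b8c7d9f4-abcde", 500),
        ("crabserver-canary-7d9f8b6c5d-x2k4q", 503),
        ("crabserver-5b8c7d9f4-fghij", 502),
    ]
    print(count_server_errors(sample))
```

Splitting on the last two dashes rather than matching a prefix matters here, because "crabserver" is itself a prefix of "crabserver-canary".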
Better to split off TW+Publisher into a different issue: #8678
To-do:
Sorry. I need to keep this open because it is not deployed in production yet.
test (at least test12) and preprod now use the new helm chart.
@aspiringmind-code will do it :-)
Sorry, wrong issue.
This one I tested myself on my test12 and preprod, and it works.
But you may need to modify the dashboard a bit to make it easier to compare metrics between crabserver and crabserver-canary.
I know that it works, but AFAIK it is not deployed in production, not used, not monitored. We have not even deployed the latest CRABServer tag there. Plenty of useful work to do!
Thanks @aspiringmind-code, changes to TW are not trivial, let's create an ad-hoc issue.
Introduction
We have always dreamed about processing a small portion of production crab tasks with a new crab version.
We do not have a general solution: each crab system will need to be adapted in a very specific way.
REST
This requires no change in the code, but requires some experience with k8s deployments. The idea is:
dmwm/CMSKubernetes/helm/crabserver/templates, call it crabserver-canary.
metadata.name: crabserver-canary, do not change for example metadata.labels.app: crabserver nor spec.selector.matchLabels.app: crabserver
dmwm/CMSKubernetes/helm/crabserver/values-canary.yaml
canary as an argument, you will need to edit the cluster_map.
helm template crabserver . -f values.yaml -f values-${1}.yaml | kubectl -n crab apply -f -
Beware: you must be connected with the proper context (username+cluster)!
TaskWorker
This requires some changes in the code, which have been outlined in https://github.com/dmwm/CRABServer/wiki/TaskWorker-Canary-Deployment
We decided to run one TW per virtual machine, so that the hostname is enough to identify which TW process/container we are referring to.
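Since the hostname alone identifies a TW, deciding whether a given TW is the canary could be as simple as a lookup on the short hostname. This is only a sketch under that assumption and is not taken from the wiki page. The host names in `CANARY_HOSTS` and the role labels "canary" and "production" are hypothetical.

```python
import socket

# Hypothetical set of VM short hostnames that run the canary TW.
CANARY_HOSTS = frozenset({"crab-canary-tw01"})


def short_hostname(fqdn=None):
    """Return the lowercase host part of an FQDN (this machine's by default)."""
    name = fqdn if fqdn is not None else socket.gethostname()
    return name.split(".", 1)[0].lower()


def tw_role(fqdn=None, canary_hosts=CANARY_HOSTS):
    """Classify a TW as 'canary' or 'production' from its hostname alone."""
    if short_hostname(fqdn) in canary_hosts:
        return "canary"
    return "production"


if __name__ == "__main__":
    # Role of the machine this script runs on.
    print(short_hostname(), tw_role())
```

This only holds as long as there is exactly one TW per VM. With several TW containers on one host, the hostname would no longer be a unique identifier.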
Publisher
This is not defined yet, but we have some ideas. The simplest one is