dmwm / CRABServer


Canary deployments - REST, TW, Publisher #8650

Open mapellidario opened 3 months ago

mapellidario commented 3 months ago

introduction

We have always dreamed about processing a small portion of production crab tasks with a new crab version.

We do not have a general solution; each crab system will need to be adapted in a very specific way.
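As a rough illustration of what "a small portion of production tasks" could mean in practice, here is a minimal sketch that picks a stable fraction of tasks by hashing the task name. This is only an assumption for discussion, not the mechanism any crab component uses: the function name `is_canary_task` and the parameter `canary_fraction` are hypothetical.

```python
import hashlib


def is_canary_task(task_name, canary_fraction=0.05):
    """Return True if this task falls in the canary share (hypothetical helper).

    The task name is hashed so that the same task always gets the same
    answer, no matter which process asks.
    """
    digest = hashlib.sha256(task_name.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < canary_fraction
```

Hashing instead of drawing a random number keeps the decision reproducible, which helps when debugging why a given task ended up on the canary.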

REST

This requires no change in the code, but requires some experience with k8s deployments. The idea is:

TaskWorker

This requires some changes in the code, which have been outlined in https://github.com/dmwm/CRABServer/wiki/TaskWorker-Canary-Deployment

We decided to run one TW per virtual machine, so that the hostname is enough to identify which TW process/container we are referring to.
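A minimal sketch of how the hostname could be used to tell the canary TW from the others, assuming one TW per machine as stated above. The host name in `CANARY_HOSTS` and the helper `tw_identity` are hypothetical, for illustration only; the actual changes are the ones outlined in the wiki page.

```python
import socket

# Hypothetical set of short hostnames running the canary TaskWorker.
CANARY_HOSTS = {"crab-canary-tw01"}


def tw_identity(hostname=None, canary_hosts=CANARY_HOSTS):
    """Return (short_hostname, is_canary) for a TaskWorker machine.

    With one TW per virtual machine, the short hostname alone
    identifies the TW process/container.
    """
    full_name = hostname or socket.gethostname()
    short_name = full_name.split(".")[0]  # drop the domain part, if any
    return short_name, short_name in canary_hosts
```

Calling `tw_identity()` with no arguments uses the hostname of the machine it runs on.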

Publisher

This is not defined yet, but we have some ideas. The simplest one is

belforte commented 3 months ago

I believe that we agreed that @novicecpp will do the K8s part together with @aspiringmind-code

I am still undecided whether it is better to call it crabserver-canary or crabserver-qa.

belforte commented 3 months ago

NOTE: it will be important to be able to tell (easily) the canary pod from the others in our monitoring, e.g. quickly tell where HTTP errors come from
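To make the requirement concrete, here is a small sketch of counting HTTP errors per pod, assuming each log record carries the pod name and the HTTP status. The record layout (`pod`, `status` keys) and the pod names in the usage note are assumptions, not the format of our actual logs or monitoring.

```python
from collections import Counter


def errors_by_pod(records):
    """Count responses with HTTP status >= 400, grouped by pod name.

    `records` is an iterable of dicts with (assumed) keys "pod" and "status".
    """
    return Counter(r["pod"] for r in records if r["status"] >= 400)
```

For example, given records from pods named `crabserver-abc` and `crabserver-canary-xyz`, the returned counter shows directly which of the two produced the errors.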

novicecpp commented 3 months ago

Screenshot from 2024-08-28 15-33-38

Looks promising. https://monit-grafana.cern.ch/goto/3JXSKT3Ig?orgId=11

EDIT: new image that contains the pod name.

belforte commented 3 months ago

Better to split off TW+Publisher into a different issue: #8678

novicecpp commented 2 months ago

To-do:

novicecpp commented 4 weeks ago

Sorry. I need to keep this open because it is not deployed in production yet.

test (at least test12) and preprod now use the new helm chart.

belforte commented 4 weeks ago

@aspiringmind-code will do it :-)

novicecpp commented 4 weeks ago

Sorry, wrong issue.

This one I tested myself on my test12 and preprod, and it works. But you may need to modify the dashboard a bit to make it easier to compare metrics between crabserver and crabserver-canary.

belforte commented 4 weeks ago

I knew that it works, but AFAIK it is not deployed in production, not used, not monitored. We have not even deployed the latest CRABServer tag there. Plenty of useful work to do!

aspiringmind-code commented 1 week ago

For the canary TW deployment, I envision the changes made in commits here and here. @belforte, let me know if you agree with this approach. Thanks!

belforte commented 1 week ago

Thanks @aspiringmind-code. The changes to TW are not trivial, so let's create an ad-hoc issue.