dmwm / CRABServer


Move to GH the scripts used in jenkins for CI/CD #7085

Open mapellidario opened 2 years ago

mapellidario commented 2 years ago

description

The CRAB team manages many jobs in Jenkins. To keep better track of changes, we want to move the scripts executed by Jenkins into dmwm/CRABServer.

(in progress) Move to GH the scripts for building rpms and docker images

When we create a new release in GH

  1. GH automatically sends a webhook to jenkins
  2. the webhook starts the job github-webhook, whose script does not need to be moved to GH
  3. github-webhook starts CRABServer_BuildOnRelease
  4. CRABServer_BuildOnRelease starts CRABServer_BuildImage

current in-progress status (2022-02-15):

next actions, after we test that the build process is ok (I would like to see that with the next release we properly build the image via the current setup)

Moreover, we changed the taskworker Dockerfile a bit, so we need to update the docker build commands in

And possibly we can also create a new page describing the current CI pipeline

(todo) Move to GH the scripts for deploying CRAB services

We need to move to GH the scripts of the following jobs:

(todo) Move to GH the scripts for testing CRAB system

we need to move to GH the scripts in

maybe we can skip migrating the scripts from the old jobs:

belforte commented 2 years ago

maybe it is easier to remove old jobs, rename COPYCAT_* to "human name" and then import in GH the properly named scripts?

belforte commented 2 years ago

assign to me to remind me to talk with Dario and Wa at proper time in order to find out if it is a good topic for Wa as part of his Jenkins training. But this can wait.

mapellidario commented 1 year ago

Since this issue was mentioned again today, I will post here a couple of questions and suggestions.

question 1

@belforte, could you paste here or somewhere else what you changed today in [1]?

Looking at the jenkins diff I see only some minor things [2], nothing related to your private message "OK, it was a cc8 vs. el8 thing. Should be fixed now". I think I am missing something. thanks!

I fully agree that this kind of change requires better documentation. My suggestion is to move those bash scripts to GH and, in the Jenkins configure page, do a wget and execute the script.
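A minimal sketch of what the Jenkins "Execute shell" box could contain under that suggestion. The function names and the script path in the usage line are hypothetical; only the raw.githubusercontent.com URL layout and the repository name are taken as given.

```shell
#!/usr/bin/env bash

# Base URL for raw files of the repository on GitHub.
REPO_RAW_BASE="https://raw.githubusercontent.com/dmwm/CRABServer"

# Build the raw URL of a script for a given git ref (branch or tag).
script_url() {
    local ref="$1" path="$2"
    echo "${REPO_RAW_BASE}/${ref}/${path}"
}

# Download the script to a temporary file and run it with bash,
# forwarding any extra arguments. Returns the script's exit code.
fetch_and_run() {
    local ref="$1" path="$2"; shift 2
    local tmp rc=0
    tmp="$(mktemp)" || return 1
    if ! wget -q -O "${tmp}" "$(script_url "${ref}" "${path}")"; then
        echo "download failed: $(script_url "${ref}" "${path}")" >&2
        rm -f "${tmp}"
        return 1
    fi
    bash "${tmp}" "$@" || rc=$?
    rm -f "${tmp}"
    return "${rc}"
}
```

In the job one would then call something like `fetch_and_run master path/to/script.sh` (placeholder path). Pinning the ref to a release tag instead of a branch would make it easier to tell which version of the script a given build ran.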

question 2

Moreover, looking back at the code I have some doubts, and I am not sure if it is past me who was a bit distracted or if it is because we both touched the same code. In particular, [3] means that we are using singularity, not docker, to submit jobs that require el8: the first branch already matches 8, so the X8 test in the elif is never reached. I am 100% in favor of this approach; I would just remove the [ "X${singularity}" == X8 ] from the elif.
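To make the effective behaviour of [3] explicit, here is a small sketch of the same dispatch written as a `case` statement. The function name `pick_runtime` is made up for illustration; it only reports which branch of the script would be taken for a given value of `${singularity}`.

```shell
# Report which container runtime the submission step in [3] ends up
# using for a given ${singularity} value. Because the first branch
# already matches 8, the docker branch is only ever taken for 7.
pick_runtime() {
    case "$1" in
        6|8) echo "singularity" ;;
        7)   echo "docker" ;;
        *)   echo "!!! I am not prepared to run for slc$1." >&2
             return 1 ;;
    esac
}
```

Written this way each value can appear in only one branch, so a shadowed test like the second X8 cannot go unnoticed.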

suggestion

Finally, now that I am thinking about this, why don't we use singularity for cc7 also?

I tried on my lxplus to submit a task with cmssw 10 from /cvmfs/cms.cern.ch/common/cmssw-cc7 and it seems to work [4]; I don't see any reason why we should not. It would be one less docker image to maintain, and these singularity images are designed to run cmssw, while the cmsweb docker images feel like they were designed around a "comp" environment.

Are we using docker images only because singularity images were not mature yet when you started automating tests, and only adopted them for cmssw 7 afterwards?
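If singularity were used for every OS version, the submission step could shrink to something like the sketch below. This is only an illustration: the function names are hypothetical, and of the wrapper names only cmssw-cc7 is mentioned in this thread; cmssw-cc6 and cmssw-el8 are assumptions that would need checking in /cvmfs/cms.cern.ch/common. The `-- ./taskSubmission.sh` invocation follows the one already used in [3].

```shell
# Map the ${singularity} value to a CMSSW container wrapper.
# ASSUMPTION: the cc6 and el8 wrapper names are not confirmed here.
wrapper_for() {
    case "$1" in
        6) echo "/cvmfs/cms.cern.ch/common/cmssw-cc6" ;;
        7) echo "/cvmfs/cms.cern.ch/common/cmssw-cc7" ;;
        8) echo "/cvmfs/cms.cern.ch/common/cmssw-el8" ;;
        *) return 1 ;;
    esac
}

# Run the submission script inside the matching container.
submit_in_singularity() {
    local wrapper
    if ! wrapper="$(wrapper_for "$1")"; then
        echo "!!! I am not prepared to run for slc$1." >&2
        return 1
    fi
    echo "Starting singularity $1 container."
    "${wrapper}" -- ./taskSubmission.sh
}
```

With this layout there is a single code path for all releases, and supporting a new OS only means adding one line to the mapping.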


[1] https://cmssdt.cern.ch/dmwm-jenkins/job/CRABServer_ExecuteTests

[2] https://cmssdt.cern.ch/dmwm-jenkins/job/CRABServer_ExecuteTests/jobConfigHistory/showDiffFiles?timestamp1=2022-12-07_20-32-22&timestamp2=2023-01-27_14-05-02

[3] extract of the script from [1]:

#3. Submit tasks
if [ "X${singularity}" == X6 ] || [ "X${singularity}" == X8 ]; then
    echo "Starting singularity ${singularity} container."
[...]
    /cvmfs/cms.cern.ch/common/cmssw-${scramprefix} -- ./taskSubmission.sh || export ERR=true
elif [ "X${singularity}" == X7 ] || [ "X${singularity}" == X8 ] ; then
    echo "Starting CRAB testing container for slc${singularity}."
[...]
    docker run --rm $DOCKER_OPT $DOCKER_VOL $DOCKER_ENV --net=host \
    $Test_Docker_Image -c   \
    'source taskSubmission.sh' || export ERR=true
else 
    echo "!!! I am not prepared to run for slc${singularity}."
    exit 1
fi

[4] at the moment only 8 jobs out of 10 have finished, but I would call it a success anyway. I extracted the necessary commands from the scripts in jenkins; I can write a couple of lines in our markdown docs if you want.

https://cmsweb-test11.cern.ch/crabserver/ui/task/230127_174022%3Admapelli_crab_20230127_184018

belforte commented 1 year ago

About question 2 and suggestion in previous comment:

The topic is "which environment to use to submit test tasks and which to check results". Note that information for e.g. crab status comes as pkl or json files, so it does not depend on the CMSSW/python/OS version. For crab submit one needs to have the CMSSW environment for the specific application, since code from the framework is executed to parse the user's pset. But for status and many other commands one can use any supported CMSSW version (we still need cmsenv in order to get all dependencies).

When setting things up with Daina a couple of years ago, we had many long discussions about that. Many different ways of doing things will work, some easily, some with a bit of effort, and it was not straightforward which to pick. Eventually we tried to stick to the following guiding principles:

  1. when testing crab client the aim is to make sure that it keeps working for users when we change things, so we wanted to run in the "same environment as users". That means cmssw-sl6 for CMSSW_7 (there are no lxplus6 machines), but for the modern releases people work on lxplus+cmsenv, so we thought that an sl7 container (docker) plus manual addition of some rpms was the best way to stay close to real life. Surely the decision was influenced by the fact that Daina had started with an slc7 container already, and everybody likes to minimize changes. The situation is a bit different now that we need a singularity container for sl6 and another one for el8, so there's more pressure to be uniform.
  2. when checking task output, as I explained above, one could use any environment. This is not about validating specific crab commands, but only about verifying that jobs ran successfully and produced the expected result, so it would be good to use a single setup for all CMSSW releases.

As you see, there is still freedom to review and modify. It is also possible that we missed something, that CMSSW singularity images improved (e.g. inclusion of gfal and myproxy), and that we simply made mistakes.

We should have that discussion in a meeting and, if needed, track it in a different GH issue.

Let's keep this about "moving as much as we can from jenkins webpages to GH" !