reanahub / reana-demo-cms-reco

REANA example - CMS reconstruction
MIT License

access run-time to condition database #4

Closed dprelipcean closed 5 years ago

dprelipcean commented 5 years ago

What is needed

I am dealing with the problem mentioned by @katilp in #2:

the re-reconstruction task needs run-time access to the condition database, and as it is now, this is achieved with

ln -sf /cvmfs/cms-opendata-conddb.cern.ch/FT_53_LV5_AN1_RUNA FT_53_LV5_AN1
ln -sf /cvmfs/cms-opendata-conddb.cern.ch/FT_53_LV5_AN1_RUNA.db FT_53_LV5_AN1_RUNA.db

and clarified by @tiborsimko

Yes, the condition database on CVMFS can be accessed with any container, the only requirement is that the user should specify the necessary CVMFS volumes to be live-mounted in the reana.yaml resource section: https://reana.readthedocs.io/en/latest/userguide.html#declare-necessary-resources

From the files in #3, I am using the REANA resources as:

  resources:
    cvmfs:
      - cms-opendata-conddb.cern.ch

but still get the following error:

----- Begin Fatal Exception 09-Aug-2019 18:03:41 CEST-----------------------
An exception of category 'StdException' occurred while
   [0] Constructing the EventProcessor
   [1] Constructing ESSource: class=PoolDBESSource label='GlobalTag'
Exception Message:
A std::exception was thrown.
Connection on "sqlite_file:/cvmfs/cms-opendata-conddb.cern.ch/FT_53_LV5_AN1_RUNA.db" cannot be established ( CORAL : "ConnectionPool::getSessionFromNewConnection" from "CORAL/Services/ConnectionService" )
----- End Fatal Exception -------------------------------------------------

Two approaches

1. Local

Note: This is just a "workaround", as production will require the driver approach.

I've tried to mount cvmfs (already installed on the machine) directly on minikube, e.g.:

$ minikube mount /cvmfs:/cvmfs/

In this case, the reana.yaml file does not specify resources.
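For reference, the reana.yaml for this local approach would be just a sketch along these lines, with no resources block at all (the file name below is a placeholder, not necessarily this repository's actual path):

```yaml
# Sketch only: /cvmfs is bind-mounted into minikube directly,
# so no resources/cvmfs section is declared.
workflow:
  type: cwl
  file: workflow.cwl   # placeholder path
```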

2. Using the driver

When specifying resources, I get a Kubernetes PersistentVolumeClaim error. Here is the info; the pods of interest are those spawned for this workflow, namely:

$ kubectl get pods
NAME                                                   READY   STATUS    RESTARTS   AGE
batch-cwl-1b0133bb-c917-4495-96a6-52e728ff14dd-k5q6n   2/2     Running   0          49s
f086d097-5722-430e-a2e7-d3f248f7194b-nzpql             0/1     Pending   0          35s

The workflow pod correctly indicates that the CVMFS volume has to be mounted:

$ kubectl describe pod batch-cwl-1b0133bb-c917-4495-96a6-52e728ff14dd-k5q6n
    Environment:
      SHARED_VOLUME_PATH:                /var/reana
      REANA_USER_ID:                     00000000-0000-0000-0000-000000000000
      REANA_MOUNT_CVMFS:                 ['cms-opendata-conddb.cern.ch']
      JOB_CONTROLLER_SERVICE_PORT_HTTP:  5000
      JOB_CONTROLLER_SERVICE_HOST:       localhost

But then the next spawned pod stays pending because of an unbound PersistentVolumeClaim:

$ kubectl describe pod f086d097-5722-430e-a2e7-d3f248f7194b-nzpql
Name:           f086d097-5722-430e-a2e7-d3f248f7194b-nzpql
Namespace:      default
Priority:       0
Node:           <none>
Labels:         controller-uid=e8d19725-195c-418c-9b7b-6606657b48f1
                job-name=f086d097-5722-430e-a2e7-d3f248f7194b
Annotations:    <none>
Status:         Pending
IP:             
Controlled By:  Job/f086d097-5722-430e-a2e7-d3f248f7194b
Containers:
  f086d097-5722-430e-a2e7-d3f248f7194b:
    Image:      cmsopendata/cmssw_5_3_32
    Port:       <none>
    Host Port:  <none>
    Command:
      /bin/sh
      -c
      umask 2;export PATH="/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin";export TMPDIR="/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/1b0133bb-c917-4495-96a6-52e728ff14dd/cwl/tmpdir/L9j3Xw";export HOME="/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/1b0133bb-c917-4495-96a6-52e728ff14dd/cwl/docker_outdir";mkdir -p /var/reana/users/00000000-0000-0000-0000-000000000000/workflows/1b0133bb-c917-4495-96a6-52e728ff14dd/cwl/docker_outdir && cd /var/reana/users/00000000-0000-0000-0000-000000000000/workflows/1b0133bb-c917-4495-96a6-52e728ff14dd/cwl/docker_outdir && /bin/zsh -c 'source /opt/cms/cmsset_default.sh ;
      scramv1 project CMSSW CMSSW_5_3_32 ;
      cd CMSSW_5_3_32/src ;
      eval `scramv1 runtime -sh` ;
      mkdir WorkDir && cd ./WorkDir ;
      git clone -b 2011 git://github.com/cms-legacydata-validation/RAWToAODValidation.git ;
      cd RAWToAODValidation && cd code/DoubleElectron ;
      scram b ;
      ln -sf /cvmfs/cms-opendata-conddb.cern.ch/FT_53_LV5_AN1_RUNA FT_53_LV5_AN1; 
      ln -sf /cvmfs/cms-opendata-conddb.cern.ch/FT_53_LV5_AN1_RUNA.db FT_53_LV5_AN1_RUNA.db; 
      ls -l; 
      ls -l /cvmfs/ ;
      cmsRun raw_DoubleElectron11.py' > /var/reana/users/00000000-0000-0000-0000-000000000000/workflows/1b0133bb-c917-4495-96a6-52e728ff14dd/cwl/docker_outdir/step1.log; cp -r /var/reana/users/00000000-0000-0000-0000-000000000000/workflows/1b0133bb-c917-4495-96a6-52e728ff14dd/cwl/docker_outdir/* /var/reana/users/00000000-0000-0000-0000-000000000000/workflows/1b0133bb-c917-4495-96a6-52e728ff14dd/cwl/outdir/vxgvPP
    Environment:  <none>
    Mounts:
      /cvmfs/cms-opendata-conddb.cern.ch from cms-opendata-conddb-cvmfs-volume (rw)
      /etc/reana/secrets from 00000000-0000-0000-0000-000000000000 (ro)
      /var/reana/users/00000000-0000-0000-0000-000000000000/workflows/1b0133bb-c917-4495-96a6-52e728ff14dd from reana-shared-volume (rw,path="users/00000000-0000-0000-0000-000000000000/workflows/1b0133bb-c917-4495-96a6-52e728ff14dd")
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qhv2v (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  00000000-0000-0000-0000-000000000000:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  00000000-0000-0000-0000-000000000000
    Optional:    false
  reana-shared-volume:
    Type:          HostPath (bare host directory volume)
    Path:          /var/reana
    HostPathType:  
  cms-opendata-conddb-cvmfs-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  csi-cvmfs-cms-opendata-conddb-pvc
    ReadOnly:   false
  default-token-qhv2v:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-qhv2v
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  51s (x5 over 4m57s)  default-scheduler  pod has unbound immediate PersistentVolumeClaims
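An unbound claim means the cluster has no PersistentVolume or CSI provisioner able to satisfy it. To dig further, one could inspect the claim and the available storage classes with standard kubectl commands (run against the live cluster; the claim name is taken from the pod description above):

```shell
# Inspect the claim the scheduler cannot bind; the Events section
# usually says why provisioning failed
kubectl describe pvc csi-cvmfs-cms-opendata-conddb-pvc

# List storage classes; a CVMFS CSI provisioner must be registered
# for the claim to be satisfiable
kubectl get storageclass

# Check whether any CSI driver pods are running at all
kubectl get pods --all-namespaces
```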

This pending pod makes my workflow look as if it's running to infinity and beyond:

$ reana-client status -w workflow.1
NAME       RUN_NUMBER   CREATED               STATUS    PROGRESS
workflow   1            2019-08-13T09:09:22   running   0/1    
katilp commented 5 years ago

There has been a "feature" with the CVMFS mount on containers: the directory is not found on the first try, but running

ls -l
ls -l /cvmfs/

before the run ensures that it is available (http://opendata.cern.ch/docs/cms-guide-for-condition-database).

Maybe check if this is the case.
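That warm-up trick can be made explicit with a small retry loop run before cmsRun (a sketch only; wait_for_mount is a hypothetical helper name, not part of any CMS or REANA tooling):

```shell
# Sketch: autofs mounts CVMFS repositories on first access, so the
# first listing may fail; retry a few times before giving up.
wait_for_mount() {
  dir=$1
  for _ in 1 2 3 4 5; do
    # any access triggers the autofs mount; success means it is visible
    ls "$dir" >/dev/null 2>&1 && return 0
    sleep 1
  done
  return 1
}

# usage (before cmsRun):
#   wait_for_mount /cvmfs/cms-opendata-conddb.cern.ch && ls -l /cvmfs/
```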

dprelipcean commented 5 years ago

These commands are already included. When running in the CMS VM, my output is as expected:

$ ls -l
total 56240
-rw-r--r-- 1 cms-opendata cms-opendata      332 Aug  8 10:38 BuildFile.xml
lrwxrwxrwx 1 cms-opendata cms-opendata       53 Aug  8 10:38 FT_53_LV5_AN1 -> /cvmfs/cms-opendata-conddb.cern.ch/FT_53_LV5_AN1_RUNA
lrwxrwxrwx 1 cms-opendata cms-opendata       56 Aug 12 09:53 FT_53_LV5_AN1_RUNA.db -> /cvmfs/cms-opendata-conddb.cern.ch/FT_53_LV5_AN1_RUNA.db
-rw-r--r-- 1 cms-opendata cms-opendata     1159 Aug  8 10:38 analyzer_DoubleElectron11_rawreco.py
drwxr-xr-x 2 cms-opendata cms-opendata     4096 Aug  8 10:38 histos
drwxr-xr-x 2 cms-opendata cms-opendata     4096 Aug  8 10:38 python
-rw-r--r-- 1 cms-opendata cms-opendata     3276 Aug  8 10:38 raw_DoubleElectron11.py
-rw-r--r-- 1 cms-opendata cms-opendata 57561517 Aug  8 11:03 reco_DoubleElectron11_AOD.root
drwxr-xr-x 2 cms-opendata cms-opendata     4096 Aug  8 10:38 src
$ ls -l /cvmfs/
total 23
drwxr-xr-x  8 root root 4096 Jan 13  2014 cernvm-prod.cern.ch
drwxr-xr-x 12  989  984 4096 Jul 12  2016 cms-ib.cern.ch
drwxr-xr-x 12  989  984 4096 Dec 16  2015 cms-opendata-conddb.cern.ch
drwxr-xr-x 61  989  984 4096 Aug 29  2014 cms.cern.ch
drwxr-xr-x  3  989  984 4096 May 28  2014 cvmfs-config.cern.ch

But when running it with reana (or just in the docker image), what I get is:

$ ls -l
total 44
-rw-r--r-- 1 cmsusr cmsusr   331 Aug 12 09:46 BuildFile.xml
-rw-r--r-- 1 cmsusr cmsusr 16435 Aug 12 09:46 DoubleMu.root
lrwxrwxrwx 1 cmsusr cmsusr    53 Aug 12 09:47 FT_53_LV5_AN1 -> /cvmfs/cms-opendata-conddb.cern.ch/FT_53_LV5_AN1_RUNA
lrwxrwxrwx 1 cmsusr cmsusr    56 Aug 12 09:47 FT_53_LV5_AN1_RUNA.db -> /cvmfs/cms-opendata-conddb.cern.ch/FT_53_LV5_AN1_RUNA.db
-rw-r--r-- 1 cmsusr cmsusr  2560 Aug 12 09:46 README.md
drwxr-xr-x 2 cmsusr cmsusr  4096 Aug 12 09:46 datasets
-rw-r--r-- 1 cmsusr cmsusr  3589 Aug 12 09:46 demoanalyzer_cfg.py
drwxr-xr-x 2 cmsusr cmsusr  4096 Aug 12 09:46 python
drwxr-xr-x 2 cmsusr cmsusr  4096 Aug 12 09:46 src
$ ls -l /cvmfs/
total 0
clelange commented 5 years ago

If you just run the docker image, CVMFS is not mounted, and therefore there's only an empty directory. If you mount CVMFS via reana (in particular cms-opendata-conddb.cern.ch), you should see a directory structure though. From looking at your config it seems you do not mount CVMFS.

dprelipcean commented 5 years ago

From looking at your config it seems you do not mount CVMFS.

Not sure what you mean here, as cms-opendata-conddb.cern.ch is specified in the resources section of reana.yaml.

@diegodelemos When inspecting the Kubernetes pod I see that it should mount CVMFS, but I don't think it does so.

$ kubectl describe pod batch-cwl-d32746b5-e5ec-40dc-bc5d-1dca47aa1bef-t5qnx

    Environment:
      SHARED_VOLUME_PATH:                /var/reana
      REANA_USER_ID:                     00000000-0000-0000-0000-000000000000
      REANA_MOUNT_CVMFS:                 ['cms-opendata-conddb.cern.ch']
      JOB_CONTROLLER_SERVICE_PORT_HTTP:  5000
      JOB_CONTROLLER_SERVICE_HOST:       localhost
    Mounts:
      /var/reana/users/00000000-0000-0000-0000-000000000000/workflows/d32746b5-e5ec-40dc-bc5d-1dca47aa1bef from reana-shared-volume (rw,path="users/00000000-0000-0000-0000-000000000000/workflows/d32746b5-e5ec-40dc-bc5d-1dca47aa1bef")
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-2szhg (ro)

Where should we look for more info?

dprelipcean commented 5 years ago

Update: in the config file of reana-commons, there was no cms-opendata-conddb.cern.ch entry.
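For illustration only: the mechanism amounts to a whitelist of mountable repositories, so a repository absent from the list is silently never mounted. The variable and helper below are hypothetical names, not the actual reana-commons API:

```shell
# Hypothetical sketch: volume mounts are only generated for
# whitelisted repositories, so a missing entry is silently skipped.
CVMFS_REPOSITORIES="cms.cern.ch cms-opendata-conddb.cern.ch"

is_mountable() {
  for repo in $CVMFS_REPOSITORIES; do
    [ "$repo" = "$1" ] && return 0
  done
  return 1
}
```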

dprelipcean commented 5 years ago

Has been fixed for local testing by this PR; closing for now.