LSSTDESC / desc-gen3-prod

Desc-prod wrapper for pipeline production using gen3_workflow.
BSD 3-Clause "New" or "Revised" License

Tasks are failing because g3wf-run-task is not found #7

Open dladams opened 11 months ago

dladams commented 11 months ago

The tasks in my shifter:16 jobs are failing this week with the error

/bin/bash: g3wf-run-task: command not found

Jobs last week did not have this problem.

dladams commented 11 months ago

I ran some g3wfpipe test jobs with config w2344-visit:250-pipe:ccd-init-proc. These howfigs fail (i.e. tasks have the above command not found error):

1272, 1273    shifter:16-wq:135-bproc:30-tmax:30m:29m
1280          shifter:16-wq:27-tmax:60m

and these are ok:

1274, 1275  shifter:16-tp:50-bproc:30-tmax:30m:29m
1276, 1277       cvmfs-wq:135-bproc:30-tmax:30m:29m
1278, 1279       cvmfs-tp:50-bproc:30-tmax:30m:29m

benclifford commented 11 months ago

Is this coming from inside the shifter image? Is it a different shifter image from the one used when it last ran successfully for you?

benclifford commented 11 months ago

I'm also seeing this when running with wq (not using shifter), with this sequence of jobs:

g3wfpipe w2348-visit:277-pipe:isr-init cvmfs-wq:50-tmax:29m
g3wfpipe job:1315-proc pmbs-30,tmax:29m
g3wfpipe job:1315-proc tmax:29m

benclifford commented 11 months ago

I haven't investigated what changed since I last ran stuff in descprod, which was quite a while ago. But I fixed it in my environment like this:

diff --git a/scripts/runapp-g3wfpipe b/scripts/runapp-g3wfpipe
index 9415ab4..f77f3c2 100755
--- a/scripts/runapp-g3wfpipe
+++ b/scripts/runapp-g3wfpipe
@@ -595,8 +595,8 @@ if [[ -n $WFARGS ]]; then
   echo "  echo setup: Installing desc-gen3-prod." >>$RFIL
   echo "  pip install -t local/desc-gen3-prod git+https://github.com/LSSTDESC/desc-gen3-prod" >>$RFIL
   echo "fi" >>$RFIL
-  echo 'PATH=./local/desc-gen3-prod/bin:$PATH' >>$RFIL
-  echo 'PYTHONPATH=./local/desc-gen3-prod:$PYTHONPATH' >>$RFIL
+  echo 'PATH=$(pwd)/local/desc-gen3-prod/bin:$PATH' >>$RFIL
+  echo 'PYTHONPATH=$(pwd)/local/desc-gen3-prod:$PYTHONPATH' >>$RFIL
   echo 'echo setup: Running with desc-gen3-prod version $(desc-gen3-prod-version)' >>$RFIL
   echo 'echo setup: Location for g3wf-run-task: $(which g3wf-run-task)' >>$RFIL
   echo 'echo setup: Running with desc-wfmon version $(desc-wfmon-parsltest -v)' >>$RFIL

Using . in a $PATH is pretty fragile if anything is going to be running in an arbitrary directory later on, and so that made me a bit suspicious here...
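
To illustrate the fragility, here is a minimal sketch in a throwaway directory. The command name demo-task and the directory layout are hypothetical stand-ins, not the real g3wf-run-task install; the point is only that a relative $PATH entry is resolved against whatever the current directory is at lookup time.

```shell
#!/bin/bash
# Sketch: a relative PATH entry stops resolving after a cd.
# "demo-task" and the local/bin layout are hypothetical stand-ins.
workdir=$(mktemp -d)
mkdir -p "$workdir/local/bin" "$workdir/elsewhere"
printf '#!/bin/bash\necho ok\n' > "$workdir/local/bin/demo-task"
chmod +x "$workdir/local/bin/demo-task"

orig_path=$PATH
cd "$workdir"

# Report whether demo-task can be located through the current PATH.
lookup() {
  if command -v demo-task >/dev/null 2>&1; then echo found; else echo missing; fi
}

# Relative entry: works only while the cwd is the directory it was set in.
PATH=./local/bin:$orig_path
rel_here=$(lookup)
cd elsewhere
rel_moved=$(lookup)

# Absolute entry: keeps working regardless of the cwd.
cd "$workdir"
PATH=$(pwd)/local/bin:$orig_path
cd elsewhere
abs_moved=$(lookup)

echo "relative, same dir:  $rel_here"
echo "relative, after cd:  $rel_moved"
echo "absolute, after cd:  $abs_moved"

# Clean up the scratch directory.
PATH=$orig_path
cd /
rm -rf "$workdir"
```

In the patch above, the single quotes around the echo argument mean $(pwd) is expanded when the generated setup file is sourced, not when runapp-g3wfpipe writes it, so the absolute path reflects the directory in use at setup time.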

I'm not sure if this is relevant to the shifter problem that this issue reports, but maybe there is something going on in the same direction as I experienced.