Open dladams opened 11 months ago
I ran some g3wfpipe test jobs with config w2344-visit:250-pipe:ccd-init-proc. These howfigs fail (i.e. tasks have the above command not found error):
1272, 1273 shifter:16-wq:135-bproc:30-tmax:30m:29m
1280 shifter:16-wq:27-tmax:60m
and these are ok:
1274, 1275 shifter:16-tp:50-bproc:30-tmax:30m:29m
1276, 1277 cvmfs-wq:135-bproc:30-tmax:30m:29m
1278, 1279 cvmfs-tp:50-bproc:30-tmax:30m:29m
Is this coming from inside the shifter image? Is it a different shifter image from when it last ran successfully for you?
I'm also seeing this when running with wq (not using shifter), with this sequence of jobs:
g3wfpipe w2348-visit:277-pipe:isr-init cvmfs-wq:50-tmax:29m
g3wfpipe job:1315-proc pmbs-30,tmax:29m
g3wfpipe job:1315-proc tmax:29m
I haven't investigated what changed since I last ran stuff in descprod, which was quite a while ago. But I fixed it in my environment like this:
diff --git a/scripts/runapp-g3wfpipe b/scripts/runapp-g3wfpipe
index 9415ab4..f77f3c2 100755
--- a/scripts/runapp-g3wfpipe
+++ b/scripts/runapp-g3wfpipe
@@ -595,8 +595,8 @@ if [[ -n $WFARGS ]]; then
echo " echo setup: Installing desc-gen3-prod." >>$RFIL
echo " pip install -t local/desc-gen3-prod git+https://github.com/LSSTDESC/desc-gen3-prod" >>$RFIL
echo "fi" >>$RFIL
- echo 'PATH=./local/desc-gen3-prod/bin:$PATH' >>$RFIL
- echo 'PYTHONPATH=./local/desc-gen3-prod:$PYTHONPATH' >>$RFIL
+ echo 'PATH=$(pwd)/local/desc-gen3-prod/bin:$PATH' >>$RFIL
+ echo 'PYTHONPATH=$(pwd)/local/desc-gen3-prod:$PYTHONPATH' >>$RFIL
echo 'echo setup: Running with desc-gen3-prod version $(desc-gen3-prod-version)' >>$RFIL
echo 'echo setup: Location for g3wf-run-task: $(which g3wf-run-task)' >>$RFIL
echo 'echo setup: Running with desc-wfmon version $(desc-wfmon-parsltest -v)' >>$RFIL
Using `.` in a $PATH is pretty fragile if anything is going to be running in an arbitrary directory later on, and so that made me a bit suspicious here...
I'm not sure if this is relevant to the shifter problem that this issue reports, but maybe there is something going on in the same direction as I experienced.
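To illustrate why the relative entry is fragile, here is a minimal sketch, not taken from runapp-g3wfpipe. The tool name `g3-demo-tool`, the helper `try_tool`, and the directory layout are all made up for the demo. The point is that a `./...` entry in $PATH is resolved against whatever the current directory is at lookup time, while a `$(pwd)/...` style absolute entry keeps working after a `cd`:

```shell
#!/usr/bin/env bash
# Demo only: "g3-demo-tool", "try_tool" and the directory layout are hypothetical.
workdir=$(mktemp -d)
mkdir -p "$workdir/local/bin" "$workdir/subdir"
printf '#!/bin/sh\necho ran\n' > "$workdir/local/bin/g3-demo-tool"
chmod +x "$workdir/local/bin/g3-demo-tool"

# Prepend PATH entry $1 while in $workdir, then cd to $2 and try to run the tool.
# Runs in a subshell so the caller's PATH and working directory are untouched.
try_tool() {
  ( cd "$workdir" && PATH="$1:$PATH" && cd "$2" && g3-demo-tool 2>/dev/null ) \
    || echo "not found"
}

rel_here=$(try_tool ./local/bin .)               # relative entry, same directory
rel_sub=$(try_tool ./local/bin subdir)           # relative entry, after cd
abs_sub=$(try_tool "$workdir/local/bin" subdir)  # absolute entry, after cd

echo "relative entry, same dir: $rel_here"
echo "relative entry, after cd: $rel_sub"
echo "absolute entry, after cd: $abs_sub"

rm -rf "$workdir"
```

The relative entry should only find the tool while still in the directory where it was set, which would match a task that changes directory before calling g3wf-run-task. Whether that is what happens in the shifter jobs is not confirmed here.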
The tasks in my shifter:16 jobs are failing this week with the error
Jobs last week did not have this problem.