Closed mambelli closed 9 months ago
To patch a 3.10.5 or 3.10.6 Factory, replace the content of /var/lib/gwms-factory/web-base/singularity_lib.sh
with https://raw.githubusercontent.com/mambelli/glideinwms/v310/i395_apptainer_test/creation/web_base/singularity_lib.sh and run a Factory upgrade.
This allows apptainer/singularity to be used as long as it returns exit code 0, and it provides further output for troubleshooting.
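The file replacement described above can be sketched as follows. This is only a sketch: the helper name install_patched_lib, the /tmp download location, and the .orig backup suffix are assumptions made for illustration, not part of GlideinWMS. The Factory upgrade itself is not scripted here; run it the way you normally do.

```sh
# Hypothetical helper: install a downloaded replacement over an existing file,
# keeping a copy of the original next to it with a .orig suffix (assumed name).
# $1: path of the downloaded replacement
# $2: path of the installed file to replace
install_patched_lib() {
    # Refuse to install a missing or empty download (e.g. a failed fetch)
    if [ ! -s "$1" ]; then
        echo "replacement $1 is missing or empty" >&2
        return 1
    fi
    # Keep the original so the patch can be reverted
    cp -p "$2" "$2.orig" || return 1
    cp "$1" "$2"
}

# Intended use on the Factory host (left commented out on purpose):
#   curl -fsSL -o /tmp/singularity_lib.sh \
#     https://raw.githubusercontent.com/mambelli/glideinwms/v310/i395_apptainer_test/creation/web_base/singularity_lib.sh
#   install_patched_lib /tmp/singularity_lib.sh \
#     /var/lib/gwms-factory/web-base/singularity_lib.sh
#   ...then run the Factory upgrade.
```

Keeping the .orig copy makes it easy to go back to the packaged file once the next release ships the fix.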
I started PR #396 to work on the Issue. I'd like some further troubleshooting before merging a final solution, but the current content (patch above) will get you going.
I updated PR #396 to fix the Issue. It was not considering the case when uid_map had no initial blank. The link above points to the updated solution, which fixes the validation and makes the test fail when the output is wrong. You can use it for patching until the next release.
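The uid_map case mentioned above can be illustrated with a small sketch. It assumes the validation inspects a line laid out like /proc/self/uid_map (three numeric fields separated by blanks). The helper name uid_map_line_ok is hypothetical, and this is not the code from PR #396.

```sh
# Hypothetical helper, not the code from PR #396.
# Succeeds when the line in $1 has three numeric fields separated by blanks,
# with or without leading (or trailing) blanks.
uid_map_line_ok() {
    printf '%s\n' "$1" |
        grep -Eq '^[[:space:]]*[0-9]+[[:space:]]+[0-9]+[[:space:]]+[0-9]+[[:space:]]*$'
}

# Both forms are accepted:
#   uid_map_line_ok '         0       1000          1'
#   uid_map_line_ok '0 1000 1'
```

A pattern that requires a blank before the first field would reject the second form, which is the kind of false negative described in this comment.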
Not re-opening this just yet, but a T0 operator reported an issue with "paused jobs":
These are the job log files: http://mmascher.web.cern.ch/mmascher/Job_1080.tar.bz2
I will investigate more tomorrow morning.
Thank you @mmascher
FYI: After some investigation, the issue with the T0 jobs does not seem to be related to this patch.
Describe the bug
A hosted CE is failing to detect apptainer. According to the detailed logs (job.3610066.0.err.txt, job.3610066.0.out.txt), apptainer runs correctly (EC 0) but its output fails validation. Possible causes are:
There could also be a problem with the image or with the Apptainer execution. This behavior requires further investigation.
To Reproduce
Apptainer is invoked in tmp/glide_Pb9H5L/main/singularity_setup.sh with the command:
And the result is:
Expected behavior
Getting EC 0 without the correct output is unexpected.
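Since the exit code alone is not enough here, a test can check the exit code and the output separately and report which one failed. The sketch below shows the idea. The helper name run_and_validate, its return codes, and the messages are assumptions for illustration and do not come from singularity_lib.sh.

```sh
# Hypothetical helper: run a test command and validate both its exit code
# and its output, printing the output on failure for troubleshooting.
# $1: extended regular expression that the output must match
# remaining arguments: the command to run
# Returns 0 on success, 1 on non-zero exit code, 2 on EC 0 with bad output.
# Note: pattern, output and ec are plain (global) shell variables.
run_and_validate() {
    pattern=$1
    shift
    if output=$("$@" 2>&1); then
        ec=0
    else
        ec=$?
    fi
    if [ "$ec" -ne 0 ]; then
        echo "test command failed with exit code $ec; output: $output" >&2
        return 1
    fi
    if ! printf '%s\n' "$output" | grep -Eq "$pattern"; then
        echo "test command returned 0 but the output did not validate: $output" >&2
        return 2
    fi
    return 0
}
```

Distinct return codes make it possible to tell a failed execution apart from the case reported here, where the command exits with 0 but the output is not the expected one.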