Open dcl10 opened 1 year ago
Hi @dcl10. With the information you have provided, I have been digging into WfExS, and then into cwltool. I found a couple of issues related to cwltool's support of newer versions of podman (see https://github.com/common-workflow-language/cwltool/issues/1884 and https://github.com/common-workflow-language/cwltool/issues/1883).
BTW, which version of podman are you using?
Hi @jmfernandez, sorry for the late reply. We have used podman 3.x, which is installed by default with apt install on Ubuntu. We've also tried 4.x, which has a slightly more complicated install that I can't remember. Either way, same result.
Hi again. Over the past weeks I created a couple of pull requests to cwltool to fix its issues with podman, and both of them were accepted. Until a cwltool release containing the fixes is published, the latest commits on the WfExS side install a development version of cwltool when the workflow is instantiated.
Also, I have pushed changes to the WfExS code related to podman container management, so a podman registry is now located in each working directory, along with the compressed container images needed to restore it. Previous implementations used a shared podman registry located in the shared WfExS caching directory, which is a problem if the cache is cleared or some file is tainted.
But the key part is that the compressed container images in the working directory, along with their metadata, govern the contents of the working directory's podman registry. If the working directory is transferred, most of the files and directories of the unpacked images in the podman registry cannot be copied, because of the way podman works: they are owned by other uids/gids. So, when a workflow is run, the integrity of the working directory's podman registry is now checked, so that it can be re-populated from the compressed container images.
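To make the idea concrete, here is a minimal sketch of such a check-and-restore step. It is not the actual WfExS code: the function name, the `archives` mapping and the injectable `run` callable are hypothetical, and the check is reduced to "is the image present", which is weaker than a real integrity check against the stored metadata. It only relies on the `podman --root`, `podman image exists` and `podman load --input` commands.

```python
import subprocess
from pathlib import Path
from typing import Callable, Dict, List, Sequence

# A runner takes a command line and returns its exit code.
# It is injectable so the logic can be exercised without podman installed.
Runner = Callable[[Sequence[str]], int]


def _run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(cmd, check=False).returncode


def repopulate_missing_images(
    storage_root: Path,
    archives: Dict[str, Path],
    run: Runner = _run,
) -> List[str]:
    """Reload every image tag that is absent from the per-workdir storage.

    `archives` (hypothetical layout) maps an image tag to its saved archive.
    Returns the tags which had to be reloaded.
    """
    base = ["podman", "--root", str(storage_root)]
    reloaded: List[str] = []
    for tag, archive in archives.items():
        # `podman image exists` exits with 0 when the image is present
        if run(base + ["image", "exists", tag]) == 0:
            continue
        # `podman load --input` restores an image from a saved archive
        if run(base + ["load", "--input", str(archive)]) != 0:
            raise RuntimeError(f"could not restore {tag} from {archive}")
        reloaded.append(tag)
    return reloaded
```

The point of this design is that the image archives, not the unpacked storage, are the source of truth: after copying a working directory elsewhere, the storage can be empty or broken and is simply rebuilt before execution.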
As a side note, I have also discovered that on Ubuntu 22.04 the installation of podman requires some tweaks, due to "interferences" with systemd-homed (see https://wiki.archlinux.org/title/Podman#Set_subuid_and_subgid and https://github.com/systemd/systemd/issues/21952 ).
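If you want a quick way to see whether your user is affected, a small check like the following may help. It is only a sketch: the function name is hypothetical, and it assumes the usual `name:start:count` line format of /etc/subuid and /etc/subgid, matching on the user name only (entries keyed by numeric uid are not handled).

```python
def has_subid_range(subid_text: str, user: str) -> bool:
    """Tell whether `user` has a non-empty range in subuid/subgid content."""
    for line in subid_text.splitlines():
        fields = line.strip().split(":")
        # Skip blank or malformed lines; expected shape is name:start:count
        if len(fields) != 3:
            continue
        name, _start, count = fields
        if name == user and count.isdigit() and int(count) > 0:
            return True
    return False
```

It can be fed the contents of both files, for instance `has_subid_range(open("/etc/subuid").read(), "myuser")`. If it returns False for either file, rootless podman has no subordinate ids to map, and the ranges have to be added as described in the first link above.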
Description
Using stage I can stage a workflow with podman. However, running the workflow with staged-workdir offline-exec I get the following error:
Fiddling with the code on a fork, I found adding --no-container or --user-space-docker-cmd isn't compatible with --podman.
In cwl_engine.py I found that commenting out the --disable-pull line seemed to fix the problem and the workflow runs as expected. However, I guess the --disable-pull is there for a good reason. Could something be preventing WfExS from looking where the podman image is saved for the staged image?