common-workflow-language / cwltool

Common Workflow Language reference implementation
https://cwltool.readthedocs.io/
Apache License 2.0

Workflows with `DockerRequirement` fail when using Podman 5.0.2 #2001

Closed tom-tan closed 20 hours ago

tom-tan commented 1 month ago

I found this issue when trying the tutorial with Podman 5.0.2, which is bundled with the latest Podman Desktop. The cause is an incompatible change in the behavior of --cidfile that was introduced in Podman somewhere between 3.4.4 (the version in the Ubuntu 22.04 repository) and 5.0.2.

With Docker, docker run --cidfile foo.cid generates foo.cid, and the file remains after the container instance finishes. I guess Podman 3.4.4 behaves the same as Docker. Podman 5.0.2, on the other hand, removes foo.cid after the instance finishes. Note that foo.cid is available while the container instance is running.

$ podman run --rm -it --cidfile foo.cid ubuntu sleep 1
$ ls foo.cid
ls: cannot access 'foo.cid': No such file or directory

When we execute podman without --rm, foo.cid remains as shown below:

$ podman run -it --cidfile foo.cid ubuntu sleep 1
$ ls foo.cid
foo.cid

A possible workaround is to add the --remote option to the podman invocation, as mentioned in the podman-run manual. However, I am not sure it is a robust solution.

Expected Behavior

The workflow rna_seq_workflow_1.cwl succeeds, as described in the tutorial.

$ cwltool --podman rna_seq_workflow_1.cwl workflow_input_1.yml
...
Analysis complete for GSM461177_2_subsampled.fastqsanger
INFO [job quality_control] Max memory used: 179MiB
INFO [job quality_control] completed success
INFO [step quality_control] completed success
INFO [workflow ] completed success
{
    "quality_report": {
        "location": "file:///.../GSM461177_2_subsampled.fastqsanger_fastqc.html",
        "basename": "GSM461177_2_subsampled.fastqsanger_fastqc.html",
        "class": "File",
        "checksum": "sha1$e820c530b91a3087ae4c53a6f9fbd35ab069095c",
        "size": 378324,
        "path": "/.../GSM461177_2_subsampled.fastqsanger_fastqc.html"
    }
}
INFO Final process status is success

Actual Behavior

It fails with the following message:

$ cwltool --podman rna_seq_workflow_1.cwl workflow_input_1.yml
...
INFO [job quality_control] /private/tmp/docker_tmp5uqoak7j$ podman \
    run \
    -i \
    --userns=keep-id \
    --mount=type=bind,source=/private/tmp/docker_tmp5uqoak7j,target=/fNjnxO \
    --mount=type=bind,source=/private/tmp/docker_tmp8cxuhl3a,target=/tmp \
    --mount=type=bind,source=/Users/tanjo/repos/cwltutorial/novice-tutorial-exercises/rnaseq/GSM461177_2_subsampled.fastqsanger,target=/var/lib/cwl/stg5d2c2f8c-bddb-40f1-adff-1dab518cba4c/GSM461177_2_subsampled.fastqsanger,readonly \
    --workdir=/fNjnxO \
    --read-only=true \
    --user=501:20 \
    --rm \
    --cidfile=/private/tmp/docker_tmp2f4sckek/20240514172245-060800.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/fNjnxO \
    quay.io/biocontainers/fastqc:0.11.9--hdfd78af_1 \
    fastqc \
    --extract \
    --outdir \
    . \
    /var/lib/cwl/stg5d2c2f8c-bddb-40f1-adff-1dab518cba4c/GSM461177_2_subsampled.fastqsanger
...
ERROR 'podman' not found: [Errno 2] No such file or directory: '/private/tmp/docker_tmp2f4sckek/20240514172245-060800.cid'
...
INFO [workflow ] completed permanentFail
...
WARNING Final process status is permanentFail
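The ERROR line above names 'podman', yet the path in the message is the cidfile, so the report points at the wrong culprit. As a hedged illustration only, assuming the message comes from a handler that treats every errno 2 as a missing executable (this is not confirmed here), the two cases could be told apart as follows. `describe_enoent` is a hypothetical helper, not part of cwltool.

```python
import errno
import shutil


def describe_enoent(exc: OSError, runtime: str) -> str:
    """Build a message that separates a missing runtime from a missing file."""
    if exc.errno != errno.ENOENT:
        return f"unexpected error: {exc}"
    if shutil.which(runtime) is None:
        # The container runtime itself cannot be found on PATH.
        return f"'{runtime}' not found: {exc}"
    # The runtime exists, so some other path (here, the cidfile) is missing.
    return f"file missing while running '{runtime}': {exc.filename}"
```

With a check like this, the failure in this report would be described as a missing cidfile rather than as a missing podman binary.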

Workflow Code

cwlVersion: v1.2
class: Workflow

inputs:
  rna_reads_fruitfly: File

steps:
  quality_control:
    run: bio-cwl-tools/fastqc/fastqc_2.cwl
    in:
      reads_file: rna_reads_fruitfly
    out: [html_file]

outputs:
  quality_report:
    type: File
    outputSource: quality_control/html_file

Your Environment

Server: Podman Engine
Version: 5.0.0-dev-8a643c243
API Version: 5.0.0-dev-8a643c243
Go Version: go1.21.8
Built: Mon Mar 18 01:00:00 2024
OS/Arch: linux/arm64

tom-tan commented 1 month ago

Note: I doubt that this incompatible behavior of Podman will be "fixed", because the PR that clarifies this behavior in the documentation was accepted.