PacificBiosciences / HiFi-human-WGS-WDL

BSD 3-Clause Clear License

extract_read_length_and_qual.py: command not found #141

Closed Mo7ammedFarahat closed 3 months ago

Mo7ammedFarahat commented 4 months ago

Hello, I got this error.

I couldn't find this Python script anywhere.

/cromwell-executions/sample_analysis/02a07f7f-21a2-4a5d-adc8-525abbc82a7c/call-pbmm2_align/shard-0/execution/script: line 41: extract_read_length_and_qual.py: command not found

williamrowell commented 4 months ago

I suspect that you have a version mismatch either between the repo and the wdl-common submodule or between the repo and the Docker containers. In v1. of this workflow, this script is within the pbmm2 container. In the v2. prerelease (unsupported at the moment) of this workflow, the script is embedded directly within the task as a heredoc.

1) When you pulled the code from GitHub, did you pull the matching version of the wdl-common submodule, either with git submodule update --init or the --recursive flag as shown below?

```bash
git clone \
  --depth 1 --branch v1.1.1 \
  --recursive \
  https://github.com/PacificBiosciences/HiFi-human-WGS-WDL.git
```

2) Did you use our Docker containers on Quay.io, or did you build them yourself? Did you modify any of the sha256 tags for the Docker containers within the WDL code, or are you using the WDL code exactly as provided?
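For the first question, the state of the submodule can be read from `git submodule status`, which prefixes each line with `-` when the submodule is not initialized and `+` when the checked-out commit differs from the one recorded in the repo. A minimal sketch; `classify_submodule_line` and `check_submodules` are hypothetical helper names, not part of the workflow:

```bash
# Classify one line of `git submodule status` output by its first character.
# Per git's documentation: '-' = not initialized, '+' = checked-out commit
# differs from the commit recorded in the superproject, 'U' = merge conflicts.
classify_submodule_line() {
  case "$1" in
    -*) echo "uninitialized" ;;
    +*) echo "mismatch" ;;
    U*) echo "conflict" ;;
    *)  echo "ok" ;;
  esac
}

# Hypothetical helper: report every submodule that is not in the expected state.
check_submodules() {
  git submodule status --recursive | while IFS= read -r line; do
    state=$(classify_submodule_line "$line")
    [ "$state" = "ok" ] || echo "$state: $line"
  done
}

# Usage, from the root of the cloned repository:
#   check_submodules
# Fix for an uninitialized or mismatched submodule:
#   git submodule update --init --recursive
```

If `check_submodules` prints nothing, the submodules match what the checked-out release expects.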

Mo7ammedFarahat commented 3 months ago

I'm using the containers specified in the WDL, and I got this error:

Docker image /pbmm2@sha256:1013aa0fd5fb42c607d78bfe3ec3d19e7781ad3aa337bf84d144c61ed7d51fa1 has an invalid syntax.

Then I changed that tag to another one from quay.io, the latest build, which is this one:

pbmm2@sha256:265eef770980d93b849d1ddb4a61ac449f15d96981054e91d29da89943084e0e

It pulled successfully, but then got stuck at the Python script.

Mo7ammedFarahat commented 3 months ago

Here is the error I got when I used the WDL without any modifications:

[2024-07-30 13:15:08,38] [warn] BackendPreparationActor_for_fce9cb33:sample_analysis.pbmm2_align:0:1 [fce9cb33]: Docker lookup failed java.lang.Exception: Failed to get docker hash for quay.io/pacbio/pbmm2@sha256:1013aa0fd5fb42c607d78bfe3ec3d19e7781ad3aa337bf84d144c61ed7d51fa1 Error connecting to https://quay.io using address quay.io:443 (unresolved: false)

Note that in input.json I specified this:

"container_registry": "quay.io/pacbio"

However, when I looked in stdout.submit, I found this:

Image /scratch3/users/mohammedfarahat/.singularity/cache/quay.io_pacbio_pbmm2@sha256:1013aa0fd5fb42c607d78bfe3ec3d19e7781ad3aa337bf84d144c61ed7d51fa1.sif already exists. Yay Submitted batch job 9901877

williamrowell commented 3 months ago

> I'm using the containers specified in the WDL, and I got this error:
>
> Docker image /pbmm2@sha256:1013aa0fd5fb42c607d78bfe3ec3d19e7781ad3aa337bf84d144c61ed7d51fa1 has an invalid syntax.

This is because container_registry was set to an empty string "". In JSON, an empty string is not the same as null.
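The invalid reference can be reproduced with plain string concatenation. This is a minimal sketch, assuming the image reference is assembled as registry/name@digest (the error message suggests this, but the actual WDL expression is not shown in this thread); `image_ref` is a hypothetical helper:

```bash
# Hypothetical helper: join registry, image name, and digest the way the
# error message suggests the workflow does.
image_ref() {
  local registry="$1" name="$2" digest="$3"
  printf '%s/%s@%s\n' "$registry" "$name" "$digest"
}

digest="sha256:1013aa0fd5fb42c607d78bfe3ec3d19e7781ad3aa337bf84d144c61ed7d51fa1"

# An empty registry is still a value, so nothing replaces it and the
# reference starts with a bare slash.
image_ref "" pbmm2 "$digest"
# prints: /pbmm2@sha256:1013aa0fd5fb42c607d78bfe3ec3d19e7781ad3aa337bf84d144c61ed7d51fa1

# With the registry set, the reference is well formed.
image_ref "quay.io/pacbio" pbmm2 "$digest"
# prints: quay.io/pacbio/pbmm2@sha256:1013aa0fd5fb42c607d78bfe3ec3d19e7781ad3aa337bf84d144c61ed7d51fa1
```

The first output matches the reference in the "invalid syntax" error, which is why the registry has to be a real value or omitted rather than "".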

> Then I changed that tag to another one from quay.io, the latest build, which is this one:
>
> pbmm2@sha256:265eef770980d93b849d1ddb4a61ac449f15d96981054e91d29da89943084e0e
>
> It pulled successfully, but then got stuck at the Python script.

This is what caused your original problem. By modifying the sha256 tag for this Docker image in the WDL file, you've introduced a version mismatch. The original tag pointed to an image that contains this Python script. The new tag points to an image that does not. Revert to the original pbmm2 image sha256 and this error will go away.
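One way to confirm which image contains the script is to look for it on the PATH inside the container. A minimal sketch, assuming Singularity is the container runtime (as the stdout.submit output in this thread suggests); `has_command` is a hypothetical helper and `IMAGE` is a placeholder for the path to a cached .sif file:

```bash
# Return success if the named command can be found on PATH in this shell.
has_command() {
  command -v "$1" >/dev/null 2>&1
}

# Inside the container, this is the check that matters:
#   singularity exec "$IMAGE" sh -c 'command -v extract_read_length_and_qual.py'
# where IMAGE is a placeholder for the path to your cached .sif file.
if has_command extract_read_length_and_qual.py; then
  echo "script found on PATH"
else
  echo "script missing from PATH"
fi
```

Run against both images, the check should succeed only for the one the WDL originally referenced.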

> Here is the error I got when I used the WDL without any modifications:
>
> [2024-07-30 13:15:08,38] [warn] BackendPreparationActor_for_fce9cb33:sample_analysis.pbmm2_align:0:1 [fce9cb33]: Docker lookup failed java.lang.Exception: Failed to get docker hash for quay.io/pacbio/pbmm2@sha256:1013aa0fd5fb42c607d78bfe3ec3d19e7781ad3aa337bf84d144c61ed7d51fa1 Error connecting to https://quay.io using address quay.io:443 (unresolved: false)

This is a different problem altogether, and it's more of a warning (non-fatal) than an error. This is a log message for Cromwell checking the docker repo for the hash and size of the pbmm2 container image. This information is used by Cromwell for call caching. I suspect that the compute nodes on your cluster contact the internet using a proxy. Can you confirm this with your sysadmins?
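Before asking the sysadmins, a quick hint can come from the proxy environment variables on a compute node. This is a sketch only: `show_proxy_settings` is a hypothetical helper, the variable names are the common convention rather than anything specific to this cluster, and a transparent proxy would not show up here at all.

```bash
# Print any proxy-related environment variables exported in the current shell.
# Both lower- and upper-case names are checked because tools differ in which
# they read. Returns non-zero when none are set.
show_proxy_settings() {
  local found=1 name value
  for name in http_proxy https_proxy no_proxy HTTP_PROXY HTTPS_PROXY NO_PROXY; do
    value=$(printenv "$name" || true)
    if [ -n "$value" ]; then
      echo "$name=$value"
      found=0
    fi
  done
  return "$found"
}

show_proxy_settings || echo "no proxy variables set in this shell"
```

Run it inside a submitted job rather than on the login node, since the two environments can differ.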

If this is the case, then this is a known issue with Cromwell. Cromwell will not use proxies for Docker hash lookups, and the Cromwell dev team has no plans to support this (https://github.com/broadinstitute/cromwell/pull/7114#issuecomment-1545803504).

One solution is setting up an internal container repo server that Cromwell can access without a proxy. Another solution would be using miniwdl instead of Cromwell.

> Note that in input.json I specified this:
>
> "container_registry": "quay.io/pacbio"
>
> However, when I looked in stdout.submit, I found this:
>
> Image /scratch3/users/mohammedfarahat/.singularity/cache/quay.io_pacbio_pbmm2@sha256:1013aa0fd5fb42c607d78bfe3ec3d19e7781ad3aa337bf84d144c61ed7d51fa1.sif already exists. Yay Submitted batch job 9901877

Yes, so this is showing that you already have this image cached and available. The Failed to get docker hash error above is likely due to the inability of Cromwell to use proxies, as discussed above.

These issues are more about Cromwell/miniwdl setup than bugs/features of the workflow. Can you email support@pacb.com? We'll try to help you get set up. Please mention your HPC job scheduler in the email.