Closed vyx-lucy-kaplun closed 1 year ago
Hi @vyx-lucy-kaplun! `workflow-glue` is distributed with the workflow, so it should definitely be available. Are you able to run the following commands to help diagnose why it cannot be found?
grep 'export PATH' ~/my-workspace/EPI2ME_WF_HUMAN_VARIANTION_jdk19/output/workspace/b4/02196c40dce6923325b53300807246/.command.run
git --git-dir ~/.nextflow/assets/epi2me-labs/wf-human-variation/.git ls-files -s | grep bin
Here are the outputs of these commands:
1) export PATH="\$PATH:/home/usr/.nextflow/assets/epi2me-labs/wf-human-variation/bin"
2)
100755 7e25f4fe5a7ff99d7d2d0dcd762f3b46587e296c 0 bin/get_filter_calls_command.py
100755 25cd6754322a351236ec1e4603475e06eba9058c 0 bin/resolve_clair3_model.py
100755 1757c10f5de3f0016c28edeaa7d25ba332f15d43 0 bin/run_qdnaseq.r
100755 00185be043d245ce095fcadf1fcd0c2d69e1668c 0 bin/workflow-glue
100755 aa1ae1f4b2d67554a19af9f232dcfc17cb402083 0 bin/workflow_glue/__init__.py
100755 20704e4f9c7eccfc514c3b80e27b48db66b004db 0 bin/workflow_glue/check_sample_sheet.py
100755 027ecea1a7955ff1edcbce003ffd624dd3a86c06 0 bin/workflow_glue/check_sq_ref.py
100755 63b3d11e8021be897a9fb1713f432657ffb541bf 0 bin/workflow_glue/cnv_plot.py
100755 979b626f568bfe104410ddf4b8d68320895671d4 0 bin/workflow_glue/configure_jbrowse.py
100755 effe4eb6978344fb92459f84a0b7298277d96b82 0 bin/workflow_glue/fix_vcf.py
100755 ec14973e39b2e224d3d611b100a9b19e69e880cc 0 bin/workflow_glue/get_genome.py
100755 fd9fae4888fc0b547f9a29b99a0865cd64474a1d 0 bin/workflow_glue/report.py
100755 6f05ad8e0ba763f6dd1de8782f568f9969e1ce84 0 bin/workflow_glue/report_sv.py
100755 71bd80ff2c016a66ca048a46a44ec4a1428ccfa3 0 bin/workflow_glue/tests/__init__.py
100755 0530eaa74298839c1cf29e8d7c5e72f61b67ae01 0 bin/workflow_glue/tests/test_test.py
100755 9359ac708e96ba777c173c8b6c1458184e7416e0 0 bin/workflow_glue/util.py
Looks normal! Just to test...
python ~/.nextflow/assets/epi2me-labs/wf-human-variation/bin/workflow-glue
The result of this command is:
Traceback (most recent call last):
File "/home/usr/.nextflow/assets/epi2me-labs/wf-human-variation/bin/workflow-glue", line 7, in <module>
cli()
File "/home/usr/.nextflow/assets/epi2me-labs/wf-human-variation/bin/workflow_glue/__init__.py", line 51, in cli
f'{_package_name}.{comp}' for comp in get_components()]
File "/home/usr/.nextflow/assets/epi2me-labs/wf-human-variation/bin/workflow_glue/__init__.py", line 22, in get_components
mod = importlib.import_module(f"{_package_name}.{name}")
File "/usr/local/lib/python3.9/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 850, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "/home/usr/.nextflow/assets/epi2me-labs/wf-human-variation/bin/workflow_glue/cnv_plot.py", line 6, in <module>
from dominate.tags import h6, img, p, span, table, tbody, td, th, thead, tr
ModuleNotFoundError: No module named 'dominate'
@vyx-lucy-kaplun So far so good! The PATH inside the Nextflow job is set correctly and points to a directory that contains `workflow-glue`, which exists and is executable (we can ignore the ModuleNotFoundError as we're not using the container environment)... Can you try:
grep 'docker run' ~/my-workspace/EPI2ME_WF_HUMAN_VARIANTION_jdk19/output/workspace/b4/02196c40dce6923325b53300807246/.command.run
Here is the result:
docker run -i --cpus 1.0 -e "NXF_DEBUG=${NXF_DEBUG:=0}" -v /home/usr:/home/usr -w "$PWD" --user $(id -u):$(id -g) --group-add 100 --name $NXF_BOXID ontresearch/wf-human-variation:shaa6d218582d6056ea970b73e61f138ebb0ce6c5b1 /bin/bash -c "eval $(nxf_container_env); /bin/bash /home/usr/my-workspace/EPI2ME_WF_HUMAN_VARIANTION_jdk19/output/workspace/b4/02196c40dce6923325b53300807246/.command.run nxf_trace"
@vyx-lucy-kaplun Thanks for your patience, so far this all looks set up correctly! In lieu of debugging tools here, would you be willing to try the following and send the stdout?
# Change to workdir
cd /home/usr/my-workspace/EPI2ME_WF_HUMAN_VARIANTION_jdk19/output/workspace/b4/02196c40dce6923325b53300807246
pwd -P
echo $HOME
# Create a little debug script to echo the PATH and check your nextflow asset bin dir
# The quoting here is important and must not be changed
echo '#!/bin/bash -euo pipefail' > debug.sh
echo 'echo $PATH; echo $HOME; ls -lah '$HOME'/.nextflow/assets/epi2me-labs/wf-human-variation/bin' >> debug.sh
# Backup your existing .command.sh
cp .command.sh .command.sh.bak
# Create a new .command.sh by combining debug and your .command.sh.bak
cat debug.sh .command.sh.bak > .command.sh
# Attempt to run .command.run (which will use the new .command.sh)
bash .command.run
# Reset .command.sh
cp .command.sh.bak .command.sh
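As an aside on why the quoting in the debug script matters: the single-quoted segments keep `$PATH` and `$HOME` literal, so they expand only when `.command.run` executes the script inside the container, while the unquoted `$HOME` in the middle expands immediately to the host home directory. A throwaway sketch of the difference (the variable name is illustrative):

```shell
# Single quotes defer expansion to run time; an unquoted $VAR expands right now.
export VAR=host-value
echo 'echo $VAR' >  demo.sh   # stored literally as: echo $VAR
echo 'echo '$VAR >> demo.sh   # expanded while writing: echo host-value
export VAR=run-value
bash demo.sh
# prints "run-value" (expanded at run time), then "host-value" (frozen at write time)
```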
@SamStudio8
Thank you. Here is the output:
$ echo $HOME
/home/usr
$ bash .command.run
WARNING: IPv4 forwarding is disabled. Networking will not work.
/home/epi2melabs/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/usr/.nextflow/assets/epi2me-labs/wf-human-variation/bin
/home/epi2melabs
ls: cannot open directory '/home/usr/.nextflow/assets/epi2me-labs/wf-human-variation/bin': Permission denied
Thanks! Do you have the `namei` command available?
namei -l /home/usr/.nextflow/assets/epi2me-labs/wf-human-variation/bin
Here is the result:
f: /home/usr/.nextflow/assets/epi2me-labs/wf-human-variation/bin
dr-xr-xr-x root root /
drwxr-xr-x root root home
drwx------ usr usr usr
drwxrwxr-x usr usr .nextflow
drwxrwxr-x usr usr assets
drwxrwxr-x usr usr epi2me-labs
drwxr-xr-x usr grp wf-human-variation
drwxr-xr-x usr grp bin
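A quick aside on how `namei` can walk the whole path while `ls` still fails: path traversal only requires the execute (x) bit on each intermediate directory, whereas listing a directory's contents requires the read (r) bit. A throwaway sketch of the distinction (assumes a non-root shell; paths are illustrative):

```shell
# Traversing a directory needs the x bit; listing it needs the r bit.
mkdir -p /tmp/permdemo/inner
echo hello > /tmp/permdemo/inner/file
chmod 111 /tmp/permdemo/inner   # execute-only directory
cat /tmp/permdemo/inner/file    # succeeds: reaching the file needs only x
ls /tmp/permdemo/inner          # Permission denied (unless running as root)
chmod 755 /tmp/permdemo/inner   # restore so the directory can be cleaned up
```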
To sanity check, can you run our test script once more, but this time with `namei`:
# Change to workdir
cd /home/usr/my-workspace/EPI2ME_WF_HUMAN_VARIANTION_jdk19/output/workspace/b4/02196c40dce6923325b53300807246
pwd -P
echo $HOME
# Create a little debug script to echo the PATH and check your HOME bin dir
# The quoting here is important and must not be changed
echo '#!/bin/bash -euo pipefail' > debug.sh
echo 'echo $PATH; echo $HOME; namei -l '$HOME'/.nextflow/assets/epi2me-labs/wf-human-variation/bin' >> debug.sh
# Backup your existing .command.sh
cp .command.sh .command.sh.bak
# Create a new .command.sh by combining debug and your .command.sh.bak
cat debug.sh .command.sh.bak > .command.sh
# Attempt to run .command.run (which will use the new .command.sh)
bash .command.run
# Reset .command.sh
cp .command.sh.bak .command.sh
$ bash .command.run
WARNING: IPv4 forwarding is disabled. Networking will not work.
/home/epi2melabs/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/usr/.nextflow/assets/epi2me-labs/wf-human-variation/bin
/home/epi2melabs
f: /home/usr/.nextflow/assets/epi2me-labs/wf-human-variation/bin
drwxr-xr-x root root /
drwxr-xr-x root root home
drwx------ 12003 1000 usr
drwxrwxr-x 12003 1000 .nextflow
drwxrwxr-x 12003 1000 assets
drwxrwxr-x 12003 1000 epi2me-labs
drwxr-xr-x 12003 24006 wf-human-variation
drwxr-xr-x 12003 24006 bin
/home/usr/my-workspace/EPI2ME_WF_HUMAN_VARIANTION_jdk19/output/workspace/b4/02196c40dce6923325b53300807246/.command.sh: line 5: workflow-glue: command not found
My bad: I just realized that I had installed the pipeline in a conda environment, so I tried the previous attempt (without namei) again and got the same output as with namei.
There does seem to be something wobbly about the permissions of the directories here. I would suggest deleting the version you have installed with `nextflow drop epi2me-labs/wf-human-variation -f`, installing it again with `nextflow pull epi2me-labs/wf-human-variation`, and running the workflow again.
@SamStudio8 Unfortunately, still the same result:
$ OUTPUT=output
$ ./nextflow run epi2me-labs/wf-human-variation -w ${OUTPUT}/workspace -profile standard --snp --sv --bam demo_data/demo.bam --bed demo_data/demo.bed --ref demo_data/demo.fasta --basecaller_cfg 'dna_r10.4.1_e8.2_400bps_hac@v3.5.2' --sample_name MY_SAMPLE --out_dir ${OUTPUT}
Error executing process > 'bam_ingress:check_for_alignment (1)'
Caused by:
Process `bam_ingress:check_for_alignment (1)` terminated with an error exit status (1)
Command executed:
realign=0
workflow-glue check_sq_ref --xam demo.bam --ref demo.fasta || realign=$?
# Allow EX_OK and EX_DATAERR, otherwise explode
if [ $realign -ne 0 ] && [ $realign -ne 65 ]; then
exit 1
fi
Command exit status:
1
Command output:
(empty)
Command error:
WARNING: IPv4 forwarding is disabled. Networking will not work.
.command.sh: line 3: workflow-glue: command not found
Work dir:
/home/usr/my-workspace/EPI2ME_WF_HUMAN_VARIANTION_jdk19/output/workspace/5e/0d9261fb1eb7167746e5bf7c986ba2
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
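For context, the failing script uses a common exit-code allowlist pattern: 0 (EX_OK) and 65 (EX_DATAERR from sysexits.h) are tolerated, and anything else aborts the task. A standalone sketch of the same pattern, with a stand-in for the real command:

```shell
# Tolerate EX_OK (0) and EX_DATAERR (65); treat any other status as fatal.
status=0
sh -c 'exit 65' || status=$?   # stand-in for: workflow-glue check_sq_ref ...
if [ "$status" -ne 0 ] && [ "$status" -ne 65 ]; then
    echo "unexpected failure ($status)" >&2
    exit 1
fi
echo "exit status $status accepted"
```

Note that "command not found" yields status 127, which falls outside the allowlist, so the process fails exactly as seen above.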
Oh no! Can you run this once more...
# Change to workdir
cd /home/usr/my-workspace/EPI2ME_WF_HUMAN_VARIANTION_jdk19/output/workspace/5e/0d9261fb1eb7167746e5bf7c986ba2
pwd -P
echo $HOME
# Create a little debug script to echo the PATH and check your HOME bin dir
# The quoting here is important and must not be changed
echo '#!/bin/bash -euo pipefail' > debug.sh
echo 'echo $PATH; echo $HOME; id; namei -l '$HOME'/.nextflow/assets/epi2me-labs/wf-human-variation/bin' >> debug.sh
# Let's try and pinpoint exactly which dir!
echo 'ls -lah '$HOME'/ > /dev/null' >> debug.sh
echo 'ls -lah '$HOME'/.nextflow/ > /dev/null' >> debug.sh
echo 'ls -lah '$HOME'/.nextflow/assets > /dev/null' >> debug.sh
echo 'ls -lah '$HOME'/.nextflow/assets/epi2me-labs' >> debug.sh
echo 'ls -lah '$HOME'/.nextflow/assets/epi2me-labs/wf-human-variation' >> debug.sh
echo 'ls -lah '$HOME'/.nextflow/assets/epi2me-labs/wf-human-variation/bin' >> debug.sh
# Backup your existing .command.sh
cp .command.sh .command.sh.bak
# Create a new .command.sh by combining debug and your .command.sh.bak
cat debug.sh .command.sh.bak > .command.sh
# Attempt to run .command.run (which will use the new .command.sh)
bash .command.run
# Reset .command.sh
cp .command.sh.bak .command.sh
$ pwd -P
/home/usr/my-workspace/EPI2ME_WF_HUMAN_VARIANTION_jdk19/output/workspace/5e/0d9261fb1eb7167746e5bf7c986ba2
$ echo $HOME
/home/usr
$ bash .command.run
WARNING: IPv4 forwarding is disabled. Networking will not work.
/home/epi2melabs/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/usr/.nextflow/assets/epi2me-labs/wf-human-variation/bin
/home/epi2melabs
uid=12003 gid=24006 groups=24006,100(users)
f: /home/usr/.nextflow/assets/epi2me-labs/wf-human-variation/bin
drwxr-xr-x root root /
drwxr-xr-x root root home
drwx------ 12003 1000 usr
drwxrwxr-x 12003 1000 .nextflow
drwxrwxr-x 12003 1000 assets
drwxrwxr-x 12003 1000 epi2me-labs
drwxr-xr-x 12003 24006 wf-human-variation
drwxr-xr-x 12003 24006 bin
ls: cannot open directory '/home/usr/': Permission denied
@vyx-lucy-kaplun Thanks. Nextflow is mounting your home directory to the Docker container. We bind your user and group IDs to the container such that commands run inside are run as your user. Despite this, you're unable to list the contents of your own home directory. I should have questioned this earlier, but I suspect SELinux might be at play! Do you know if SELinux is enabled on this machine? I believe it is by default on RHEL-based distros.
Can you check the Security section of the output from `docker info`?
If it is the case that Docker is running with SELinux, we need to pass an additional argument to label the bind mount, otherwise access through it may be blocked by SELinux (see https://docs.docker.com/storage/bind-mounts/#configure-the-selinux-label).
To set this up for Nextflow, create a file called `mountflag.config` with the following contents:
docker.mountFlags = 'z'
Import this configuration by adding `-c mountflag.config` to your Nextflow command:
$ ./nextflow run epi2me-labs/wf-human-variation -c mountflag.config -w ${OUTPUT}/workspace -profile standard --snp --sv --bam demo_data/demo.bam --bed demo_data/demo.bed --ref demo_data/demo.fasta --basecaller_cfg 'dna_r10.4.1_e8.2_400bps_hac@v3.5.2' --sample_name MY_SAMPLE --out_dir ${OUTPUT}
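For reference, the same setting can also be written in Nextflow's scoped config syntax; a hedged sketch of an equivalent `mountflag.config` (the note on 'Z' reflects Docker's documented private-label variant):

```groovy
// mountflag.config -- relabel Docker bind mounts on SELinux hosts.
// 'z' applies a shared label; 'Z' would apply a private, per-container label.
docker {
    mountFlags = 'z'
}
```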
@SamStudio8 Thank you! I will involve our local system administrator to make sure those modifications are allowed, and then try to implement it.
Please let me know how you get on!
If it is the case that Docker is running with SELinux, we need to pass an additional argument to label the bind mount, otherwise access through it may be blocked by SELinux (see https://docs.docker.com/storage/bind-mounts/#configure-the-selinux-label).
To set this up for Nextflow, create a file called `mountflag.config` with the following contents:
docker.mountFlags = 'z'
Import this configuration by adding `-c mountflag.config` to your Nextflow command:
$ ./nextflow run epi2me-labs/wf-human-variation -c mountflag.config -w ${OUTPUT}/workspace -profile standard --snp --sv --bam demo_data/demo.bam --bed demo_data/demo.bed --ref demo_data/demo.fasta --basecaller_cfg 'dna_r10.4.1_e8.2_400bps_hac@v3.5.2' --sample_name MY_SAMPLE --out_dir ${OUTPUT}
In our case, we are not using docker. How do we get the software working under SELinux in this case?
Hi @vyx-daniel-karanja, the `-profile standard` argument from the logs above sets the container environment to Docker, and the Docker containers are running the commands we have been experimenting with.
@vyx-daniel-karanja Did you mean that you are using Podman as a drop-in Docker replacement? The `docker.mountFlags` setting above should still work in this case. Please advise if this is not the case!
@SamStudio8, I do not see either running.
[root@ ~]# docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
[root@ ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
[root@ ~]# which podman
/usr/bin/which: no podman in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin)
The containers run by Nextflow will be exiting and cleaned up almost immediately due to this permission error, and may not run for long enough to appear in these commands. If you run `docker events` in one terminal and leave it running while executing the Nextflow command in another, you should see the relevant container start and container destroy messages.
As mentioned, the default profile will use Docker to run the Nextflow processes defined in the workflow you are starting with the Nextflow command. If Docker could not be found, or the Docker engine was not running, you would get a different error from the one we are observing here. I suggest trying to set `docker.mountFlags` as above to see if this overcomes your permission error.
ok. We shall try this.
With SELinux in permissive mode and while using the `docker.mountFlags` option, here are the messages after kicking off the Nextflow command. It is still running and I will update the ticket when it completes.
N E X T F L O W ~ version 22.10.7
Launching https://github.com/epi2me-labs/wf-human-variation [chaotic_boyd] DSL2 - revision: 18797fe1d2 [master]
Core Nextflow options
revision : master
runName : chaotic_boyd
containerEngine : docker
container : ontresearch/wf-human-variation:shaa6d218582d6056ea970b73e61f138ebb0ce6c5b1
launchDir : /home/
Workflow Options
sv : true
snp : true
Input Options
bam : demo_data/demo.bam
ref : demo_data/demo.fasta
basecaller_cfg : dna_r10.4.1_e8.2_400bps_hac@v3.5.2
bed : demo_data/demo.bed
Output Options
sample_name : MY_SAMPLE
Multiprocessing Options
ubam_map_threads : 8
ubam_sort_threads : 3
ubam_bam2fq_threads: 1
Other parameters
process_label : wfdefault
If you use epi2me-labs/wf-human-variation for your analysis please cite:
executor > local (7)
[68/b971af] process > bam_ingress:check_for_alignment (1) [ 0%] 0 of 1
[- ] process > bam_ingress:minimap2_ubam -
[df/30d663] process > cram_cache (1) [ 0%] 0 of 1
[- ] process > mosdepth -
[- ] process > readStats -
[82/04987c] process > lookup_clair3_model (1) [ 0%] 0 of 1
[- ] process > snp:make_chunks -
[- ] process > snp:pileup_variants -
[- ] process > snp:aggregate_pileup_variants -
[- ] process > snp:select_het_snps -
[- ] process > snp:phase_contig -
[- ] process > snp:get_qual_filter -
[- ] process > snp:create_candidates -
[- ] process > snp:evaluate_candidates -
[- ] process > snp:aggregate_full_align_variants -
[- ] process > snp:merge_pileup_and_full_vars -
[- ] process > snp:aggregate_all_variants -
[66/ff0de0] process > snp:getVersions [ 0%] 0 of 1
[b0/241944] process > snp:getParams [ 0%] 0 of 1
[- ] process > snp:vcfStats -
[- ] process > snp:makeReport -
[- ] process > output_snp -
[- ] process > sv:variantCall:filterBam -
[- ] process > sv:variantCall:sniffles2 -
[- ] process > sv:variantCall:filterCalls -
[- ] process > sv:variantCall:sortVCF -
[- ] process > sv:variantCall:indexVCF -
[6f/af5ec3] process > sv:runReport:getVersions [ 0%] 0 of 1
[e2/e49276] process > sv:runReport:getParams [ 0%] 0 of 1
[- ] process > sv:runReport:report -
[- ] process > output_sv -
[- ] process > configure_jbrowse -
[- ] process > publish_artifact -
Hey @vyx-daniel-karanja, I hope your workflow finished OK? Did the `mountFlags` option work without setting SELinux to permissive, or was that required in addition to the flag?
It did complete, which is awesome. But it took far longer than usual. Is there anything you can suggest to help troubleshoot the slowness?
@vyx-daniel-karanja That's great to hear, the file mounting mystery is solved at least. If you'd like to post the execution timeline file in a new issue I might be able to help troubleshoot the slowness.
Completed at: 07-Mar-2023 18:12:15
Duration : 3h 26m 57s
CPU hours : 3.9
Succeeded : 69
The first file is with the mount flag, the second one is without. Reported values in the first file:
window.data = { "elapsed": "2h 30m 23s", "beginningMillis": 1677855278818, "endingMillis": 1677864301499,
Please rename those two files to `.html`.
Hi @vyx-daniel-karanja, are both these runs local? Were they on the same machine? Is this reproducible across multiple runs with the same configuration?
Both of those runs were performed on one server and with the same data. We have not tried multiple runs yet: this is just testing in preparation for the future, with new chemistry which is not currently in use.
Hi,
I am getting the same error (`workflow-glue: command not found`) but in the context of Singularity (please see the detailed error below). From this post I understand that the `mountFlags` option works in the context of Docker; is there a similar option for the Singularity context?
Please let me know.
Detailed error msg:
ERROR ~ Error executing process > 'bam_ingress:check_for_alignment (1)'
Caused by:
Process `bam_ingress:check_for_alignment (1)` terminated with an error exit status (1)
Command executed:
realign=0
workflow-glue check_sq_ref --xam demo.bam --ref demo.fasta || realign=$?
xam_sq_len=$(samtools view -H demo.bam | { grep -c '^@SQ' || [[ $? == 1 ]]; })
if [ $realign -ne 0 ] && [ $realign -ne 65 ]; then
exit 1
fi
Command exit status: 1
Command output: (empty)
Command error: .command.sh: line 3: workflow-glue: command not found
What happened?
I am trying to run wf-human-variation with demo data using the exact demo settings and getting an error.
Error executing process > 'bam_ingress:check_for_alignment (1)'
Caused by:
Process `bam_ingress:check_for_alignment (1)` terminated with an error exit status (1)
Command executed:
realign=0
workflow-glue check_sq_ref --xam demo.bam --ref demo.fasta || realign=$?
# Allow EX_OK and EX_DATAERR, otherwise explode
if [ $realign -ne 0 ] && [ $realign -ne 65 ]; then
exit 1
fi
Command exit status: 1
Command output: (empty)
Command error:
WARNING: IPv4 forwarding is disabled. Networking will not work.
.command.sh: line 3: workflow-glue: command not found
Operating System
CentOS Linux
Workflow Execution
Command line
Workflow Execution - EPI2ME Labs Versions
No response
Workflow Execution - CLI Execution Profile
None
Workflow Version
wf-human-variation v1.2.0-g18797fe
Relevant log output