Closed TNVN2022 closed 3 months ago
Hi, this is a Docker error. Please make sure Docker is running when you run the pipeline.
If Docker is already running and you are not using a root account, please follow this guide: https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user
There are also some other potential solutions here: https://stackoverflow.com/questions/48957195/how-to-fix-docker-got-permission-denied-issue
P.S. To test whether you have configured Docker successfully, you can run docker run hello-world as a quick test.
I already ran sudo docker run hello-world and it worked. Then I followed the first link you recommended. After running sudo groupadd docker, it said: "groupadd: group 'docker' already exists". I went on to add my user to the docker group. Then running newgrp docker asked for a password. I tried my login password, but it did not match.
If you have already run
sudo groupadd docker
sudo usermod -aG docker $USER
successfully, you can just restart your machine instead of running newgrp docker.
You should then be able to run docker run hello-world without prefixing sudo.
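Whether the group change has reached your current session can be checked before (or after) restarting. The sketch below is only an illustration: has_group and docker_group_status are made-up helper names, and it assumes the usual behaviour of id, where id -nG without a user name lists the groups of the current session and id -nG with a user name lists the groups recorded for that account.

```shell
#!/usr/bin/env bash
# Sketch: check whether the docker group change has reached the current session.
# has_group and docker_group_status are hypothetical helpers, not part of Docker.

# has_group LIST NAME: succeed if NAME is one of the space-separated names in LIST.
has_group() {
  printf '%s\n' "$1" | tr ' ' '\n' | grep -qx -- "$2"
}

# docker_group_status SESSION_GROUPS ACCOUNT_GROUPS: print what to do next.
docker_group_status() {
  if has_group "$1" docker; then
    echo "active: docker should work without sudo"
  elif has_group "$2" docker; then
    echo "pending: log out and back in (or restart) to pick up the group"
  else
    echo "missing: run sudo usermod -aG docker \$USER first"
  fi
}

# First argument: groups of the current session.
# Second argument: groups recorded for the account.
docker_group_status "$(id -nG)" "$(id -nG "$(id -un)")"
```

If it prints "pending", the usermod step worked and only a fresh login or restart is missing.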
Yes, it works now. Thank you. I am testing with 4 samples first, and it is running.
I have several folders containing raw reads, one folder per sequencing batch. Can I run:
./run_pipeline --reads /path/to/raw-reads-directory1 /path/to/raw-reads-directory2 /path/to/raw-reads-directory3 --output result
Or can you suggest another way to run all the batches without moving the data into one directory?
Thank you Best regards Tam
From: Harry Hung @.> Sent: Monday, June 10, 2024 6:10 PM To: sanger-bentley-group/gps-pipeline @.> Cc: Tam Nguyen Thi @.>; Author @.> Subject: Re: [sanger-bentley-group/gps-pipeline] Issue with running GPS pipeline (Issue #108)
Glad to know it is working!
At the moment, it only supports one directory at a time.
I would suggest creating a directory that holds symlinks (created via ln -s) to the reads in the different directories.
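The symlink approach could look roughly like the sketch below. It assumes the reads end in .fastq.gz and that file names do not clash between batches (ln refuses to overwrite an existing link and prints an error). link_reads and /path/to/all-reads are made-up names for illustration, and realpath is assumed to be available (it ships with GNU coreutils).

```shell
#!/usr/bin/env bash
# link_reads TARGET_DIR BATCH_DIR...: symlink every read file from each batch
# directory into TARGET_DIR. Hypothetical helper; the *.fastq.gz pattern is an
# assumption, so adjust it to match your file names.
link_reads() {
  local target=$1
  shift
  mkdir -p "$target"
  local batch file
  for batch in "$@"; do
    for file in "$batch"/*.fastq.gz; do
      [ -e "$file" ] || continue   # skip when the pattern matches nothing
      # Link to the absolute path so the link does not depend on where it sits.
      ln -s "$(realpath "$file")" "$target/"
    done
  done
}

# Example (paths are placeholders):
#   link_reads /path/to/all-reads /path/to/raw-reads-directory1 /path/to/raw-reads-directory2
```

The pipeline can then be pointed at the single directory, e.g. ./run_pipeline --reads /path/to/all-reads --output result.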
Dear Harry,
I am running the whole set of about 200 samples, but the run stopped because our server ran out of space. I moved the gps-pipeline directory to a new location, and I plan to re-run the whole set and store the output in the same directory as the previous run. Will the pipeline re-assemble the samples it has already processed, or will it skip them and move on to the remaining ones in the list? Should I use the same output directory or a new one?
Thank you Best regards Tam
From: Harry Hung @.> Sent: Monday, June 10, 2024 6:28 PM To: sanger-bentley-group/gps-pipeline @.> Cc: Tam Nguyen Thi @.>; Author @.> Subject: Re: [sanger-bentley-group/gps-pipeline] Issue with running GPS pipeline (Issue #108)
@TNVN2022
As the pipeline is built upon Nextflow, you can use Nextflow's built-in -resume function to resume an interrupted run, as detailed here: https://github.com/sanger-bentley-group/gps-pipeline?tab=readme-ov-file#resume
However, I am not sure whether this function works across two different machines. It will automatically deduce whether resuming is feasible and run what is necessary.
If you choose to use the -resume function, please keep the command the same (i.e. use the same output directory as your original run).
Dear Harry,
After moving the gps-pipeline directory to the new device, I ran 254 samples with the -resume function and it worked. However, it is taking a really long time and has not completed after nearly 3 days. I started the run on 11th June at 15:00 (GMT+7). It is now at the stage below:
[45/3c8206] PIPELINE:HET_SNP_COUNT (B8-29_S29_L001) | 254 of 254 ✔
[71/5dcb7c] PIPELINE:MAPPING_QC (B8-29_S29_L001) | 254 of 254 ✔
[b8/123359] PIPELINE:TAXONOMY (B8-29_S29_L001) | 254 of 254 ✔
[db/4e749e] PIPELINE:TAXONOMY_QC (B8-29_S29_L001) | 254 of 254 ✔
[3f/a20e61] PIPELINE:OVERALL_QC (SPN-015_S36_L001) | 250 of 250
[- ] PIPELINE:LINEAGE -
[c7/89e6f9] PIPELINE:SEROTYPE (B1_33_S33_L001) | 239 of 239
[d1/a5812e] PIPELINE:MLST (B1_33_S33_L001) | 239 of 239
[24/929c49] PIPELINE:PBP_RESISTANCE (B1_33_S33_L001) | 239 of 239
[01/613e05] PIPELINE:PARSE_PBP_RESISTANCE (B1_33_S33_L001) | 239 of 239
[79/07ba2b] PIPELINE:OTHER_RESISTANCE (B1_33_S33_L001) | 239 of 239
[5b/57fa3d] PIPELINE:PARSE_OTHER_RESISTANCE (B1_33_S33_L001) | 239 of 239
[- ] PIPELINE:GENERATE_SAMPLE_REPORT -
[- ] PIPELINE:GENERATE_OVERALL_REPORT -
[7b/0032bc] SAVE_INFO:GET_VERSION:IMAGES (1) | 1 of 1 ✔
[30/4c68ae] SAVE_INFO:GET_VERSION:DATABASES | 1 of 1 ✔
[2e/da465e] SAVE_INFO:GET_VERSION:PYTHON_VERSION | 1 of 1, cached: 1 ✔
[bd/2f283d] SAVE_INFO:GET_VERSION:FASTP_VERSION | 1 of 1 ✔
[17/306c2d] SAVE_INFO:GET_VERSION:UNICYCLER_VERSION | 1 of 1, cached: 1 ✔
[19/7ffa94] SAVE_INFO:GET_VERSION:SHOVILL_VERSION | 1 of 1, cached: 1 ✔
[90/948953] SAVE_INFO:GET_VERSION:QUAST_VERSION | 1 of 1, cached: 1 ✔
[20/b7c7e4] SAVE_INFO:GET_VERSION:BWA_VERSION | 1 of 1, cached: 1 ✔
[02/826ba6] SAVE_INFO:GET_VERSION:SAMTOOLS_VERSION | 1 of 1 ✔
[59/d645b0] SAVE_INFO:GET_VERSION:BCFTOOLS_VERSION | 1 of 1, cached: 1 ✔
[a2/f9b23b] SAVE_INFO:GET_VERSION:POPPUNK_VERSION | 1 of 1, cached: 1 ✔
[01/233195] SAVE_INFO:GET_VERSION:MLST_VERSION | 1 of 1 ✔
[f1/efe998] SAVE_INFO:GET_VERSION:KRAKEN2_VERSION | 1 of 1 ✔
[b4/536f39] SAVE_INFO:GET_VERSION:SEROBA_VERSION | 1 of 1, cached: 1 ✔
[06/4ba592] SAVE_INFO:GET_VERSION:ARIBA_VERSION | 1 of 1, cached: 1 ✔
[fa/6c8fad] SAVE_INFO:GET_VERSION:TOOLS | 1 of 1 ✔
[22/c47e8e] SAVE_INFO:GET_VERSION:COMBINE_INFO (1) | 1 of 1 ✔
[fa/9b6957] SAVE_INFO:PARSE (1) | 1 of 1 ✔
[4b/b5a93d] SAVE_INFO:SAVE (1) | 1 of 1 ✔
I checked the output: there is an assembly folder and info.txt, but no report.tsv. I think some samples failed to assemble. Is the pipeline still running? How long does it normally take to complete 254 samples?
Thank you Best regards Tam
From: Harry Hung @.> Sent: Tuesday, June 11, 2024 4:30 PM To: sanger-bentley-group/gps-pipeline @.> Cc: Tam Nguyen Thi @.>; Mention @.> Subject: Re: [sanger-bentley-group/gps-pipeline] Issue with running GPS pipeline (Issue #108)
@TNVN2022
If no error message is printed, then it seems the pipeline is still running and has not yet finished all the processes.
If some samples fail to assemble, they should be skipped by the pipeline and marked accordingly in the results.csv.
Please check the latest timestamps in the hidden file .nextflow.log in the pipeline directory to confirm whether it is still running. If it is not, I would recommend running ./clean_pipeline and restarting the pipeline from scratch.
Time to complete for 254 samples greatly depends on the computational power at your disposal. In our benchmark, it takes ~3 hours for 100 samples with a 16-core Ubuntu computer.
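To check the log timestamps mentioned above, something like the following could help. It is only a sketch: log_age_minutes is a hypothetical helper, it relies on GNU stat (-c %Y; on macOS/BSD the equivalent is stat -f %m), and a quiet log hints at a stalled run rather than proving it.

```shell
#!/usr/bin/env bash
# log_age_minutes FILE: print how many whole minutes ago FILE was last modified.
# Hypothetical helper; assumes GNU stat.
log_age_minutes() {
  local now modified
  now=$(date +%s)
  modified=$(stat -c %Y "$1") || return 1
  echo $(( (now - modified) / 60 ))
}

# In the pipeline directory:
#   tail -n 5 .nextflow.log          # latest entries and their timestamps
#   log_age_minutes .nextflow.log    # minutes since the log was last written
```

A log that has not been written to for many hours while tasks are still listed as unfinished is a reason to look at the individual task directories.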
Dear Harry,
Attached is the nextflow.log for this run. Can you please check whether it is still running?
Thank you Best regards Tam
From: Harry Hung @.> Sent: Friday, June 14, 2024 5:30 PM To: sanger-bentley-group/gps-pipeline @.> Cc: Tam Nguyen Thi @.>; Mention @.> Subject: Re: [sanger-bentley-group/gps-pipeline] Issue with running GPS pipeline (Issue #108)
I cannot see your attachment. Can you please upload it via https://github.com/sanger-bentley-group/gps-pipeline/issues/108?
I think this is the log file from an older run?
The last entry is at Jun 11 07:44 and reports that there is no space left on your storage device.
.nextflow.log Sorry, please check this. Thank you
It seems it has been stuck on the assembly of B7-06_S6_L001, CM28_S26_L001, B5-22_S22_L001, CM8_S45_L001 since the night of June 11. Not sure what caused it.
You could either:
check .command.log within their working directories (see below), and run docker stats to confirm the containers are still running:
/data/reflab/gps-pipeline/work/6f/649e27ffbf62773d64dc660787207d
/data/reflab/gps-pipeline/work/6c/50aa8037ff1f5ef592055a49958845
/data/reflab/gps-pipeline/work/c1/06f97fc585f3a2bd6d30fc40bc5967
/data/reflab/gps-pipeline/work/ce/e5a94ac8644cf6aed2aa0bbd07cc6f
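A small helper along these lines could print the tail of .command.log for each of the task directories above. show_task_logs is a made-up name, and 20 lines is an arbitrary choice.

```shell
#!/usr/bin/env bash
# show_task_logs WORK_DIR...: print the last lines of .command.log in each
# Nextflow task directory, or a note when the file is missing.
# Hypothetical helper for illustration.
show_task_logs() {
  local dir
  for dir in "$@"; do
    echo "== $dir"
    if [ -f "$dir/.command.log" ]; then
      tail -n 20 "$dir/.command.log"
    else
      echo "(no .command.log)"
    fi
  done
}

# Example with two of the task directories listed above:
#   show_task_logs /data/reflab/gps-pipeline/work/6f/649e27ffbf62773d64dc660787207d \
#                  /data/reflab/gps-pipeline/work/6c/50aa8037ff1f5ef592055a49958845
#   docker stats --no-stream    # one-off snapshot instead of the live view
```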
It looks to me that shovill tried to run kmc within its container, but kmc never started running.
I have not observed this before. Can you try to kill the pipeline (the usual Ctrl + C) and resume it using -resume again?
ok I will try to resume. Tks
If it still happens, I read that the /tmp directory might be full; please check that as well (e.g. via df -h /tmp).
In some Docker configurations, the /tmp within containers is mounted to the system /tmp, so when the system /tmp is full, kmc could freeze.
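The /tmp check can also be scripted. A rough sketch, assuming a POSIX df; used_percent is a hypothetical helper and the 95% threshold is an arbitrary value chosen for illustration.

```shell
#!/usr/bin/env bash
# used_percent PATH: print the used-space percentage of the filesystem holding
# PATH, parsed from POSIX-format df output. Hypothetical helper.
used_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

pct=$(used_percent /tmp)
if [ "$pct" -ge 95 ]; then   # 95 is an arbitrary threshold for this sketch
  echo "/tmp is ${pct}% full; free some space before resuming"
else
  echo "/tmp is ${pct}% full"
fi
```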
I ran into this issue right after starting the GPS pipeline:
ERROR ~ Error executing process > 'SAVE_INFO:GET_VERSION:KRAKEN2_VERSION'
Caused by:
Process SAVE_INFO:GET_VERSION:KRAKEN2_VERSION terminated with an error exit status (126)
Command executed:
VERSION=$(kraken2 -v | grep version | sed -r "s/.*\s(.+)/\1/")
Command exit status: 126
Command output: (empty)
Command error: docker: permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/create?name=nxf-W8NX0RjAPA0adEp9ZZ25NDok": dial unix /var/run/docker.sock: connect: permission denied. See 'docker run --help'.
Work dir: /home/ubuntu/gps-pipeline/work/5b/9b39a28529bd027e73a943240bc113
Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run
-- Check '.nextflow.log' file for details
Can you please help me solve this issue? Many thanks.