Issue with running GPS pipeline

TNVN2022 commented 3 months ago

I ran into this issue as soon as I started running the GPS pipeline: ERROR ~ Error executing process > 'SAVE_INFO:GET_VERSION:KRAKEN2_VERSION'

Caused by: Process SAVE_INFO:GET_VERSION:KRAKEN2_VERSION terminated with an error exit status (126)

Command executed:

VERSION=$(kraken2 -v | grep version | sed -r "s/.*\s(.+)/\1/")

Command exit status: 126

Command output: (empty)

Command error: docker: permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/create?name=nxf-W8NX0RjAPA0adEp9ZZ25NDok": dial unix /var/run/docker.sock: connect: permission denied. See 'docker run --help'.

Work dir: /home/ubuntu/gps-pipeline/work/5b/9b39a28529bd027e73a943240bc113

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run

-- Check '.nextflow.log' file for details

Can you please help me solve this issue? Many thanks.

HarryHung commented 3 months ago

Hi, it is a Docker error, please make sure you are running Docker when you try to run the pipeline.

If Docker is already running and you are not using a root account, please follow this guide: https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user

There are also some other potential solutions here: https://stackoverflow.com/questions/48957195/how-to-fix-docker-got-permission-denied-issue

P.S. To test if you have configured Docker successfully, you can just run docker run hello-world for quick testing.

TNVN2022 commented 3 months ago

I already ran sudo docker run hello-world and it worked. Then I followed the first link you recommended. After running sudo groupadd docker, it said: "groupadd: group 'docker' already exists". I went ahead and added my user to the docker group. Then running newgrp docker asked for a password. I tried my login password, but it did not match.

HarryHung commented 3 months ago

If you have already run

sudo groupadd docker
sudo usermod -aG docker $USER

successfully

You can just restart your machine, instead of running newgrp docker

You should be able to run docker run hello-world without prefixing sudo
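
For anyone hitting the same permission error, the group check can be sketched in shell. This is only a minimal sketch and not part of the pipeline: has_group is a hypothetical helper name, and it inspects the groups of the current login session only, which is why a fresh login or a restart is needed after usermod.

```shell
#!/usr/bin/env bash
# Sketch: check whether the "docker" group is active in the current session.
# has_group is a hypothetical helper, not something shipped with Docker or the pipeline.

# has_group NAME LIST: succeed if NAME appears in the space-separated LIST.
has_group() {
    local wanted="$1" list="$2" g
    for g in $list; do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

# Groups of the current session; these do not change until you log in again.
session_groups="$(id -nG)"

if has_group docker "$session_groups"; then
    echo "docker group is active in this session"
else
    echo "docker group is NOT active in this session; log out and back in (or restart) after usermod"
fi
```

If the group is reported as active, docker run hello-world should work without sudo.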

TNVN2022 commented 3 months ago

Yes, it works now. Thank you. I am testing with 4 samples first, and it is running.

I have several folders containing raw reads, one folder per sequencing batch. Can I run:

./run_pipeline --reads /path/to/raw-reads-directory1 /path/to/raw-reads-directory2 /path/to/raw-reads-directory3 --output result

Or can you suggest another way to run all the batches without moving the data into one directory?

Thank you Best regards Tam

HarryHung commented 3 months ago

Glad to know it is working!

At the moment, it only supports one directory at a time. I would suggest creating a directory that holds symlinks (created via ln -s) to the reads in the different directories.
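
The symlink approach could look roughly like the sketch below. It is an illustration under assumptions, not a script from the pipeline: link_reads is a hypothetical helper name, and the *.fastq.gz pattern is an assumption about how the read files are named, so adjust it to match your files.

```shell
#!/usr/bin/env bash
# Sketch: gather reads from several batch directories into one directory of symlinks.
# link_reads is a hypothetical helper; the *.fastq.gz pattern is an assumption.

# link_reads DEST SRC...: symlink every *.fastq.gz file found in each SRC into DEST.
link_reads() {
    local dest="$1"; shift
    mkdir -p "$dest"
    local src f
    for src in "$@"; do
        for f in "$src"/*.fastq.gz; do
            [ -e "$f" ] || continue    # the glob matched nothing in this directory
            # Link to the absolute path so the symlink resolves from anywhere.
            # ln -s (without -f) refuses to overwrite, so name clashes are reported.
            ln -s "$(realpath "$f")" "$dest/$(basename "$f")"
        done
    done
}

# Example usage (paths are placeholders):
# link_reads /path/to/all-reads \
#     /path/to/raw-reads-directory1 /path/to/raw-reads-directory2 /path/to/raw-reads-directory3
# ./run_pipeline --reads /path/to/all-reads --output result
```

Files with the same name in two batches would clash in the shared directory, so it is worth checking the ln output for errors before starting the run.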

TNVN2022 commented 3 months ago

Dear Harry,

I am running a whole batch of about 200 samples, but it stopped because our server ran out of space. I have moved the gps-pipeline directory to a new location, and I plan to re-run the whole batch and store the output in the same directory as the previous run. Will the pipeline re-assemble the samples it has already processed, or will it skip them and move on to the remaining ones in the list? Should I use the same output directory or a new one?

Thank you Best regards Tam

HarryHung commented 3 months ago

@TNVN2022

As the pipeline is built upon Nextflow, you can use the built-in -resume function of Nextflow to resume an interrupted run as detailed here: https://github.com/sanger-bentley-group/gps-pipeline?tab=readme-ov-file#resume

However, I am not sure whether this function works across two different machines, but it will automatically work out whether resuming is feasible and run what is necessary.

If you choose to use the -resume function, please keep the command the same (i.e. using same output directory as your original run).

TNVN2022 commented 3 months ago

Dear Harry,

After moving the gps-pipeline directory to the new device, I ran 254 samples with the -resume function and it worked. However, it is taking a really long time and has not completed after nearly 3 days. I started the run on 11th June at 15:00 (GMT+7). It is now at this stage:

[45/3c8206] PIPELINE:HET_SNP_COUNT (B8-29_S29_L001) | 254 of 254 ✔
[71/5dcb7c] PIPELINE:MAPPING_QC (B8-29_S29_L001) | 254 of 254 ✔
[b8/123359] PIPELINE:TAXONOMY (B8-29_S29_L001) | 254 of 254 ✔
[db/4e749e] PIPELINE:TAXONOMY_QC (B8-29_S29_L001) | 254 of 254 ✔
[3f/a20e61] PIPELINE:OVERALL_QC (SPN-015_S36_L001) | 250 of 250
[- ] PIPELINE:LINEAGE -
[c7/89e6f9] PIPELINE:SEROTYPE (B1_33_S33_L001) | 239 of 239
[d1/a5812e] PIPELINE:MLST (B1_33_S33_L001) | 239 of 239
[24/929c49] PIPELINE:PBP_RESISTANCE (B1_33_S33_L001) | 239 of 239
[01/613e05] PIPELINE:PARSE_PBP_RESISTANCE (B1_33_S33_L001) | 239 of 239
[79/07ba2b] PIPELINE:OTHER_RESISTANCE (B1_33_S33_L001) | 239 of 239
[5b/57fa3d] PIPELINE:PARSE_OTHER_RESISTANCE (B1_33_S33_L001) | 239 of 239
[- ] PIPELINE:GENERATE_SAMPLE_REPORT -
[- ] PIPELINE:GENERATE_OVERALL_REPORT -
[7b/0032bc] SAVE_INFO:GET_VERSION:IMAGES (1) | 1 of 1 ✔
[30/4c68ae] SAVE_INFO:GET_VERSION:DATABASES | 1 of 1 ✔
[2e/da465e] SAVE_INFO:GET_VERSION:PYTHON_VERSION | 1 of 1, cached: 1 ✔
[bd/2f283d] SAVE_INFO:GET_VERSION:FASTP_VERSION | 1 of 1 ✔
[17/306c2d] SAVE_INFO:GET_VERSION:UNICYCLER_VERSION | 1 of 1, cached: 1 ✔
[19/7ffa94] SAVE_INFO:GET_VERSION:SHOVILL_VERSION | 1 of 1, cached: 1 ✔
[90/948953] SAVE_INFO:GET_VERSION:QUAST_VERSION | 1 of 1, cached: 1 ✔
[20/b7c7e4] SAVE_INFO:GET_VERSION:BWA_VERSION | 1 of 1, cached: 1 ✔
[02/826ba6] SAVE_INFO:GET_VERSION:SAMTOOLS_VERSION | 1 of 1 ✔
[59/d645b0] SAVE_INFO:GET_VERSION:BCFTOOLS_VERSION | 1 of 1, cached: 1 ✔
[a2/f9b23b] SAVE_INFO:GET_VERSION:POPPUNK_VERSION | 1 of 1, cached: 1 ✔
[01/233195] SAVE_INFO:GET_VERSION:MLST_VERSION | 1 of 1 ✔
[f1/efe998] SAVE_INFO:GET_VERSION:KRAKEN2_VERSION | 1 of 1 ✔
[b4/536f39] SAVE_INFO:GET_VERSION:SEROBA_VERSION | 1 of 1, cached: 1 ✔
[06/4ba592] SAVE_INFO:GET_VERSION:ARIBA_VERSION | 1 of 1, cached: 1 ✔
[fa/6c8fad] SAVE_INFO:GET_VERSION:TOOLS | 1 of 1 ✔
[22/c47e8e] SAVE_INFO:GET_VERSION:COMBINE_INFO (1) | 1 of 1 ✔
[fa/9b6957] SAVE_INFO:PARSE (1) | 1 of 1 ✔
[4b/b5a93d] SAVE_INFO:SAVE (1) | 1 of 1 ✔

I checked the output: there is an assembly folder and info.txt, but no report.tsv. I think some samples failed to assemble. Is the pipeline still running? How long does it normally take to complete 254 samples?

Thank you Best regards Tam

HarryHung commented 3 months ago

@TNVN2022

If no error message is printed, then the pipeline seems to be still running and has not finished all the processes yet.

If some samples fail to assemble, they should be skipped by the pipeline and marked accordingly in the results.csv.

Please check the latest timestamps in the hidden file .nextflow.log in the pipeline directory to confirm whether it is still running. If it is not, I would recommend running ./clean_pipeline and restarting the pipeline from scratch.
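
One way to do the timestamp check is sketched below. This is only a rough sketch: log_age_seconds is a hypothetical helper name, and it assumes you run it from the pipeline directory where .nextflow.log lives. It uses GNU stat (as found on Ubuntu) and falls back to the BSD form.

```shell
#!/usr/bin/env bash
# Sketch: see how long ago .nextflow.log was last written to.
# log_age_seconds is a hypothetical helper, not part of Nextflow or the pipeline.

# log_age_seconds FILE: print the number of seconds since FILE was last modified.
log_age_seconds() {
    local file="$1" now mtime
    now="$(date +%s)"
    # GNU stat first (Linux); BSD/macOS stat as a fallback.
    mtime="$(stat -c %Y "$file" 2>/dev/null || stat -f %m "$file")" || return 1
    echo $(( now - mtime ))
}

log=".nextflow.log"
if [ -f "$log" ]; then
    echo "last write to $log: $(log_age_seconds "$log") seconds ago"
    tail -n 5 "$log"    # the most recent entries carry their own timestamps
fi
```

A log that has not been written to for many hours does not prove the run is dead, since a single long process can stay quiet, but it is a useful first signal.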

Time to complete for 254 samples greatly depends on the computational power at your disposal. In our benchmark, it takes ~3 hours for 100 samples with a 16-core Ubuntu computer.

TNVN2022 commented 3 months ago

Dear Harry,

Attached is the .nextflow.log for this run. Can you please check whether it is still running?

Thank you Best regards Tam

HarryHung commented 3 months ago

I cannot see your attachment. Can you please upload it via https://github.com/sanger-bentley-group/gps-pipeline/issues/108?

TNVN2022 commented 3 months ago

.nextflow.log

HarryHung commented 3 months ago

I think this is the log file from an older run?

The last entry is at Jun 11 07:44, talking about No space left on your storage device.

TNVN2022 commented 3 months ago

.nextflow.log Sorry, please check this. Thank you

HarryHung commented 3 months ago

It seems it has been stuck with the assembly of B7-06_S6_L001, CM28_S26_L001, B5-22_S22_L001, CM8_S45_L001 since June 11 night. Not sure what caused it.

You could either:

- Send me the .command.log within their working directories (see below), and run docker stats to confirm the containers are still running:
  - /data/reflab/gps-pipeline/work/6f/649e27ffbf62773d64dc660787207d
  - /data/reflab/gps-pipeline/work/6c/50aa8037ff1f5ef592055a49958845
  - /data/reflab/gps-pipeline/work/c1/06f97fc585f3a2bd6d30fc40bc5967
  - /data/reflab/gps-pipeline/work/ce/e5a94ac8644cf6aed2aa0bbd07cc6f
- Terminate the pipeline and resume it again.
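
Since every working directory holds a file with the same name, gathering the logs could be sketched as follows. This is an illustrative sketch only: collect_logs is a hypothetical helper name, and the output naming scheme (the last two path components joined by an underscore) is just one convenient choice.

```shell
#!/usr/bin/env bash
# Sketch: copy .command.log out of several Nextflow work directories under distinct names.
# collect_logs is a hypothetical helper, not part of Nextflow or the pipeline.

# collect_logs OUT DIR...: copy DIR/.command.log to OUT/<parent>_<dir>.log for each DIR.
collect_logs() {
    local out="$1"; shift
    mkdir -p "$out"
    local dir name
    for dir in "$@"; do
        # e.g. .../work/6f/649e27... becomes 6f_649e27....log
        name="$(basename "$(dirname "$dir")")_$(basename "$dir").log"
        if [ -f "$dir/.command.log" ]; then
            cp "$dir/.command.log" "$out/$name"
        else
            echo "no .command.log in $dir" >&2
        fi
    done
}

# Example usage with the directories listed above:
# collect_logs stuck-logs \
#     /data/reflab/gps-pipeline/work/6f/649e27ffbf62773d64dc660787207d \
#     /data/reflab/gps-pipeline/work/6c/50aa8037ff1f5ef592055a49958845 \
#     /data/reflab/gps-pipeline/work/c1/06f97fc585f3a2bd6d30fc40bc5967 \
#     /data/reflab/gps-pipeline/work/ce/e5a94ac8644cf6aed2aa0bbd07cc6f
# docker stats --no-stream    # one-off snapshot of the running containers
```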

TNVN2022 commented 3 months ago

.command.log

TNVN2022 commented 3 months ago

.command.log

TNVN2022 commented 3 months ago

.command.log

TNVN2022 commented 3 months ago

.command.log

HarryHung commented 3 months ago

It looks to me that shovill tried to run kmc within its container, but kmc never started running. I have not observed this before. Can you try to kill the pipeline (the usual Ctrl + C) and resume it using -resume again?

TNVN2022 commented 3 months ago

ok I will try to resume. Tks

HarryHung commented 3 months ago

If it still happens, I read that the /tmp directory may be full, so please check that as well (e.g. via df -h /tmp). In some Docker configurations, the /tmp within containers is mounted to the system /tmp. When the system /tmp is full, kmc could freeze.
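
The /tmp check can also be scripted, for example before each resume. The following is a minimal sketch under assumptions: usage_percent and is_nearly_full are hypothetical helper names, and the 95% threshold is an arbitrary illustration rather than a value known to trigger the kmc problem.

```shell
#!/usr/bin/env bash
# Sketch: warn when /tmp is close to full.
# usage_percent and is_nearly_full are hypothetical helpers; 95 is an arbitrary threshold.

# usage_percent PATH: print the integer "Capacity" percentage that df reports for PATH.
usage_percent() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# is_nearly_full PATH LIMIT: succeed if usage is at or above LIMIT percent.
is_nearly_full() {
    [ "$(usage_percent "$1")" -ge "$2" ]
}

if is_nearly_full /tmp 95; then
    echo "/tmp is nearly full: $(usage_percent /tmp)% used"
else
    echo "/tmp usage looks fine: $(usage_percent /tmp)% used"
fi
```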
