Arcadia-Science / metagenomics

A Nextflow workflow for QC, evaluation, and profiling of metagenomic samples using short- and long-read technologies
MIT License

Terminal Freeze Issue: Arcadia Bioinformatics Pipeline Hangs at ARCADIASCIENCE_METAGENOMICS Process #63

Open ianvalenca opened 1 year ago

ianvalenca commented 1 year ago

Description of the bug

The pipeline appears to have a bug in the follow-up process. When executing the Nextflow pipeline, the terminal behaves as if the process has stopped and overlaps the process workflow display, giving the appearance of a frozen state. The issue occurs when running the following command:

`nextflow run Arcadia-Science/metagenomics --input /home/filipe/Documents/Ian/Arcadia_metagenomics/Share_Cligest_data_2021_2022.csv --outdir /home/filipe/Documents/Jocelyne/2023_05_30_Angola_L1/fastq_pass2 --platform nanopore --sourmash_dbs /home/filipe/Documents/Ian/Sourmash/gtdb-rs214-reps.k31/SOURMASH-MANIFEST.csv --diamond_db /home/filipe/Documents/Ian/Blast/sequences.dmnd -profile docker -r main`

The console output gets stuck on the following process:

[- ] process > ARCADIASCIENCE_METAGENOMICS... -

I am attaching a screenshot to this email for your reference. It's unclear to us why this is happening, and we haven't found a workaround yet.

Command used and terminal output

`nextflow run Arcadia-Science/metagenomics --input /home/filipe/Documents/Ian/Arcadia_metagenomics/Share_Cligest_data_2021_2022.csv --outdir /home/filipe/Documents/Jocelyne/2023_05_30_Angola_L1/fastq_pass2 --platform nanopore --sourmash_dbs /home/filipe/Documents/Ian/Sourmash/gtdb-rs214-reps.k31/SOURMASH-MANIFEST.csv --diamond_db /home/filipe/Documents/Ian/Blast/sequences.dmnd -profile docker -r main`

There is no terminal output. The display just freezes at the point where the graphical representation of the processes is drawn.

Relevant files

Nextflow Workflow Report.pdf

System information

Nextflow version: 23.04.0
Hardware: Desktop
Executor: local
Container engine: Docker
OS: Ubuntu
Version of Arcadia-Science/hifi2genome: not sure

elizabethmcd commented 11 months ago

Hi Ian, on the first page of the Nextflow workflow report PDF you attached, it says:

Command error:
docker: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?.
See 'docker run --help'.

And this error is coming up for the first process of checking the input CSV samplesheet, which requires docker. I would check to see if you have docker installed correctly according to your operating system and that it is running when you launch the workflow.
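One way to confirm the daemon is reachable before launching the workflow is a small pre-flight check. The sketch below is only an illustration, not part of the pipeline: the function name `docker_daemon_status` and its parameters are hypothetical, and it assumes that `docker info` exits with a non-zero status when it cannot reach the daemon.

```python
import shutil
import subprocess


def docker_daemon_status(docker_cmd="docker", timeout=20):
    """Return (ok, message) describing whether the Docker daemon answers.

    Hypothetical helper: `docker_cmd` and `timeout` are illustrative
    parameters, not options of the metagenomics workflow.
    """
    # The client binary must be on PATH before we can ask it anything.
    if shutil.which(docker_cmd) is None:
        return False, f"'{docker_cmd}' not found on PATH"

    try:
        # `docker info` has to contact the daemon to report server details.
        result = subprocess.run(
            [docker_cmd, "info"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, f"'{docker_cmd} info' did not answer within {timeout}s"

    if result.returncode != 0:
        # stderr usually carries the "Cannot connect to the Docker daemon" text.
        return False, result.stderr.strip() or "docker info failed"
    return True, "Docker daemon is reachable"


if __name__ == "__main__":
    ok, message = docker_daemon_status()
    print("OK:" if ok else "PROBLEM:", message)
```

If this reports a problem, starting the Docker service (or checking that your user has permission to access the Docker socket) would be the first thing to try before re-running the workflow.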

Let me know if that fixes the issue.

ianvalenca commented 11 months ago

Hi Elizabeth,

Thank you for your response. I'm currently running into an issue with the Arcadia-Science/metagenomics pipeline.

When running the pipeline, it initially appears to be running fine. However, at some point, it seems to skip some of the processing steps. Despite this, the pipeline continues to run until eventually failing with the following error:

ERROR ~ Error executing process > 'ARCADIASCIENCE_METAGENOMICS:NANOPORE:SOURMASH_PROFILE_READS:SOURMASH_GATHER (1)'

Caused by: Not a valid path value type: org.codehaus.groovy.runtime.NullObject (null)

I'm running the pipeline with the following command:

`nextflow run Arcadia-Science/metagenomics --input /home/filipe/Documents/Ian/Arcadia_metagenomics/Share_Cligest_data_2021_2022.csv --outdir /home/filipe/Documents/Jocelyne/2023_05_30_Angola_L1/fastq_pass2 --platform nanopore --sourmash_dbs /home/filipe/Documents/Ian/Sourmash/gtdb-rs214-reps.k31/SOURMASH-MANIFEST.csv --diamond_db /home/filipe/Documents/Ian/Blast/sequences.dmnd -profile docker -r main`

I have Docker correctly installed and running, and have been able to successfully run other Nextflow workflows with Docker.


Could you please assist me in resolving this issue?

elizabethmcd commented 11 months ago

Yes, it looks like Docker is working now. Could you check the SOURMASH-MANIFEST.csv file that you are providing for the sourmash databases, and confirm that all the paths in it exist and that the format is correct, as shown here: https://github.com/Arcadia-Science/metagenomics/blob/main/docs/usage.md#sourmash-databasesv
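A quick way to do the path check is a few lines of Python. This is a rough sketch under stated assumptions, not the pipeline's own validation: the helper name `find_missing_paths` is hypothetical, and the column name `path` is only a placeholder, so substitute whatever column the usage docs specify. An empty or missing path value is also flagged, since a null path is what the `Not a valid path value type` error complains about.

```python
import csv
from pathlib import Path


def find_missing_paths(csv_file, path_column="path"):
    """Return (line_number, value) pairs for rows whose path entry is
    empty or does not exist on disk.

    Assumption: the CSV has a header row and a column named `path_column`;
    the default name "path" is a placeholder, not the documented format.
    """
    problems = []
    with open(csv_file, newline="") as handle:
        reader = csv.DictReader(handle)
        if path_column not in (reader.fieldnames or []):
            raise ValueError(
                f"column '{path_column}' not found; header is {reader.fieldnames}"
            )
        for row in reader:
            value = (row.get(path_column) or "").strip()
            # Relative paths are resolved against the current working directory.
            if not value or not Path(value).exists():
                problems.append((reader.line_num, value))
    return problems


if __name__ == "__main__":
    import sys

    for line_number, value in find_missing_paths(sys.argv[1]):
        print(f"line {line_number}: missing or empty path: {value!r}")
```

Run it with the CSV as the only argument. No output means every listed path exists, although this does not confirm that the column layout matches what the workflow expects.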

elizabethmcd commented 9 months ago

Checking in on this issue - was this resolved for the sourmash gather databases?

ianvalenca commented 8 months ago

Hello Elizabeth! Sorry for the late reply!

Yes, I was able to run the full pipeline. However, I could only do it with the test datasets, and only when using 4 barcode samples from Nanopore sequencing. Ideally, I would be able to run batches of 12 or 24 barcodes at once. I tried to check whether it was a memory issue on my side, but I haven't been able to figure it out yet. I will get back to you if I have more information on the issue. Thank you!