RVanDamme / MUFFIN

hybrid assembly and differential binning workflow for metagenomics, transcriptomics and pathway analysis
https://rvandamme.github.io/MUFFIN_Documentation/#introduction
GNU General Public License v3.0

MUFFIN only works with NXF_VER=20.10.0 and will not proceed after Medaka step #25

Closed marade closed 2 years ago

marade commented 2 years ago

Neither problem produces an error message. When I try to use a Nextflow version other than 20.10.0, the pipeline does nothing. This is undocumented. I also haven't found a way to get the pipeline to go beyond the Medaka step. It stops with no errors:

[skipped ] process > sourmash_download_db [100%] 1 of 1, stored: 1 ✔
[skipped ] process > checkm_download_db [100%] 1 of 1, stored: 1 ✔
[99/42d25d] process > checkm_setup_db [100%] 1 of 1 ✔
[8a/b445c8] process > discard_short (54) [100%] 54 of 54 ✔
[d4/ccef04] process > filtlong (54) [100%] 54 of 54 ✔
[30/26215b] process > merge (1) [100%] 1 of 1 ✔
[7f/3135b9] process > fastp (1) [100%] 1 of 1 ✔
[39/df06de] process > flye (1) [100%] 1 of 1 ✔
[60/ba77a0] process > minimap_polish (1) [100%] 1 of 1 ✔
[f6/55dd42] process > racon (1) [100%] 1 of 1 ✔
[b0/e0e76c] process > medaka (1) [100%] 1 of 1 ✔
[- ] process > pilon -
[- ] process > minimap2 -
[- ] process > bwa -
[- ] process > metabat2 -
[- ] process > maxbin2 -
[- ] process > concoct -
[- ] process > refine3 -
[- ] process > checkm -
[- ] process > sourmash_bins -
[- ] process > sourmash_checkm_parser -
[skipped ] process > eggnog_download_db [100%] 1 of 1, stored: 1 ✔
[- ] process > eggnog_bin -
[- ] process > parser_bin -
[47/d71dba] process > readme_output [100%] 1 of 1 ✔
Done! Results are stored here --> results
The Readme file in results describe the structure of the results directories.
Completed at: 20-Jan-2022 02:04:22
Duration : 6h 33m 41s
CPU hours : 291.2
Succeeded : 116

replikation commented 2 years ago

Thanks for reporting this; we will look into it.

dzolier commented 2 years ago

I have a similar issue, except my run refuses to continue past the fastp & merge steps

The command:

NXF_VER=20.10.0 nextflow -log /home/ubuntu/MUPHIN_SPAdes_3/MUPHIN_SPAdes_3.log run RVanDamme/MUFFIN/main.nf --output /home/ubuntu/MUPHIN_SPAdes_3/ --ont /home/ubuntu/Guppy_concatenated/ --illumina /home/ubuntu/kneadeddata/ --assembler metaspades --cpus 120 --memory 490g --skip_maxbin2 --polish_iteration 2 --check_db /home/ubuntu/nextflow-autodownload-databases/checkm/db/ --eggnog_db /home/ubuntu/nextflow-autodownload-databases/eggnogdb_5.0.1 --sourmash_db /home/ubuntu/nextflow-autodownload-databases/sourmash -profile local,conda

The reply:

N E X T F L O W  ~  version 20.10.0
Launching `RVanDamme/MUFFIN` [stupefied_galileo] - revision: 3695f30cc3 [master]
[skipped  ] process > checkm_download_db     [100%] 1 of 1, stored: 1
[-        ] process > checkm_setup_db        -
[-        ] process > discard_short          -
[-        ] process > merge                  -
[-        ] process > fastp                  -
[-        ] process > spades                 -
executor >  local (1)
[skipped  ] process > checkm_download_db     [100%] 1 of 1, stored: 1 ✔
[-        ] process > checkm_setup_db        -
[-        ] process > discard_short          -
[-        ] process > merge                  -
[-        ] process > fastp                  -
[-        ] process > spades                 -
executor >  local (2)
[skipped  ] process > checkm_download_db     [100%] 1 of 1, stored: 1 ✔
[-        ] process > checkm_setup_db        -
[cc/8272e5] process > discard_short (1)      [  0%] 0 of 1
[-        ] process > merge                  -
[-        ] process > fastp                  -
[-        ] process > spades                 -
executor >  local (2)
[skipped  ] process > checkm_download_db     [100%] 1 of 1, stored: 1 ✔
[-        ] process > checkm_setup_db        -
[cc/8272e5] process > discard_short (1)      [  0%] 0 of 1
[-        ] process > merge                  -
[-        ] process > fastp                  -
[-        ] process > spades                 -
executor >  local (3)
[skipped  ] process > checkm_download_db     [100%] 1 of 1, stored: 1 ✔
[-        ] process > checkm_setup_db        -
[cc/8272e5] process > discard_short (1)      [100%] 1 of 1 ✔
[81/8f8f54] process > merge (1)              [  0%] 0 of 1
[-        ] process > fastp                  -
[-        ] process > spades                 -
executor >  local (3)
[skipped  ] process > checkm_download_db     [100%] 1 of 1, stored: 1 ✔
[-        ] process > checkm_setup_db        -
[cc/8272e5] process > discard_short (1)      [100%] 1 of 1 ✔
[81/8f8f54] process > merge (1)              [100%] 1 of 1 ✔
[-        ] process > fastp                  -
[-        ] process > spades                 -
executor >  local (5)
[skipped  ] process > checkm_download_db     [100%] 1 of 1, stored: 1 ✔
[80/9bde2e] process > checkm_setup_db        [100%] 1 of 1 ✔
[cc/8272e5] process > discard_short (1)      [100%] 1 of 1 ✔
[81/8f8f54] process > merge (1)              [100%] 1 of 1 ✔
[04/c74471] process > fastp (1)              [100%] 1 of 1 ✔
[-        ] process > spades                 -
[-        ] process > minimap2               -
[-        ] process > bwa                    -
[-        ] process > metabat2               -
[-        ] process > concoct                -
[-        ] process > refine2                -
[-        ] process > checkm                 -
[-        ] process > sourmash_bins          -
[-        ] process > sourmash_checkm_parser -
[-        ] process > eggnog_bin             -
[-        ] process > parser_bin             -
[f7/5e2d65] process > readme_output          [100%] 1 of 1 ✔
Done! Results are stored here --> /home/ubuntu/MUPHIN_SPAdes_3/
 The Readme file in /home/ubuntu/MUPHIN_SPAdes_3/ describe the structure of the results directories.
Completed at: 24-Jan-2022 23:03:46
Duration    : 2m 12s
CPU hours   : 3.1
Succeeded   : 5

The nextflow.log file doesn't look the same as the test.log file, either; there are a lot more cached processes in test.log (here, MUFFINtest is the test log and MUPHIN_SPAdes is the run log). Also, this does not apply to metaFlye; it is only an issue for metaSPAdes. Am I accidentally asking MUFFIN to skip metaSPAdes or something?

MUFFINtest_metaspades_1.log.txt

MUPHIN_SPAdes.log.txt

replikation commented 2 years ago

@marade can you tell me if you have an actual assembly?

@both of you: if the pipeline just stops, it usually means no output was generated. This would come down to your data, e.g. missing read data or not enough reads?
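One way to check the "not enough reads" possibility is to count what is actually in the read files. The following is a minimal sketch, not part of MUFFIN: the function name `fastq_stats` is made up for illustration, and it assumes plain four-line FASTQ records (no wrapped sequences), optionally gzipped.

```python
import gzip
from pathlib import Path


def fastq_stats(path):
    """Return (read_count, total_bases) for a FASTQ file, gzipped or plain.

    Hypothetical helper for illustration; assumes four lines per record.
    """
    path = Path(path)
    # Pick the opener from the file extension; both are used in text mode.
    opener = gzip.open if path.suffix == ".gz" else open
    reads = 0
    bases = 0
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # In four-line FASTQ, the sequence is the second line of each record.
            if line_number % 4 == 1:
                reads += 1
                bases += len(line.strip())
    return reads, bases
```

Calling `fastq_stats(...)` on the merged nanopore file and on the cleaned Illumina files gives a read count and a total base count per file. A count of zero would explain a downstream process that never starts; how many reads are "enough" for an assembler depends on the sample and is not something this sketch can decide.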

marade commented 2 years ago

I do get a _polished.fasta file, so Medaka appears to succeed.

dzolier commented 2 years ago

I have files in assembly/quality_control/nanopore and assembly/quality_control/illumina. My nanopore reads also have the suffix _all, so I would guess they finished the merge step, and my Illumina reads have _clean appended, so I think fastp finished as well.
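The presence of files does not show that they contain anything, so a quick size check on those directories can help. This is a rough sketch under stated assumptions: `empty_or_missing` is a hypothetical helper, not a MUFFIN function, and it only looks at file size on disk.

```python
from pathlib import Path


def empty_or_missing(directory, pattern="*"):
    """Return a list of problem descriptions for files in `directory`.

    Hypothetical helper for illustration. An empty list means every file
    matching `pattern` exists and has a non-zero size.
    """
    directory = Path(directory)
    if not directory.is_dir():
        return [f"{directory}: directory does not exist"]
    files = [p for p in sorted(directory.glob(pattern)) if p.is_file()]
    if not files:
        return [f"{directory}: no files match {pattern!r}"]
    # Report only zero-byte files; non-empty files are considered fine here.
    return [f"{p}: file is empty" for p in files if p.stat().st_size == 0]
```

For example, `empty_or_missing("assembly/quality_control/nanopore")` would list any zero-byte files in that directory, relative to wherever the output folder lives. Note that a gzipped file with no reads still has a few bytes of header, so a non-zero size alone does not prove the file holds reads.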

How many reads is not enough? I ask because when I use MetaFlye, I get the same error I got without MUFFIN, "No disjointigs were assembled", which I took to mean that we have fairly low ONT coverage on these reads.

RVanDamme commented 2 years ago

Hello,

I will investigate this and come back to both of you ASAP.

dzolier commented 2 years ago

Hello,

I ran metaSPAdes independently and I have some results. I'd like to see if I can drop the output into the MUFFIN output folder and trick MUFFIN into thinking it's already done the SPAdes step so it jumps past whatever the problem is.

Is there a place I can find the way the file structure changes as MUFFIN works through the pipeline? I tried just adding a quality_control/nanopore folder into the existing read files, but it didn't seem to work.

Thank you for your help

replikation commented 2 years ago

@all Fixed the old Nextflow bug. I didn't have issues with Medaka, though.

replikation commented 2 years ago

Usually, if a process does not start, it means no output was generated or input is missing.
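To see at a glance which processes never started, the progress lines pasted earlier in this thread can be scanned for entries whose task hash is still `-`. This is a minimal sketch, assuming the console format shown above (`[hash] process > name ...`); the names `PROGRESS_LINE` and `never_started` are made up for illustration, and the `.nextflow.log` file uses a different layout that this does not parse.

```python
import re

# Matches one line of Nextflow console progress output as pasted above, e.g.
# "[cc/8272e5] process > discard_short (1)      [100%] 1 of 1 ✔"
# or "[-        ] process > spades                 -".
PROGRESS_LINE = re.compile(r"^\[(?P<tag>[^\]]*)\]\s+process\s+>\s+(?P<name>\w+)")


def never_started(progress_text):
    """Return process names whose last status line still shows no task hash."""
    last_tag = {}
    for line in progress_text.splitlines():
        match = PROGRESS_LINE.match(line.strip())
        if match:
            # The console output is redrawn repeatedly, so later lines
            # overwrite earlier ones and the final refresh wins.
            last_tag[match.group("name")] = match.group("tag").strip()
    return [name for name, tag in last_tag.items() if tag == "-"]
```

The first name in the returned list is a reasonable place to start looking: its input channel is the one that stayed empty, so the output of the step feeding it is worth inspecting.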