RVanDamme / MUFFIN

hybrid assembly and differential binning workflow for metagenomics, transcriptomics and pathway analysis
GNU General Public License v3.0

trying to run own data after test run OK #34

Closed drelo closed 1 month ago

drelo commented 2 years ago

I am trying to understand something about the pipeline in order to scale up and run it over 16 metagenomes for which we have Illumina and Nanopore data. I have two questions: one about initializing the run with my own data, and a second about how to reuse the images downloaded via Singularity.

The test run went fine. Then I renamed the results folder and tried to run the pipeline again with my own data:

nextflow run RVanDamme/MUFFIN -profile local,singularity --illumina /illumina/ --ont /nanopore/ --assembler metaspades --cpus 20 --memory 200g --modular assemb-class --name 014 --output P014

The run finished in a few seconds and, unlike the test run, produced no results.

The Illumina files are: P014_R1.fastq and P014_R2.fastq. The Nanopore file is: P014.fastq.

Here is the log on the screen

I am trying to understand how MUFFIN works. The only way to make the pipeline work again was to remove all the folders and run the test again:

nextflow run RVanDamme/MUFFIN -profile local,singularity,test --cpus 25 --memory 300g

Now, after running the test a second time, I have two folders that I thought I could retain. In the future I could point to the nextflow-autodownload-databases folder by putting its paths in a .yml file, but how could I point to the work/singularity folder? Can I reuse the images downloaded in work/singularity so I don't have to download them again?

How can I run MUFFIN with my own data? I wonder if I entered something wrong. I checked issues #33 and #30, which sounded similar, but I don't know where to start diagnosing this. Let me know if there is any log file I can provide or test I can perform to fix this. Any help would be appreciated. Cheers
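One thing worth ruling out before launching is that the read directories resolve and hold the files the pipeline is expected to pick up. The following is a minimal sketch, assuming the naming used in this thread (<sample>_R1.fastq and <sample>_R2.fastq for Illumina, <sample>.fastq for Nanopore). The function name check_inputs and its matching rules are illustrative and are not taken from MUFFIN's code.

```shell
#!/usr/bin/env bash
# Sketch only: sanity-check read directories before launching a run.
# ASSUMPTION (not taken from MUFFIN's code): Illumina pairs are named
# <sample>_R1.fastq[.gz] / <sample>_R2.fastq[.gz] and the matching
# Nanopore file is <sample>.fastq[.gz], as in the files listed above.

check_inputs() {
    local illumina_dir="$1"
    local ont_dir="$2"
    local status=0
    local found=0
    local dir r1 base sample suffix

    # Both directories must exist; a wrong absolute path fails here.
    for dir in "$illumina_dir" "$ont_dir"; do
        if [ ! -d "$dir" ]; then
            echo "missing directory: $dir"
            return 1
        fi
    done

    for r1 in "$illumina_dir"/*_R1.fastq "$illumina_dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue            # glob matched nothing
        found=1
        base=$(basename "$r1")
        sample=${base%%_R1.fastq*}          # e.g. P014
        suffix=${base#"${sample}_R1"}       # .fastq or .fastq.gz

        if [ ! -e "$illumina_dir/${sample}_R2${suffix}" ]; then
            echo "$sample: no R2 mate found"
            status=1
        elif [ ! -e "$ont_dir/$sample.fastq" ] && [ ! -e "$ont_dir/$sample.fastq.gz" ]; then
            echo "$sample: no Nanopore file with the same sample name"
            status=1
        else
            echo "ok: $sample"
        fi
    done

    if [ "$found" -eq 0 ]; then
        echo "no *_R1.fastq[.gz] files in $illumina_dir"
        status=1
    fi
    return "$status"
}
```

Calling check_inputs /illumina/ /nanopore/ reports a missing directory if those absolute paths do not exist on the machine. That is worth confirming here, since the command above points at the filesystem root rather than at ./illumina/ and ./nanopore/.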


drelo commented 2 years ago

Sorry for the double post. I updated the post above with all the details of the issue. Thanks for your help.

replikation commented 2 years ago


I updated the singularity profile. You can use e.g. --cachedir dir/ to specify a location to store and reuse the Singularity images; the default location is ./singularity_images. The changes are in the current master. If it works on your end, I could update the help and create a release. I currently don't have time to test it properly, as I don't have a Singularity environment available.
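To confirm that a later run has something to reuse, it can help to look at what the cache directory contains before passing it to --cachedir. The following is a minimal sketch. count_images is a hypothetical helper, and the .img and .sif extensions are an assumption about how the cached images are named.

```shell
#!/usr/bin/env bash
# Sketch only: report how many container images a cache directory holds.
# ASSUMPTION: cached Singularity images end in .img or .sif.

count_images() {
    local dir="$1"
    local n=0
    local f

    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi

    for f in "$dir"/*.img "$dir"/*.sif; do
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done

    # Print the absolute path so the same location can be reused
    # regardless of the working directory the next run starts from.
    echo "$(cd "$dir" && pwd): $n image(s)"
}
```

Running count_images ./singularity_images prints the absolute path of the cache, which should then be usable as the --cachedir value from any working directory.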

drelo commented 2 years ago

Dear Christian,

Thanks for your time with this. The commit worked fine: I ran the test again with the new version [939ff8ca71], and now (after renaming the results folder) the run skipped downloading the databases and images and went directly to processing the samples.

nextflow run RVanDamme/MUFFIN --output results_dir --cpus 30 --memory 200g -profile local,singularity,test --cachedir ./singularity_images/ --sourmash_db ./nextflow_autodownload-databases/sourmash/genbank-k31.lca.json.gz --eggnog_db nextflow-autodownload-databases/eggnog/eggnog-db/eggnog.db

N E X T F L O W ~ version 21.04.1
Launching RVanDamme/MUFFIN [maniac_newton] - revision: 939ff8ca71 [master]
[- ] process > test [ 0%] 0 of 1
[- ] process > discard_short -
[- ] process > merge -
[- ] process > fastp -
[- ] process > spades -
[- ] process > minimap2 -
executor > local (2)
[a0/9bba7a] process > test [ 0%] 0 of 1

Now I tried to run my data, but the run ends quickly:

nextflow -log muf.log run RVanDamme/MUFFIN --output hibrido --cpus 30 --memory 200g -profile local,singularity --cachedir ./singularity_images/ --sourmash_db ./nextflow_autodownload-databases/sourmash/genbank-k31.lca.json.gz --eggnog_db nextflow-autodownload-databases/eggnog/eggnog-db/eggnog.db --modular assemb --illumina ./illumina/ --ont ./nanopore/

N E X T F L O W ~ version 21.04.1
Launching RVanDamme/MUFFIN [hungry_morse] - revision: 939ff8ca71 [master]
executor > local (1)
[3f/54840a] process > readme_output [ 0%] 0 of 1
executor > local (1)
[3f/54840a] process > readme_output [100%] 1 of 1 ✔

Start running MUFFIN
MUFFIN is a hybrid assembly and differential binning workflow for metagenomics, transcriptomics and pathway analysis.

If you use MUFFIN for your research pleace cite:



Van Damme R., Hölzer M., Viehweger A., Müller B., Bongcam-Rudloff E., Brandt C., 2020 "Metagenomics workflow for hybrid assembly, differential coverage binning, transcriptomics and pathway analysis (MUFFIN)", doi: https://doi.org/10.1101/2020.02.08.939843

Done! Results are stored here --> hibrido
The Readme file in hibrido describe the structure of the results directories.

Could you help me understand what is wrong so I can run this? In the meantime I will try the profile local,conda. Thanks in advance. Best


The Illumina files are: P014_R1.fastq and P014_R2.fastq. The Nanopore file is: P014.fastq.

Here is the log. Here is the execution report.

replikation commented 2 years ago
drelo commented 2 years ago

Hi again, thanks for your help.

Illumina == P014_R1.fastq P014_R2.fastq
Nanopore == P014.fastq
I was using this: --illumina ./illumina/ --ont ./nanopore/

I just tried providing the full path, and I also gave the paths in a .yml file, but neither worked.

nextflow run RVanDamme/MUFFIN -profile local,singularity --cachedir ./singularity_images/ --sourmash_db ./nextflow_autodownload-databases/sourmash/genbank-k31.lca.json.gz --eggnog_db nextflow-autodownload-databases/eggnog/eggnog-db/eggnog.db -params-file PAR.yml

assembler : "metaspades"
ouptut : "/mnt/cive/andres/muffin/muffins"
illumina : "/mnt/cive/andres/muffin/illumina"
ont : "/mnt/cive/andres/muffin/nanopore"
cpus : 30
memory : "200g"
modular : "assemb-class"
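A params file is easy to get subtly wrong, and as far as I know Nextflow does not complain about parameter names that a pipeline never reads. The following is a minimal sketch that lists keys in a flat key : value file that are not among the option names used in this thread. unknown_keys and its list of known names are illustrative and are not part of MUFFIN or Nextflow.

```shell
#!/usr/bin/env bash
# Sketch only: flag keys in a flat params file that do not match any
# option name seen in this thread. The list below is an ASSUMPTION
# collected from the commands above, not MUFFIN's full parameter set.

unknown_keys() {
    local file="$1"
    local known=" assembler output illumina ont cpus memory modular name cachedir sourmash_db eggnog_db "
    local key

    # Extract the text before the first ':' on each line, then compare
    # it against the space-delimited list of known names.
    sed -n 's/^[[:space:]]*\([A-Za-z0-9_][A-Za-z0-9_]*\)[[:space:]]*:.*/\1/p' "$file" |
    while read -r key; do
        case "$known" in
            *" $key "*) ;;              # recognised name
            *) echo "$key" ;;           # possibly misspelled
        esac
    done
}
```

Run against the file above (unknown_keys PAR.yml), this prints ouptut, which suggests the output key is misspelled and the output path given in the file is presumably never applied. It does not by itself explain why the run ends early.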
