RVanDamme / MUFFIN

hybrid assembly and differential binning workflow for metagenomics, transcriptomics and pathway analysis
https://rvandamme.github.io/MUFFIN_Documentation/#introduction
GNU General Public License v3.0

NO VALID EXECUTION PROFILE SELECTED, use e.g. [-profile local,docker] #15

Closed Jessica-2345 closed 3 years ago

Jessica-2345 commented 3 years ago

Hello, I'm trying to use MUFFIN with Nanopore and Illumina data, but the following error appears whatever execution profile I use:

    N E X T F L O W ~ version 20.07.1
    Launching RVanDamme/MUFFIN [fabulous_colden] - revision: 5805e280b0 [master]
    NO VALID EXECUTION PROFILE SELECTED, use e.g. [-profile local,docker]

I have conda installed, and the test runs correctly:

    ./nextflow run RVanDamme/MUFFIN --output results_dir --assembler metaflye -profile local,conda,test

Do you know what could be the issue with my command line?

    nextflow run RVanDamme/MUFFIN --output results_dir --cpus 24 --memory 32g --assembler metaspades --illumina caves/fastq-merged/ --ont caves/nanopore/ --profile local,conda

Thanks in advance, Jessica

replikation commented 3 years ago

Thanks for your interest in MUFFIN. It's -profile, not --profile ;)
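For context: Nextflow's own options take a single dash (`-profile`, `-resume`), while double-dash arguments such as `--output` are passed to the pipeline as parameters, so `--profile local,conda` never selects an execution profile. Below is a minimal shell sketch of a pre-flight check for this typo. The function name `check_profile_flag` is hypothetical and is not part of MUFFIN or Nextflow.

```shell
# Hypothetical helper: reject a mistyped --profile before launching Nextflow.
# Returns 0 if no argument is --profile (or --profile=...), 1 otherwise.
check_profile_flag() {
  for arg in "$@"; do
    case "$arg" in
      --profile|--profile=*)
        echo "found '$arg': use -profile (single dash)" >&2
        return 1
        ;;
    esac
  done
  return 0
}

# Example with the arguments from the report above (kept as a comment so the
# block has no side effects when sourced):
# check_profile_flag run RVanDamme/MUFFIN --output results_dir --profile local,conda \
#   || echo "fix the flag before running"
```

The check only looks at the argument spelling; it does not validate that the named profiles exist.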

Jessica-2345 commented 3 years ago

Now that I use the correct command, I have a problem with maxbin2. I've tried the troubleshooting procedure, but I still have the same issue:

Caused by: Process maxbin2 (1) terminated with an error exit status (255)

Command executed:

    run_MaxBin.pl -contig assembly.fasta -reads 914-Mo_R1_clean.fastq -reads2 914-Mo_R2_clean.fastq -reads3 914-Mo_all.fastq -out maxbin2 -thread 8
    mkdir maxbin_bin
    mv maxbin2.*.fasta maxbin_bin/

Command exit status: 255

Command output:

    MaxBin 2.2.7
    Input contig: assembly.fasta
    Located reads file [914-Mo_R1_clean.fastq]
    Located reads file [914-Mo_R2_clean.fastq]
    Located reads file [914-Mo_all.fastq]
    out header: maxbin2
    Thread: 8
    Running Bowtie2 on reads file [914-Mo_R1_clean.fastq]...this may take a while...
    Reading SAM file to estimate abundance values...
    Running Bowtie2 on reads file [914-Mo_R2_clean.fastq]...this may take a while...
    Reading SAM file to estimate abundance values...
    Running Bowtie2 on reads file [914-Mo_all.fastq]...this may take a while...
    Reading SAM file to estimate abundance values...
    Searching against 107 marker genes to find starting seed contigs for [assembly.fasta]...
    Running FragGeneScan....
    Running HMMER hmmsearch....
    Try harder to dig out marker genes from contigs.
    Marker gene search reveals that the dataset cannot be binned (the medium of marker gene number <= 1). Program stop.

Command error:

    perl: warning: Setting locale failed.
    perl: warning: Please check that your locale settings:
        LANGUAGE = (unset),
        LC_ALL = (unset),
        LANG = "C.UTF-8"
        are supported and installed on your system.
    perl: warning: Falling back to the standard locale ("C").

Work dir: /scratch/users/j/o/jody/MUFFIN/work/8e/272392af66ef67827e87733a919dee

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh

    Oops .. something went wrong
    [c7/e495ee] Submitted process > bwa (2)
    WARN: Killing pending tasks (1)

    executor > local (19)
    [skipped ] process > sourmash_download_db [100%] 1 of 1, stored: 1 ✔
    [skipped ] process > checkm_download_db [100%] 1 of 1, stored: 1 ✔
    [aa/d36c53] process > checkm_setup_db [100%] 1 of 1 ✔
    [01/ce062a] process > discard_short (2) [100%] 4 of 4 ✔
    [30/6866a6] process > merge (2) [100%] 3 of 3 ✔
    [18/c441d4] process > fastp (1) [100%] 3 of 3 ✔
    [aa/318f82] process > spades (3) [100%] 3 of 3 ✔
    [7b/70507b] process > minimap2 (1) [ 33%] 1 of 3
    [c7/e495ee] process > bwa (2) [ 50%] 1 of 2
    [- ] process > metabat2 [ 0%] 0 of 1
    [8e/272392] process > maxbin2 (1) [ 33%] 1 of 3, failed: 1
    [- ] process > concoct [ 0%] 0 of 1
    [- ] process > refine3 -
    [- ] process > checkm -
    [- ] process > sourmash_bins -
    [- ] process > sourmash_checkm_parser -
    [skipped ] process > eggnog_download_db [100%] 1 of 1, stored: 1 ✔
    [- ] process > eggnog_bin -
    [- ] process > parser_bin -
    [be/aac0ad] process > readme_output [100%] 1 of 1 ✔
    Error executing process > 'maxbin2 (1)'

Thank you

RVanDamme commented 3 years ago

Hello,

In this particular case, the issue seems to be that MaxBin2 cannot find marker genes in your contigs.

This can be because there are no marker genes at all (unlikely, but possible); as I don't know your data, I can neither confirm nor rule this out.

Another possibility is that you don't have enough contigs with marker genes, or contigs long enough to be taken into consideration by MaxBin2.

To address the second possibility, you can either skip MaxBin2 (add --skip_maxbin2 to the run command, and don't forget -resume so that completed steps are not redone) or lower the minimum contig size that MaxBin2 accepts (the default is 1000 bases; you can change it to e.g. 200 bases by adding -min_contig_length 200 in the maxbin2.nf file: MUFFIN/modules/maxbin2.nf).
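Before choosing between the two options, it can help to check how many contigs in the assembly actually reach a given length threshold. The following is a minimal sketch using awk. The function name `count_long_contigs` is hypothetical, and `assembly.fasta` in the usage comment is simply the file name that appears in the MaxBin2 command in the log above.

```shell
# Hypothetical helper: count FASTA records whose sequence length is >= a minimum.
# $1 = FASTA file (sequences may span several lines), $2 = minimum length in bases
count_long_contigs() {
  awk -v min="$2" '
    /^>/ { if (seen && len >= min) n++; len = 0; seen = 1; next }  # new record: score the previous one
         { len += length($0) }                                      # sequence line: add to current length
    END  { if (seen && len >= min) n++; print n + 0 }               # score the last record
  ' "$1"
}

# Usage (as comments, because the file only exists in the task work dir):
# count_long_contigs assembly.fasta 1000   # contigs at the default threshold
# count_long_contigs assembly.fasta 200    # contigs at the lowered threshold
```

A large gap between the two counts would suggest that lowering the threshold gives MaxBin2 noticeably more material to work with; it does not guarantee that the extra contigs carry marker genes.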

Jessica-2345 commented 3 years ago

I used --skip_maxbin2, but I have this issue now:

Error executing process > 'refine2 (1)'

Caused by: Unknown variable 'name' -- Make sure it is not misspelt and defined somewhere in the script before using it

Source block:

    """
    mem=\$(echo ${task.memory} | sed 's/g//g')
    path_db=\$(cat ${path})
    echo \$path_db
    echo -e "\$path_db" | checkm data setRoot
    echo "checkm done"
    metawrap bin_refinement -t ${task.cpus} -m \$mem -o refined_bins -A ${bins1} -B ${bins2} -o refined_bins
    mkdir metawrap_bins/
    mv refined_bins/metawrap_70_10_bins/*.fa metawrap_bins/
    mv refined_bins/metawrap_70_10_bins.stats ${name}_binning_stats.txt
    """

As I don't know where the variable name is created, I don't know what went wrong. Thank you for your help.

RVanDamme commented 3 years ago

Thanks for reporting the issue. It has been fixed in the main branch on GitHub, so you should be able to run the command from GitHub without issue. If you have downloaded MUFFIN, please do a git pull to get the fix in your local repository.

Jessica-2345 commented 3 years ago

Dear @RVanDamme, now it seems that CheckM has an issue.

Error executing process > 'refine2 (1)'

Caused by: Process refine2 (1) terminated with an error exit status (1)

Command executed:

    mem=$(echo 10 GB | sed 's/g//g')
    path_db=$(cat path_db.txt)
    echo $path_db
    echo -e "$path_db" | checkm data setRoot
    echo "checkm done"
    metawrap bin_refinement -t 8 -m $mem -o refined_bins -A bins_dir -B fasta_bins -o refined_bins
    mkdir metawrap_bins/
    mv refined_bins/metawrap_70_10_bins/*.fa metawrap_bins/
    mv refined_bins/metawrap_70_10_bins.stats 712-Mo_binning_stats.txt

Command exit status: 1

Command output:

    /scratch/users/j/o/jody/MUFFIN/nextflow-autodownload-databases/checkm/db
    It seems that the CheckM data folder has not been set yet or has been removed. Running: 'checkm data setRoot'.
    Where should CheckM store it's data?
    Please specify a location or type 'abort' to stop trying:

    Path [/scratch/users/j/o/jody/MUFFIN/nextflow-autodownload-databases/checkm/db] exists and you have permission to write to this folder.
    (re) creating manifest file (please be patient).
    Where should CheckM store it's data?
    Please specify a location or type 'abort' to stop trying:

Unexpected error: <type 'exceptions.EOFError'>

Command error:


[CheckM - data] Check for database updates. [setRoot]


Thank you, Jessica

RVanDamme commented 3 years ago

Hi,

Can you provide the .nextflow.log file and the .command.* files that are in the process directory (/work/??/?????/)? Also, can you tell me whether you use a compressed or uncompressed CheckM database, and provide the command you used?

Thanks, Renaud
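The `.command.*` files are hidden files inside the task's work directory (the "Work dir" path printed in the error report), so they are easy to miss with a plain `ls`. The following is a minimal sketch for bundling them into one archive to attach to an issue. The function name `collect_task_logs` and the archive name in the usage comment are hypothetical.

```shell
# Hypothetical helper: bundle the hidden .command.* files of one Nextflow task.
# $1 = task work directory, $2 = output archive (.tar.gz)
collect_task_logs() {
  # cd in a subshell so the caller's directory is unchanged; the archive is
  # written relative to the caller's directory, not the work dir.
  ( cd "$1" && tar -czf - .command.* ) > "$2"
}

# Usage (as a comment; the path is the work dir from the maxbin2 report above):
# collect_task_logs /scratch/users/j/o/jody/MUFFIN/work/8e/272392af66ef67827e87733a919dee maxbin2_task.tar.gz
# The pipeline-level log is the hidden file .nextflow.log in the directory
# where `nextflow run` was launched.
```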

Jessica-2345 commented 3 years ago

Hi,

The command I used is this one:

    nextflow run ./MUFFIN/main.nf --output results_dir --cpus 8 --memory 10g --skip_maxbin2 --assembler metaspades --illumina ../coassembly/illumina/ --ont ../caves/nanopore/ -profile local,conda

The Checkm database is compressed.

.command.log .nextflow.log .command.err.txt .command.log.txt .command.run.txt

Regards, Jessica

RVanDamme commented 3 years ago

Hi Jessica,

The issue should be fixed.

Regards Renaud

Jessica-2345 commented 3 years ago

Hello again, I still have two issues to report. First, I used the same command line:

    nextflow run ./MUFFIN/main.nf --output results_dir --cpus 8 --memory 10g --skip_maxbin2 --assembler metaspades --illumina ../coassembly/illumina/ --ont ../caves/nanopore/ -profile local,conda

and the error was:

Error executing process > 'refine2 (1)'

Caused by: Process refine2 (1) terminated with an error exit status (1)

Command executed:

    mem=$(echo 10 GB | sed 's/g//g')
    path_db=$(cat path_db.txt)
    echo $path_db
    echo -e "cat << EOF\n$path_db\nEOF\n" | checkm data setRoot
    echo "checkm done"
    metawrap bin_refinement -t 8 -m $mem -o refined_bins -A bins_dir -B fasta_bins
    mkdir metawrap_bins/
    mv refined_bins/metawrap_70_10_bins/*.fa metawrap_bins/
    mv refined_bins/metawrap_70_10_bins.stats 16-Me_binning_stats.txt

Command exit status: 1

Command output:

    /scratch/users/j/o/jody/MUFFIN/nextflow-autodownload-databases/checkm/db
    Where should CheckM store it's data?
    Please specify a location or type 'abort' to stop trying:

    Path [cat << EOF] has been created and you have permission to write to this folder.
    (re) creating manifest file (please be patient).
    checkm done
    metawrap bin_refinement -t 8 -m 10 GB -o refined_bins -A bins_dir -B fasta_bins


----- Non-optional parameters -o and/or -A were not entered -----

Usage: metaWRAP bin_refinement [options] -o output_dir -A bin_folderA [-B bin_folderB -C bin_folderC]
Note: the contig names in different bin folders must be consistant (must come from the same assembly).

Options:

    -o STR          output directory
    -t INT          number of threads (default=1)
    -m INT          memory available (default=40)
    -c INT          minimum % completion of bins [should be >50%] (default=70)
    -x INT          maximum % contamination of bins that is acceptable (default=10)

    -A STR          folder with metagenomic bins (files must have .fa or .fasta extension)
    -B STR          another folder with metagenomic bins
    -C STR          another folder with metagenomic bins

    --skip-refinement       dont use binning_refiner to come up with refined bins based on combinations of binner outputs
    --skip-checkm           dont run CheckM to assess bins
    --skip-consolidation    choose the best version of each bin from all bin refinement iteration
    --keep-ambiguous        for contigs that end up in more than one bin, keep them in all bins (default: keeps them only in the best bin)
    --remove-ambiguous      for contigs that end up in more than one bin, remove them in all bins (default: keeps them only in the best bin)
    --quick                 adds --reduced_tree option to checkm, reducing runtime, especially with low memory

Command error:


[CheckM - data] Check for database updates. [setRoot]


Data location successfully changed to: cat << EOF
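The last line of this log shows what went wrong with the CheckM step: `echo -e "cat << EOF\n$path_db\nEOF\n"` sends the text of a here-document, not its result, down the pipe, so the first line the prompt reads is the literal string `cat << EOF`. The sketch below reproduces that with a stand-in reader; it does not call CheckM. The function `read_first_line` and the path value are hypothetical, and `printf` is used in place of `echo -e` only for portability.

```shell
# Stand-in for an interactive prompt: consume and print one line of stdin.
read_first_line() {
  IFS= read -r line
  printf '%s\n' "$line"
}

path_db="/path/to/checkm/db"   # example value, not a real location

# What the failing script piped: here-document syntax as plain text.
literal=$(printf 'cat << EOF\n%s\nEOF\n' "$path_db" | read_first_line)

# Piping the path itself puts the path on the first line.
direct=$(printf '%s\n' "$path_db" | read_first_line)

echo "literal form gives: $literal"   # the prompt would receive: cat << EOF
echo "direct form gives:  $direct"    # the prompt would receive the path
```

This only illustrates what a prompt receives on its first read. In the earlier report, where the path was piped directly, CheckM asked its question a second time and hit end of input, so piping a single line was not sufficient there either.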

I checked the result directory, and there is no output from metabat2: bins_dir is completely empty.

.command.log .nextflow.log

I thought that metabat2 had no legitimate output, so I tried this command:

    nextflow run ./MUFFIN/main.nf --output results_dir --cpus 8 --memory 10g --skip_maxbin2 --skip_metabat2 --assembler metaspades --illumina ../coassembly/illumina/ --ont ../caves/nanopore/ -profile local,conda -resume

but I got this error:

    N E X T F L O W ~ version 20.07.1
    Launching ./MUFFIN/main.nf [pensive_wing] - revision: 46236d7881
    [- ] process > sourmash_download_db -
    [- ] process > checkm_download_db -
    [- ] process > checkm_setup_db -
    No such variable: metawrap_out_ch

-- Check script 'MUFFIN/main.nf' at line: 463 or see '.nextflow.log' file for more details

RVanDamme commented 3 years ago

Hello! Sorry for the wait.

The first issue was due to the way I ordered the arguments in the script, so it's fixed now. As for "checkm data setRoot", the script was using a workaround that is now outdated; I updated the way the command is used, so it should be fixed now.
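One further detail is visible in the log above: the command line echoed by the script reads `-m 10 GB`, because `sed 's/g//g'` deletes only lowercase `g` and therefore leaves `10 GB` unchanged. Whether this played a part in the `-o and/or -A` error is not stated in this thread. The following is only a sketch of one way to reduce such a value to a bare integer; the helper name `mem_to_int` is hypothetical and is not taken from the MUFFIN code.

```shell
# Hypothetical helper: keep only the leading integer of a memory string.
mem_to_int() {
  printf '%s\n' "$1" | sed -E 's/^[[:space:]]*([0-9]+).*$/\1/'
}

# The substitution used in the failing script leaves the value unchanged:
echo "10 GB" | sed 's/g//g'     # prints: 10 GB

# Dropping everything after the digits yields a single token:
mem_to_int "10 GB"              # prints: 10
mem_to_int "32g"                # prints: 32
```

Quoting the variable where it is used (`-m "$mem"`) would additionally keep a value containing a space from being split into two arguments.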

As for the lack of bins in the metabat2 output, it can happen that no bins are produced for some data. That's why we use three different binning methods. I also fixed the variable handling for the case where several binners are skipped at once.

You just have to git pull and everything should run now.