jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

ERROR in STEP14 -> 14.bin_maxbin.pl #144

Closed anilchauhanhp9 closed 4 years ago

anilchauhanhp9 commented 4 years ago

Hi, I am running SqueezeMeta and got an error at step 14. The whole error message was as follows:

Error running command: perl /home/anil/SqueezeMeta/bin/MaxBin/run_MaxBin.pl -thread 12 -contig /home/anil/WGM1/temp/bincontigs.fasta -abund_list /home/anil/WGM1/results/maxbin/abund.list -out /home/anil/WGM1/results/maxbin/maxbin -markerpath /home/anil/SqueezeMeta/database/marker.hmm at /home/anil/SqueezeMeta/scripts/14.bin_maxbin.pl line 123.
wc: /home/anil/WGM1/results/maxbin/: Is a directory
WARNING in STEP14 -> 14.bin_maxbin.pl. No MaxBin results!
wc: /home/anil/WGM1/results/metabat2/: Is a directory
WARNING in STEP15 -> 15.bin_metabat2.pl. No Metabat2 results!
Error running command: LD_LIBRARY_PATH=/home/anil/SqueezeMeta/lib PATH=/home/anil/SqueezeMeta/bin:$PATH /home/anil/SqueezeMeta/bin/DAS_Tool/DAS_Tool -i -l -c /home/anil/WGM1/results/01.WGM1.fasta --write_bins 1 --score_threshold 0.25 --search_engine diamond -t 12 -o /home/anil/WGM1/results/DAS/WGM1 --db_directory /home/anil/SqueezeMeta/database at /home/anil/SqueezeMeta/scripts/16.dastool.pl line 85.
Can't open /home/anil/WGM1/results/DAS/WGM1_DASTool_bins directory, no DAStool results
WARNING: File /home/anil/WGM1/results/DAS/WGM1_DASTool_bins/ is empty!. DAStool did not generate results
Skipping BIN TAX ASSIGNMENT: DAS_Tool did not predict bins

Please help me with the issue.

fpusan commented 4 years ago

Hi! Can you try running the following command:

perl /home/anil/SqueezeMeta/bin/MaxBin/run_MaxBin.pl -thread 12 -contig /home/anil/WGM1/temp/bincontigs.fasta -abund_list /home/anil/WGM1/results/maxbin/abund.list -out /home/anil/WGM1/results/maxbin/maxbin -markerpath /home/anil/SqueezeMeta/database/marker.hmm

And tell us what the error message looks like?

anilchauhanhp9 commented 4 years ago

After running this command I got this message:

MaxBin 2.2.6
Thread: 12
Input contig: /home/anil/WGM1/temp/bincontigs.fasta
out header: /home/anil/WGM1/results/maxbin/maxbin
Located abundance file [/home/anil/WGM1/results/maxbin/WGM1.abund]
Searching against 107 marker genes to find starting seed contigs for [/home/anil/WGM1/temp/bincontigs.fasta]...
Try harder to dig out marker genes from contigs.
Marker gene search reveals that the dataset cannot be binned (the medium of marker gene number <= 1).
Program stop.

fpusan commented 4 years ago

This does not seem like an error specific to SqueezeMeta. Rather, MaxBin thinks that your data cannot be binned. The fact that Metabat2 (a different binning program) also returns no bins in the next step makes me think that there is something weird happening with your data.
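To make the MaxBin message concrete, here is a minimal sketch of the kind of check it describes: stop when the median number of hits per marker gene is at most 1. This is not MaxBin's actual code; the function name `can_be_binned`, the dictionary input and the example counts are assumptions made for illustration only.

```python
from statistics import median


def can_be_binned(marker_hits, threshold=1):
    """Return True when the median hit count per marker gene exceeds threshold.

    marker_hits maps a marker gene name to the number of contigs that
    matched it (hypothetical input, not a MaxBin file format).
    """
    if not marker_hits:
        # No marker genes searched at all: nothing to seed bins with.
        return False
    return median(marker_hits.values()) > threshold


# Hypothetical example: 107 marker genes, almost none of them found.
hits = {f"marker_{i}": 0 for i in range(107)}
hits["marker_0"] = 3
print(can_be_binned(hits))
```

With counts like these the median is 0, so the check fails in the same way as the "cannot be binned" message in the log.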

Actually, we ourselves have reported a similar issue to MaxBin developers in the past: https://sourceforge.net/p/maxbin2/tickets/6/

Try running MaxBin again, adding the -min_contig_length 200 flag at the end of the command.
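Before rerunning, it may help to see how many contigs actually survive a given length cutoff. The sketch below counts contigs at or above a threshold in FASTA-formatted lines. The helper names are hypothetical, the inclusive comparison is an assumption about how a cutoff would be applied, and the value 1000 is used on the assumption that it is MaxBin's default minimum contig length.

```python
def contig_lengths(fasta_lines):
    """Yield (name, length) for each record in an iterable of FASTA lines."""
    name, length = None, 0
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, length
            # Keep only the first word of the header as the contig name.
            parts = line[1:].split()
            name, length = (parts[0] if parts else ""), 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length


def count_passing(fasta_lines, min_len):
    """Count contigs whose length is at least min_len (inclusive cutoff assumed)."""
    return sum(1 for _, n in contig_lengths(fasta_lines) if n >= min_len)


# Tiny in-memory example; for a real file, pass an open file handle instead.
example = [">c1 some description", "ACGT", "ACGT", ">c2", "AC"]
for cutoff in (1000, 200, 5):
    print(cutoff, count_passing(example, cutoff))
```

For the file in this thread, one would open /home/anil/WGM1/temp/bincontigs.fasta and pass the handle to `count_passing` with cutoffs of 1000 and 200 to see how much extra sequence the lower threshold lets in.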

anilchauhanhp9 commented 4 years ago

Hi, I ran the following command as per your suggestion:

perl /home/anil/SqueezeMeta/bin/MaxBin/run_MaxBin.pl -thread 12 -contig /home/anil/WGM1/temp/bincontigs.fasta -abund_list /home/anil/WGM1/results/maxbin/abund.list -out /home/anil/WGM1/results/maxbin/maxbin -markerpath /home/anil/SqueezeMeta/database/marker.hmm -min_contig_length 200

It gave me this error:

MaxBin 2.2.6
Thread: 12
Input contig: /home/anil/WGM1/temp/bincontigs.fasta
out header: /home/anil/WGM1/results/maxbin/maxbin
Min contig length: 200
Located abundance file [/home/anil/WGM1/results/maxbin/WGM1.abund]
Searching against 107 marker genes to find starting seed contigs for [/home/anil/WGM1/temp/bincontigs.fasta]...
Try harder to dig out marker genes from contigs.
Marker gene search reveals that the dataset cannot be binned (the medium of marker gene number <= 1).
Program stop.

fpusan commented 4 years ago

Then it seems that something weird is happening with your data. Can you send me the following files?

/path/to/your/project/results/10.*.mappingstat
/path/to/your/project/results/11.*.mcount

And, if present, also /path/to/your/project/results/22.*.stats

anilchauhanhp9 commented 4 years ago

mappingstat and mcount.tar.gz

The files you asked for are attached.

fpusan commented 4 years ago

It seems that, for the most part, all your contigs are classified as "unknown", which may explain why MaxBin is not finding marker genes. It is interesting, because most of the reads seem to map back to the assembly, indicating that the problem is not related to a bad assembly. What kind of samples are you working with?
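As a rough illustration of the two observations above (many "unknown" contigs, yet most reads mapping to the assembly), the sketch below computes the share of contigs labelled unknown and the share of mapped reads that land on them. The input is a hypothetical list of (taxon, mapped_reads) pairs, not the actual layout of the mappingstat or mcount files, and the function name is made up for this example.

```python
def unknown_share(contigs, unknown_label="unknown"):
    """Return (fraction of contigs, fraction of mapped reads) labelled unknown.

    contigs is an iterable of (taxon, mapped_reads) pairs; this layout is
    an assumption for illustration, not a SqueezeMeta file format.
    """
    contigs = list(contigs)
    total_reads = sum(reads for _, reads in contigs)
    unknown = [(t, r) for t, r in contigs if t.lower() == unknown_label]
    contig_frac = len(unknown) / len(contigs) if contigs else 0.0
    read_frac = sum(r for _, r in unknown) / total_reads if total_reads else 0.0
    return contig_frac, read_frac


# Hypothetical numbers, chosen only to show the calculation.
print(unknown_share([("Unknown", 90), ("Bacteria", 10)]))
```

A high value for both fractions would match the situation described here: the assembly recruits reads well, but the contigs carry little recognisable annotation.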

anilchauhanhp9 commented 4 years ago

Sir, they are soil samples taken from a glacier forefield, and sequencing was done with Illumina.

fpusan commented 4 years ago

Hi! Does your data by any chance consist only of 16S/18S amplicons?

anilchauhanhp9 commented 4 years ago

No, it's whole-genome metagenome data. But just asking for my own knowledge: if my data were 16S/18S amplicons, what would happen? Would the process stop at the binning step?

fpusan commented 4 years ago

If you only had amplicons, I think MaxBin would not have found the marker genes it needs.

anilchauhanhp9 commented 4 years ago

OK, thank you for the information.

fpusan commented 4 years ago

Closing due to lack of activity. Feel free to reopen.