bxlab / metaWRAP

MetaWRAP - a flexible pipeline for genome-resolved metagenomic data analysis
MIT License

binning refinement error message: IndexError: list index out of range #360

Open Valentin-Bio-zz opened 3 years ago

Valentin-Bio-zz commented 3 years ago

I ran the bin refinement module on my 3 bin sets. The programs I used were groopm2, maxbin2 and metabat2.

First, maxbin2 gave me 312 bins, far more than groopm2 and metabat2 (95 and 74 bins respectively). Could this be a problem?

Second, I got the following error message:

The number of refined bins: 60
Exporting refined bins...
Add folder/bin name to contig name for binsB bins
Add folder/bin name to contig name for binsC bins
Add folder/bin name to contig name for binsB bins
Add folder/bin name to contig name for binsC bins
Combine all bins together
Combine all bins together
Combine all bins together
Traceback (most recent call last):
  File "/datos/vberrios/metaWRAP/bin/metawrap-scripts/binning_refiner.py", line 193, in <module>
    bin_name = each_id_split[1]
IndexError: list index out of range
Traceback (most recent call last):
  File "/datos/vberrios/metaWRAP/bin/metawrap-scripts/binning_refiner.py", line 193, in <module>
    bin_name = each_id_split[1]
IndexError: list index out of range
Traceback (most recent call last):
  File "/datos/vberrios/metaWRAP/bin/metawrap-scripts/binning_refiner.py", line 193, in <module>
    bin_name = each_id_split[1]
IndexError: list index out of range
Extracting refined bin: Refined_60.fasta
Deleting temporary files

What could be wrong? Could it be related to the difference in the number of bins produced by the different binning programs?

ursky commented 3 years ago

It should be able to handle different bin counts no problem... Can you check the intermediate bin files (i.e. AB, ABC, etc) to see if every intermediate has some bins in it? From the error I think one of the preliminary bin sets is empty for some reason.
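A quick way to run that check is to count the FASTA files under each intermediate folder. The following is only a minimal sketch: the function names are hypothetical, the folder layout (one subdirectory per combination, possibly with a nested "Refined" directory) is assumed from this thread, and treating `.fa`, `.fasta` and `.fna` as the bin extensions is also an assumption.

```python
from pathlib import Path

# Assumption: bins are FASTA files with one of these extensions.
FASTA_EXTENSIONS = (".fa", ".fasta", ".fna")


def count_fasta_files(directory):
    """Count FASTA-looking files anywhere under `directory` (recursive)."""
    return sum(
        1
        for p in Path(directory).rglob("*")
        if p.is_file() and p.suffix.lower() in FASTA_EXTENSIONS
    )


def summarize_subdirs(parent):
    """Map each immediate subdirectory of `parent` to its FASTA file count."""
    return {
        sub.name: count_fasta_files(sub)
        for sub in sorted(Path(parent).iterdir())
        if sub.is_dir()
    }
```

Calling `summarize_subdirs()` on the refinement output folder returns a dictionary such as `{"AB": 0, "AC": 12}` for a hypothetical run; any combination that maps to 0 is a candidate for the empty bin set.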

ursky commented 3 years ago

Also is it possible that one of the bin sets changed the contig naming?
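One way to test for renamed contigs is to compare the headers in each bin file against the headers in the assembly. This is a rough sketch rather than anything metaWRAP does itself: the function names are hypothetical, and it assumes the contig name is the first whitespace-delimited token of each `>` header line.

```python
from pathlib import Path

# Assumption: bins are FASTA files with one of these extensions.
FASTA_EXTENSIONS = (".fa", ".fasta", ".fna")


def read_contig_ids(fasta_path):
    """Return the set of contig names found in the headers of a FASTA file."""
    ids = set()
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                parts = line[1:].split()
                if parts:  # skip malformed empty headers
                    ids.add(parts[0])
    return ids


def contigs_missing_from_assembly(bin_dir, assembly_fasta):
    """Map each bin file name to the contig names absent from the assembly.

    Bins whose contigs all occur in the assembly are left out of the result.
    """
    assembly_ids = read_contig_ids(assembly_fasta)
    report = {}
    for path in sorted(Path(bin_dir).iterdir()):
        if path.is_file() and path.suffix.lower() in FASTA_EXTENSIONS:
            missing = read_contig_ids(path) - assembly_ids
            if missing:
                report[path.name] = missing
    return report
```

An empty dictionary means every contig name in the bin set also occurs in the assembly, which would rule out renaming as the cause.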

Valentin-Bio-zz commented 3 years ago

Thanks for your answer. I checked the Refined ABC, AB, BC and AC folders, and only the AC folder has the "Refined" directory. In that directory there are fasta files whose contig names match the contig names in the coassembly fasta file (provided by the assembler). The bins{A,B,C} directories all contain fasta files whose contig names also match those in the coassembly fasta file.

Here is the full error message: https://www.dropbox.com/s/y00tqo3i1o70h3c/refinement.error.out?dl=0

ursky commented 3 years ago

I'm not sure... The error is obviously coming from Binning_refiner, but this is the first time I'm having issues with this. My only advice would be to check if there is anything potentially strange about the naming of either the bins or your contig naming that could be throwing Binning_refiner off.

Valentin-Bio-zz commented 3 years ago

OK, now I am fairly sure something is wrong with the maxbin2 bins. Checking the combined bin sets (AB, ABC, BC, AC), the only one that actually has bins is BC, so every combination that uses the "A" bins (maxbin2 in my case) produces the error. The maxbin2 bins have the same contig names that appear in the assembly fasta, so the problem must be something else.

tmmendoza-mbb commented 1 year ago

Hi! Just wanted to ask if you were able to solve this problem. I am encountering the same problem with the missing "Refined" directories inside the supposed combined bins. I am also combining bins from metabat and maxbin2. I double-checked the contig naming within each bin before running, and made sure the names are the same.

jtamames commented 1 year ago

Hello, I also had this issue, but I was able to fix it. It seems that metaWRAP expects to find only bins in the bin directories specified by -A, -B, -C. But sometimes the binners put other things in them, like temporary files, files with counts, etc. This does not seem to be a problem for concoct and metabat2, but the maxbin2 directory contains files that metaWRAP wrongly identifies as bins. Removing these files solves the issue. A nice fix for this would be to check only for .fa, .fasta or .fna files in the binners' directories.

Best, J
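Until such a check exists, the workaround can be done without deleting anything by copying only the bin files into a clean directory and pointing -A, -B or -C at that copy. This is a minimal sketch of that idea, not metaWRAP's own code: the function names are hypothetical, and the extension list is the one suggested in the comment above.

```python
import shutil
from pathlib import Path

# Extensions proposed in this thread as the ones that identify real bins.
BIN_EXTENSIONS = (".fa", ".fasta", ".fna")


def split_bin_dir(bin_dir):
    """Return (bins, others): files with a bin extension vs. everything else."""
    bins, others = [], []
    for path in sorted(Path(bin_dir).iterdir()):
        if path.is_file() and path.suffix.lower() in BIN_EXTENSIONS:
            bins.append(path)
        else:
            others.append(path)
    return bins, others


def copy_bins_only(bin_dir, clean_dir):
    """Copy only the bin files into `clean_dir`; return the copied file names."""
    clean = Path(clean_dir)
    clean.mkdir(parents=True, exist_ok=True)
    bins, _ = split_bin_dir(bin_dir)
    for path in bins:
        shutil.copy2(path, clean / path.name)
    return [path.name for path in bins]
```

Inspecting the second list returned by `split_bin_dir()` before copying shows which extra files the binner left behind, which is a useful sanity check in case the real bins use an extension outside this list.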