rhysnewell / rosella

Metagenomic Binning Algorithm
BSD 3-Clause "New" or "Revised" License

question about the refine module of rosella #52

Open quliping opened 8 months ago

quliping commented 8 months ago

Rosella is a good binning tool, but I have a question about the workflow of its refine module. Does it only refine a single set of bins? Or can it combine bin sets from several binning tools into a single, non-redundant bin set, as DAS Tool, Binning_refiner, and metaWRAP do? If I put all bins (e.g., bins from three different binning tools) into the rosella input folder, will rosella choose the best version of each redundant bin?

Thanks.

rhysnewell commented 8 months ago

Hello,

Thanks for your question. Currently, no: Rosella will not choose a best result when provided with output from multiple different binning programs. You can, however, run rosella refine on the input and the output of DASTool to improve those results, as is done in aviary.

I might try and include this sort of functionality in future, but it is not a priority at the moment as it is quite a complex problem to tackle.

Cheers, Rhys

quliping commented 8 months ago

Thanks for your reply. I will try other bin refinement tools, but I still need your help. Here are the log and the file list from the rosella output directory. I'm not sure whether the run completed, because the binning results are extremely poor. The second and third figures show the CheckM2 results for rosella and SemiBin, respectively. For my samples, the rosella binning results are even worse than those of traditional binning tools such as MetaBAT2 and MaxBin2 (data not shown). What is going wrong? Is there any way to improve the rosella binning results? list.txt

[Image 1]

[Image 2: CheckM2 results for rosella]

[Image 3: CheckM2 results for SemiBin]

Sincerely,

Lping, Qu

marianamnoriega commented 7 months ago

Hello!! I am also currently trying rosella on a couple of samples, and it ran without problems :). However, I would appreciate some further documentation on the rosella log file, since it doesn't indicate that the software finished successfully. I didn't specify any refinement step; is it done by default? It's also not entirely clear to me what the "refined" bins are. Does this mean that I don't need any other refinement tool afterwards? For some reason the number of "refined" bins that I have is larger than the number of non-refined bins (roughly 66 refined and 58 non-refined). I would very much appreciate any light you can shed on this!

Thanks a lot!

rhysnewell commented 7 months ago

Hi @quliping,

Sorry for the delayed response here; I was on holiday and missed your reply. The poor performance of rosella is confusing, and I'd like to dig a bit deeper into it if possible. Can you post the assembly statistics for your assembly, and also the rosella command that you used? It might be that you need to incorporate shorter contigs into the binning process: by default I think rosella only uses contigs >2500 bp, but if you have a heavily fragmented assembly you may need to lower this.
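For anyone wanting to check how fragmented their assembly is before adjusting the length cutoff, something like the following rough sketch may help. It is not part of rosella: the function names are made up for illustration, the 2500 bp default simply mirrors the figure mentioned above, and treating the cutoff as inclusive is an assumption. It reports contig count, total length, N50, and the fraction of assembled bases that would survive the length filter.

```python
from typing import Dict, Iterable, List


def read_contig_lengths(fasta_lines: Iterable[str]) -> List[int]:
    """Return the length of every sequence found in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # stays None until the first header line is seen
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def assembly_summary(lengths: List[int], min_len: int = 2500) -> Dict[str, float]:
    """Summarise an assembly and how much of it passes a minimum contig length.

    min_len=2500 is an assumption taken from the discussion above, and the
    inclusive comparison (>=) is also an assumption; check the tool's help
    output for the real default and behaviour.
    """
    kept = [n for n in lengths if n >= min_len]
    total = sum(lengths)
    return {
        "contigs": len(lengths),
        "total_bp": total,
        "n50": n50(lengths),
        "contigs_kept": len(kept),
        "bp_kept": sum(kept),
        "fraction_bp_kept": sum(kept) / total if total else 0.0,
    }


# Example usage ("assembly.fasta" is a placeholder path):
# with open("assembly.fasta") as handle:
#     print(assembly_summary(read_contig_lengths(handle)))
```

If `fraction_bp_kept` turns out to be small, a large share of the assembly never reaches the binner, which would be consistent with the suggestion above to lower the minimum contig length.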

@marianamnoriega I'll add a log message specifying successful runs, but if you didn't see any error messages then it likely ran without issue. A single round of refinement is performed by default as part of the initial binning process, which is why you end up with some MAGs with "refined" in the name. You can still run further refinement on these MAGs with tools like DASTool, as is done in aviary. The number of refined bins being higher than the number of non-refined bins is normal and not something to worry about.