rhysnewell / aviary

A hybrid assembly and MAG recovery pipeline (and more!)
GNU General Public License v3.0

Hybrid spades assembly question #206

Open aleuQUT opened 4 months ago

aleuQUT commented 4 months ago

Hey Rhys

I was wondering why the same long reads used for the metaflye assembly are also used for the spades hybrid assembly, alongside the short reads that did not map to the long-read assembly. Wouldn't it make sense to use only low-quality long reads?

Cheers, Andy

rhysnewell commented 4 months ago

Hey Andy,

It's been a while since I've looked at that section, but I've thought about this a fair bit in the past, and there are a couple of reasons:

1) To try to maximise connections between contigs in the very fragmented secondary spades assembly.

2) This is how the OG slamM assembly pipeline was laid out. It would use all the long reads in the spades assembly. It would then follow this up with unicycler, which I've made optional because it is slow and doesn't improve results. Unicycler also doesn't fix the "doubling up" usage problem, if it did occur.

I don't think either of these is a good enough reason to keep it as is. It would be trivial to filter out the long reads that haven't been used in assembly or polishing (I think?) without mapping again, but it might be better long term to try and thread re-used reads back into the main assembly.
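A minimal sketch of that filtering step, reading "filter out" as "keep only the reads not previously used". It assumes the names of the already-used long reads have been collected into a set from the existing alignments; how that set is produced is not shown. The function names (`read_fastq`, `read_id`, `keep_unused`) are hypothetical and are not part of Aviary.

```python
from typing import Iterable, Iterator, Set, Tuple

# (header, sequence, separator, quality), one 4-line FASTQ record
FastqRecord = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield 4-line FASTQ records from an iterable of text lines."""
    it = (line.rstrip("\n") for line in lines)
    for header in it:
        if not header:
            continue  # tolerate blank lines between records
        try:
            seq, sep, qual = next(it), next(it), next(it)
        except StopIteration:
            raise ValueError("truncated FASTQ record") from None
        yield header, seq, sep, qual


def read_id(header: str) -> str:
    """Read name: the header without '@', up to the first whitespace."""
    return header[1:].split()[0]


def keep_unused(
    records: Iterable[FastqRecord], used_ids: Set[str]
) -> Iterator[FastqRecord]:
    """Yield only records whose read name is NOT in used_ids.

    used_ids is assumed to hold the long reads already consumed by the
    main assembly or polishing (hypothetical input, derived elsewhere).
    """
    for rec in records:
        if read_id(rec[0]) not in used_ids:
            yield rec
```

Because the filter only compares read names, it needs no second mapping pass; the cost is one pass over the long-read FASTQ plus a set lookup per read.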

I guess I was never sure what the best method was and never got around to a better solution. If you think that the current method should be done differently then I'm happy to figure out how to get that implemented. I don't think we should limit it to just "low quality" long reads though; I'm guessing you just mean "previously unused" long reads? Keen to hear your and others' thoughts on the matter.

Cheers, Rhys

AroneyS commented 4 months ago

I guess alignment of the relevant contigs (those with reused long reads) might work. But what do you do if they disagree? Like if the same long read is present in incongruent contigs from metaflye and metaspades? Preferentially dump metaspades?
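As a rough illustration of how the "relevant contigs" might be identified before any alignment, the sketch below pairs up contigs from the two assemblies that share at least one long read. It assumes each read has already been assigned to a single contig per assembly (for example, from its primary alignment); the input dictionaries and the function name `shared_read_contig_pairs` are hypothetical. It only lists candidate pairs and does not decide which contig to keep when they disagree.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


def shared_read_contig_pairs(
    flye_reads: Dict[str, str], spades_reads: Dict[str, str]
) -> Dict[Tuple[str, str], Set[str]]:
    """Map (metaflye contig, metaspades contig) -> long reads found in both.

    Both inputs map a read name to the contig it was assigned to
    (assumed to be computed elsewhere, one contig per read).
    """
    pairs: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for read, flye_contig in flye_reads.items():
        spades_contig = spades_reads.get(read)
        if spades_contig is not None:  # read was reused by both assemblies
            pairs[(flye_contig, spades_contig)].add(read)
    return dict(pairs)
```

Each returned pair is a candidate for contig-to-contig alignment; pairs supported by many shared reads would presumably be the first ones worth checking for incongruence.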