mahulchak / quickmerge

A simple and fast metassembler and assembly gap filler designed for long molecule based assemblies.
GNU General Public License v3.0

Does running iterations of quickmerge on quickmerge output create artifacts? #56

Open 000generic opened 4 years ago

000generic commented 4 years ago

I have multiple assemblies from hybrid and long-read sequencing, and quickmerge lets me run it on its own output multiple times. I'm seeing increases in contig lengths and N50, but I'm wondering how much of this might be due to artifacts. I'm guessing artifacts are unlikely in cases of gap filling, but is quickmerge able to add sequence onto the ends of contigs? If so, maybe artifacts could arise over repeated iterations of quickmerge.

Thank you :)

mahulchak commented 4 years ago

I would be very cautious if I noticed sudden big jumps in contiguity that did not happen with other merging steps. Use stringent ml or l cutoffs to make sure that sequences are not getting added because of repeats. You can also use an alignment length filter (in delta-filter) to filter out small alignments.

If you see that a couple of joins account for most of the increase in contiguity and you suspect misjoins are the cause, you can inspect the merged contigs.
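As a rough illustration of the stricter settings suggested above, the two commands could be assembled as below. This is a sketch under stated assumptions, not a tested pipeline: the flags (`delta-filter -r -q -l` and `quickmerge -d -q -r -hco -c -l -ml -p`) follow the usage shown in the quickmerge README and MUMmer's delta-filter help and should be checked against the installed versions. The helper names, file names, and cutoff values are placeholders invented for the example.

```python
import shlex


def delta_filter_cmd(delta_in, min_aln_len):
    """Build a delta-filter command that drops alignments shorter than min_aln_len.

    Assumed flags: -r and -q keep the best alignment set per reference and per
    query; -l sets the minimum alignment length. delta-filter writes to stdout.
    """
    return ["delta-filter", "-r", "-q", "-l", str(min_aln_len), delta_in]


def quickmerge_cmd(delta, query, reference, seed_len, merge_len, prefix,
                   hco=5.0, c=1.5):
    """Build a quickmerge command with explicit l (seed_len) and ml (merge_len)."""
    return [
        "quickmerge", "-d", delta, "-q", query, "-r", reference,
        "-hco", str(hco), "-c", str(c),
        "-l", str(seed_len), "-ml", str(merge_len),
        "-p", prefix,
    ]


if __name__ == "__main__":
    # Placeholder file names and cutoffs; nothing is executed, only printed.
    print(shlex.join(delta_filter_cmd("out.delta", 10000)), "> out.rq.delta")
    print(shlex.join(quickmerge_cmd(
        "out.rq.delta", "hybrid.fasta", "self.fasta",
        seed_len=1000000, merge_len=20000, prefix="merged")))
```

Printing the commands rather than running them makes it easy to review the cutoffs before launching a merge, and to rerun with a stricter ml if a round produces a suspicious jump in contiguity.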


-- Mahul Chakraborty Department of Ecology and Evolutionary Biology University of California-Irvine Phone: 949 824 9559 Fax: 949 824 9559 Website: https://mahulchakraborty.wordpress.com/ Github: https://github.com/mahulchak

000generic commented 4 years ago

I set l to N50 and ml to 10,000, but maybe that is too low. I'll try to explore the quickmerge output; I'm still getting used to things like SAM and delta files. Maybe I will just go with the likely best assembly after just one round of quickmerge.
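For reference, the N50 used as the l cutoff can be computed directly from contig lengths. The sketch below is a minimal, standalone calculation; the function names are made up for the example, and it reads FASTA as plain lines rather than through any particular library.

```python
def fasta_lengths(lines):
    """Yield the sequence length of each record from an iterable of FASTA lines."""
    length = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths):
    """Return the length L such that contigs of length >= L hold at least half of all bases."""
    lengths = sorted(lengths, reverse=True)
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    records = [">contig_a", "ACGTAC", ">contig_b", "GG"]
    print(n50(fasta_lengths(records)))
```

Running the same function on the assembly before and after each quickmerge round gives a quick check on whether N50 jumped more than expected.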

We will run Dovetail on things after scaffolding - so hopefully it can also fix misjoins by quickmerge.

I have two hybrid and two long-read assemblies from 4 different assemblers. What if I were to pool 3 of the initial assemblies to act as the self-assembled long-read assembly, and then do a single round of quickmerge with the best of the initial hybrid assemblies? Maybe this would let me leverage all the contiguity without inducing excessive artifacts from iterations of quickmerge.
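If assemblies are pooled into one file as described, contig names from different assemblers may collide. The sketch below shows one way to concatenate FASTA records while prefixing each header with an assembly label. The labels and function name are hypothetical, and this only handles naming; it does nothing about redundant contigs in the pooled file, and the pooling strategy itself is untested in this thread.

```python
def pool_fasta(assemblies):
    """Yield FASTA lines from several assemblies, prefixing headers with a label.

    assemblies: dict mapping a label (hypothetical, e.g. an assembler name)
    to an iterable of FASTA lines.
    """
    for label, lines in assemblies.items():
        for line in lines:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                yield ">" + label + "_" + line[1:]
            else:
                yield line


if __name__ == "__main__":
    pooled = pool_fasta({
        "asmA": [">tig1\n", "ACGT\n"],
        "asmB": [">tig1\n", "GG\n"],
    })
    print("\n".join(pooled))
```

Keeping the label in each header also makes it possible to trace which source assembly contributed to a given join when inspecting the merged contigs afterwards.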

mahulchak commented 4 years ago

Sounds good.

I guess you could try that strategy. I have not tried that before so I will be curious to see what happens.

