Closed mrvollger closed 4 years ago
Hi Mitchell, Sorry for the confusion, but that is expected. Please see Fig. 4 in https://academic.oup.com/nar/article/44/19/e147/2468393 We gave priority to the reference because the reference (often from Canu or a similar assembler) used to be more accurate. If you want to use the query sequence in your final assembly, you may have to chop the reference contigs into smaller fragments (so that the reference is less contiguous than the query) and then use that chopped reference as the query.
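For anyone who wants to try the chopping step, here is a minimal sketch in plain Python. It is not part of quickmerge: the function names, the `_fragN` naming scheme, and the choice of consecutive, non-overlapping pieces are all assumptions made for illustration, and the fragment size has to be picked by hand so that the chopped assembly ends up less contiguous than the other one.

```python
def read_fasta(handle):
    """Yield (name, sequence) pairs from an open FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the contig name.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def chop_contig(name, seq, max_len):
    """Split one contig into consecutive, non-overlapping pieces of at most
    max_len bases (hypothetical naming scheme: <name>_frag<index>)."""
    for i, start in enumerate(range(0, len(seq), max_len)):
        yield f"{name}_frag{i}", seq[start:start + max_len]


def chop_fasta(in_handle, out_handle, max_len, width=60):
    """Write a chopped copy of the input FASTA, wrapped at `width` columns."""
    for name, seq in read_fasta(in_handle):
        for frag_name, frag in chop_contig(name, seq, max_len):
            out_handle.write(f">{frag_name}\n")
            for j in range(0, len(frag), width):
                out_handle.write(frag[j:j + width] + "\n")

# Example call (file names and the 1 Mb fragment size are placeholders):
#   with open("reference.fasta") as src, open("reference.chopped.fasta", "w") as dst:
#       chop_fasta(src, dst, max_len=1_000_000)
```

The chopped file can then be passed to quickmerge in the query role, as suggested above. Whether hard cuts like these are the best choice, or whether cutting at specific positions works better, is not something this thread settles.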
On Sat, Apr 25, 2020 at 8:13 PM Mitchell Robert Vollger <notifications@github.com> wrote:
Hi,
I am merging two assemblies, and I am finding that the merged output is dominated by the reference rather than the query, which is not what I expected based on the documentation.
Below I am showing my contig(s) aligned to chrX in hg38 for both my reference and query. The contigs are colored in alternating blue and orange bars.
My reference looks like: [image] https://user-images.githubusercontent.com/6935283/80296409-d7aeef00-872f-11ea-8d86-a89b634f9b2e.png
My query looks like: [image] https://user-images.githubusercontent.com/6935283/80296421-ef867300-872f-11ea-96f2-2c20b079c45f.png
And my merge looks like: [image] https://user-images.githubusercontent.com/6935283/80296444-2fe5f100-8730-11ea-9d83-73e9b4da23d6.png
This looks great to start; however, if I align my merged sequence back to the reference sequence, I find that the bases match perfectly, implying that none of the bases from the query were used. I thought that the reference would only be used to bridge the gaps in the query, not to totally replace the smaller contigs from the query. Is this the expected behavior?
Thanks! Mitchell
-- Mahul Chakraborty Department of Ecology and Evolutionary Biology University of California-Irvine Phone: 949 824 9559 Fax: 949 824 9559 Website: https://mahulchakraborty.wordpress.com/ Github: https://github.com/mahulchak
I should have looked at the paper; thanks for the clarification.
I like your suggestion and I may try it. Thanks!
You might consider changing/clarifying this line on your wiki page:
So the merged assembly receives the most sequences from the query assembly, and the reference assembly provides only the sequences that bridge gaps in the query assembly.
Cheers, Mitchell