See manpage: minimap.1.
255 is the mapping quality, but minimap does not compute it.
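For anyone scripting against this output, here is a minimal sketch of splitting one hit line into named fields. The column names are my own labels for a PAF-style layout and are assumptions, not names taken from minimap itself; minimap.1 remains the authoritative description. The only column the thread confirms directly is the twelfth (mapping quality, reported as 255).

```python
# Hypothetical field names for a PAF-style hit line; check minimap.1 for
# the authoritative column descriptions.
FIELDS = [
    ("qname", str), ("qlen", int), ("qstart", int), ("qend", int),
    ("strand", str),
    ("tname", str), ("tlen", int), ("tstart", int), ("tend", int),
    ("n_match", int), ("block_len", int), ("mapq", int),
]


def parse_hit(line):
    """Split one whitespace-separated hit line into a dict of typed fields."""
    cols = line.split()
    if len(cols) < len(FIELDS):
        raise ValueError("expected at least %d columns, got %d" % (len(FIELDS), len(cols)))
    rec = {name: cast(value) for (name, cast), value in zip(FIELDS, cols)}
    # Anything after the fixed columns is kept as raw optional tags.
    rec["tags"] = cols[len(FIELDS):]
    return rec


line = ("000000F_quiver_patched 36230911 1450029 1466108 - "
        "chr19 58617616 192435 209606 8619 17171 255 cm:i:1248")
hit = parse_hit(line)
print(hit["tname"], hit["tstart"], hit["tend"], hit["mapq"])
```

Since 255 only marks the mapping quality as missing, a downstream filter should not treat it as a high-confidence score.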
More Qs:
Are you chaining in both directions? i.e. no overlapping query?
-O depends on -r right? Is -r only in query space?
Thanks.
Are you chaining in both directions?
Yes. Requiring co-linearity.
i.e. no overlapping query?
In one hit, the minimizers are strictly co-linear. However, different hits may have overlaps on query – if that is what you mean by overlapping.
-O depends on -r right?
Yes.
Is -r only in query space?
-r is in the "diagonal space". Say query and database sequences are on the same strand (for simplicity). Hits are clustered based on "x-y", where x is the coordinate on the query and y the coordinate on the database sequence.
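To make the "diagonal space" idea concrete, here is a rough sketch of grouping matches by x-y. It is not minimap's actual implementation: the function name, the greedy single-pass grouping, and the toy coordinates are all assumptions chosen for illustration, and `r` simply stands in for the bandwidth that -r controls.

```python
def cluster_by_diagonal(matches, r):
    """Greedily group (x, y) matches whose diagonals x - y are within r
    of the previous match's diagonal. Illustrative only."""
    clusters = []
    last_diag = None
    # Sort by diagonal so neighbouring diagonals are adjacent in the loop.
    for x, y in sorted(matches, key=lambda m: (m[0] - m[1], m[0])):
        diag = x - y
        if last_diag is not None and diag - last_diag <= r:
            clusters[-1].append((x, y))
        else:
            clusters.append([(x, y)])
        last_diag = diag
    return clusters


# Toy data: three matches on diagonal -1000, two shifted by a 50 bp indel
# (diagonal -1050), and two on a distant diagonal (4900).
matches = [(100, 1100), (200, 1200), (300, 1300),
           (400, 1450), (500, 1550),
           (5000, 100), (5100, 200)]

print(len(cluster_by_diagonal(matches, r=500)))  # wide band
print(len(cluster_by_diagonal(matches, r=10)))   # narrow band
```

With the wide band, the matches shifted by 50 bp land in the same cluster as their neighbours, so the indel is absorbed into one long hit; with the narrow band they form a separate cluster.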
different hits may have overlaps on query
By this I mean minimap is a multi-mapper. BWA-MEM is a best mapper by default. BWA-MEM may work as a multi-mapper, but it often misses hits when there are more than several similar hits.
I'm mapping human contigs in the Mb+ size range to GRCh38. I'm using minimap to identify large stretches of contiguity, then using BWA-MEM to align through those regions. I'm concerned about losing large inversions and deletions if I only use BWA-MEM. Is that reasonable? I'd like to avoid having overlapping contigs relative to the reference genome.
I don't have enough experience to give a good recommendation. My gut feeling is that for large events, minimap will do better as it gives you most of the long hits. However, if you use the default -r, you may lose smaller events, as their minimizer matches will be grouped with the larger chain. Also, bwa-mem does not work well for Mb+ contigs: it is too slow. I believe BLASR will be better.
Thanks for all your help.
minimap followed by BLASR or BWA-MEM should recover all of the small events.
Dear Heng,
I'm not sure the output matches the README. I'm guessing the output format is:
000000F_quiver_patched 36230911 1450029 1466108 - chr19 58617616 192435 209606 8619 17171 255 cm:i:1248
Thanks.
--Zev