PacificBiosciences / pbbioconda

PacBio Secondary Analysis Tools on Bioconda. Contains list of PacBio packages available via conda.
BSD 3-Clause Clear License

pbmm2 gives many alignments for one read #618

Closed: Soleilcode closed this issue 9 months ago

Soleilcode commented 9 months ago

Operating system: Linux

Package name: pbmm2 1.13.0

Using:
pbmm2 : 1.13.0 (commit v1.13.0-2-gbcd99f5)
pbbam : 2.4.99 (commit v2.4.0-23-g59248fe)
pbcopper : 2.3.99 (commit v2.3.0-28-ga9b1ffa)
boost : 1.81
htslib : 1.17
minimap2 : 2.26
zlib : 1.2.13

Describe the bug

I am using pbmm2 align to map my subreads BAM to hg38. I extracted chimeric reads with grep "SA:Z:", but I found that pbmm2 reports many alignments for a single read. The mapped positions of these alignments are only a few bases to a few dozen bases apart, yet they have different MAPQ values, and more than one has a MAPQ of 60. How can I find the best hit for further analysis? For example:

m64032_190606_013330/61472875/0_54847 2064 chr2 32916197 60 13392S5=1D8=……
m64032_190606_013330/61472875/0_54847 2064 chr2 32916197 36 28735S5=1X3=……
m64032_190606_013330/61472875/0_54847 2064 chr2 32916198 1 18560S8=1D4=……
m64032_190606_013330/61472875/0_54847 2064 chr2 32916199 60 31777S3=1X3=……
m64032_190606_013330/61472875/0_54847 2064 chr2 32916199 55 6563S7=1X4=……
m64032_190606_013330/61472875/0_54847 2064 chr2 32916209 60 27987S5=1X7=……
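
A note on the FLAG column in the records above: all six carry the value 2064, which under the SAM specification is 0x800 (supplementary alignment) plus 0x10 (reverse strand). The following is a minimal sketch for decoding such flags. The helper name `decode_flag` is illustrative only and is not part of pbmm2 or any library.

```python
from typing import List

# Bit values of the FLAG field as defined in the SAM specification.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag: int) -> List[str]:
    """Return the names of all bits set in a SAM FLAG value (hypothetical helper)."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


# The flag shared by every record in the example above.
print(decode_flag(2064))  # ['reverse_strand', 'supplementary']
```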

To Reproduce

pbmm2 align /public1/data/heke/pacbio/hg38_noalt.mmi subreads/m64032_190605_101534.subreads.bam /public1/data/heke/pacbio/mapping/test/noalt_aligned.m64032_190605_101534.subreads.bam --preset SUBREAD --median-filter

armintoepfer commented 9 months ago

The best alignment is the first one reported, as long as you don't sort the output.
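
Following that advice, one possible way to pick the best hit is to keep only the first record seen for each read name in the unsorted output. This is a rough sketch under that assumption, not an official pbmm2 recipe: the function name `first_alignment_per_read` is made up for illustration, and the demo lines reuse only the first five columns of the records quoted in the question, since the remaining columns are elided there. It is only meaningful if the record order is exactly as pbmm2 wrote it, so it should not be run on a sorted file.

```python
from typing import Dict, Iterable, List


def first_alignment_per_read(sam_lines: Iterable[str]) -> Dict[str, List[str]]:
    """Keep the first SAM record encountered for each read name (QNAME).

    Assumes the input preserves the aligner's original output order.
    """
    first: Dict[str, List[str]] = {}
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        qname = fields[0]
        if qname not in first:
            first[qname] = fields
    return first


# Demo input: first five columns (QNAME, FLAG, RNAME, POS, MAPQ) of three
# records from the question; the later columns are omitted because they are
# truncated in the original post.
demo = [
    "m64032_190606_013330/61472875/0_54847\t2064\tchr2\t32916197\t60",
    "m64032_190606_013330/61472875/0_54847\t2064\tchr2\t32916197\t36",
    "m64032_190606_013330/61472875/0_54847\t2064\tchr2\t32916198\t1",
]

for qname, fields in first_alignment_per_read(demo).items():
    print(qname, "pos:", fields[3], "mapq:", fields[4])
```

In practice the lines could come from the text output of `samtools view` on the unsorted BAM, read line by line from standard input.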