ExaScience / elprep

elPrep: a high-performance tool for analyzing sequence alignment/map files in sequencing pipelines.

Added a filter to output perfectly mapping reads only (soft-clipping ok) #8

Closed by leonorpalmeira 7 years ago

leonorpalmeira commented 7 years ago

The following option was added:

--output-exact-mapping-reads

It calls a new filter named OutputExactMappingReads, which checks each read's CIGAR string. Only matching bases ['M'] and soft-clipped bases ['S'] are allowed in the CIGAR string. More precisely, this is achieved by forbidding all other operation characters: a read is kept only if its CIGAR string contains none of ['IDNHPX='].
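
To illustrate the check described above, here is a minimal sketch in Python. It is not the code from this pull request: the function name `is_exact_mapping` and the constant `FORBIDDEN_CIGAR_OPS` are hypothetical names chosen for the example, and the only thing taken from the description is the set of forbidden characters.

```python
# Hypothetical sketch of the CIGAR check; not elPrep's actual implementation.
# The forbidden operation characters are taken from the description above.
FORBIDDEN_CIGAR_OPS = frozenset("IDNHPX=")


def is_exact_mapping(cigar: str) -> bool:
    """Return True if the CIGAR string contains no forbidden operation.

    Digits (operation lengths), 'M' and 'S' pass through untouched, so a
    read is accepted exactly when none of I, D, N, H, P, X, = appears.
    """
    return not any(ch in FORBIDDEN_CIGAR_OPS for ch in cigar)


if __name__ == "__main__":
    for cigar in ["100M", "5S95M", "50M2I48M", "10H90M"]:
        print(cigar, is_exact_mapping(cigar))
```

A check written this way also accepts an unavailable CIGAR (`*`), since it contains none of the forbidden characters; whether the actual filter treats that case separately is not stated in the description.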