KolmogorovLab / hapdup

Pipeline to convert a haploid assembly into diploid

read length inference bug #23

Closed WenyuLiang closed 1 year ago

WenyuLiang commented 1 year ago

Hi Mikhail, I found a bug in the read length inference step: it does not work when matches are written as X or = instead of M in the CIGAR string. The original code at line 30 of https://github.com/KolmogorovLab/hapdup/blob/main/hapdup/filter_misplaced_alignments.py is:

    for token in re.findall("[\d]{0,}[A-Z]{1}", cigar):

The "=" operation is not captured by this pattern, so those tokens are silently skipped. When I change the code as follows, the program works:

    for token in re.findall("[\d]{0,}[A-Z=]{1}", cigar):
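To make the effect concrete, here is a minimal sketch comparing the two patterns on a made-up CIGAR string. The function name `read_length_from_cigar` and the choice of operations that count toward read length (M, I, S, H, =, X, i.e. the full original read including clipped bases) are assumptions for illustration; the actual logic in `filter_misplaced_alignments.py` may differ.

```python
import re

# Patterns quoted in this issue (written as raw strings here).
OLD_PATTERN = r"[\d]{0,}[A-Z]{1}"   # misses the "=" operation
NEW_PATTERN = r"[\d]{0,}[A-Z=]{1}"  # also captures "="

# Assumption: operations counted toward the original read length.
# M, I, S, =, X consume query bases per the SAM spec; H is included
# so hard-clipped bases count too.
READ_OPS = set("MISH=X")


def read_length_from_cigar(cigar, pattern):
    """Hypothetical helper: sum lengths of read-consuming CIGAR tokens."""
    total = 0
    for token in re.findall(pattern, cigar):
        length, op = token[:-1], token[-1]
        if length and op in READ_OPS:
            total += int(length)
    return total


# Made-up example alignment using "=" / "X" instead of "M".
cigar = "5S10=1X4=2D3I5H"

old_tokens = re.findall(OLD_PATTERN, cigar)
new_tokens = re.findall(NEW_PATTERN, cigar)
old_length = read_length_from_cigar(cigar, OLD_PATTERN)
new_length = read_length_from_cigar(cigar, NEW_PATTERN)

print(old_tokens, old_length)
print(new_tokens, new_length)
```

With the old pattern the `10=` and `4=` tokens are dropped, so the inferred length comes out too short. A stricter alternative, following the CIGAR grammar in the SAM specification, would be `r"\d+[MIDNSHP=X]"`, which also rejects tokens with no length.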

mikolmogorov commented 1 year ago

Thanks for reporting, I will incorporate the fix into the next release!