mehrdadbakhtiari / adVNTR

A tool for genotyping Variable Number Tandem Repeats (VNTR) from sequence data
http://advntr.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Remove unnecessary viterbi on reverse complement #57

Closed Jong-hun-Park closed 2 years ago

Jong-hun-Park commented 2 years ago

Description

This commit fixes two bugs related to reverse-complement handling.

  1. Logging reverse complement sequence in PacBio mode.
  2. Not calling Viterbi on reverse complement sequence.

Details

1. Logging reverse complement sequence in PacBio mode.

Why does it matter?

2. Not calling Viterbi on reverse complement sequence.

If a read is mapped, we already know which strand it maps to. Therefore, we don't need to call Viterbi on both the sequence and its reverse complement, because we know which of the two matches the reference (forward strand). For unmapped reads, we try both and keep the "forward sequence", i.e. the orientation with the higher score.

This fix is expected to reduce the running time significantly, since Viterbi accounts for most of it.
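
For illustration, the strand logic in item 2 can be sketched roughly as follows. This is a minimal sketch, not the code in this PR: `pick_forward_sequence`, `toy_score`, and the `is_mapped` flag are hypothetical names, `toy_score` is only a stand-in for the real Viterbi scoring, and the sketch assumes that a mapped read's sequence is already stored in reference (forward) orientation, as it is in the SEQ field of a SAM/BAM record.

```python
from typing import Callable, Tuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def toy_score(seq: str, motif: str = "ACG") -> float:
    """Hypothetical stand-in for the Viterbi score: counts motif occurrences."""
    return float(seq.count(motif))


def pick_forward_sequence(
    seq: str,
    is_mapped: bool,
    score_fn: Callable[[str], float] = toy_score,
) -> Tuple[str, float, int]:
    """Return (forward_sequence, score, number_of_score_calls).

    Assumption: for a mapped read, `seq` is already in forward orientation,
    so a single scoring call is enough.
    """
    if is_mapped:
        # Strand is known from the alignment: score once.
        return seq, score_fn(seq), 1

    # Unmapped read: strand is unknown, so score both orientations
    # and keep the one with the higher score.
    rc = reverse_complement(seq)
    fwd_score, rc_score = score_fn(seq), score_fn(rc)
    if fwd_score >= rc_score:
        return seq, fwd_score, 2
    return rc, rc_score, 2


if __name__ == "__main__":
    print(pick_forward_sequence("ACGACG", is_mapped=True))
    print(pick_forward_sequence("CGTCGT", is_mapped=False))
```

In this sketch a mapped read costs one scoring call instead of two, which is where the expected saving comes from when the scoring step dominates the running time.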