The accuracy filter is applied to HiFi reads and is disabled for Illumina short reads for now.
The filter applies stricter criteria when recruiting reads for genotyping, based on
1) a higher minimum flanking read length
2) fewer mismatches allowed in the flanking region
3) a higher number of supporting reads required for each repeat count before it is considered in computing the genotype
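
The three criteria above can be sketched roughly as follows. This is an illustration only, not the code in this commit: the names `ReadEvidence` and `filter_repeat_counts`, the `accuracy_filter` flag, and all threshold values are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class ReadEvidence:
    """Hypothetical per-read summary used only for this sketch."""
    repeat_count: int
    left_flank_len: int
    right_flank_len: int
    flank_mismatch_rate: float


def filter_repeat_counts(
    reads: Iterable[ReadEvidence],
    accuracy_filter: bool = True,          # e.g. on for HiFi, off for Illumina
    min_flank_len: int = 50,               # illustrative threshold, not the real one
    max_flank_mismatch_rate: float = 0.1,  # illustrative threshold, not the real one
    min_supporting_reads: int = 2,         # illustrative threshold, not the real one
) -> Dict[int, int]:
    """Return {repeat_count: number_of_supporting_reads} after filtering."""
    reads = list(reads)
    if not accuracy_filter:
        # Filter disabled: every observed repeat count is kept.
        return dict(Counter(r.repeat_count for r in reads))

    # Criteria 1 and 2: both flanks long enough, few flanking mismatches.
    passing = [
        r.repeat_count
        for r in reads
        if min(r.left_flank_len, r.right_flank_len) >= min_flank_len
        and r.flank_mismatch_rate <= max_flank_mismatch_rate
    ]
    # Criterion 3: keep only repeat counts with enough supporting reads.
    support = Counter(passing)
    return {rc: n for rc, n in support.items() if n >= min_supporting_reads}
```

Only the repeat counts that survive this step would then be passed on to the genotype computation.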
This commit also fixes minor bugs in
1) vntr_finder.py (line 410): genotyping VNTRs raised a fatal error when the sequence of a read was None.
2) pairwise_aln_generator.py (line 351): writing pairwise_aln files failed and left the files empty. The script reads the log file and assumes that, once all the relevant lines have been found, the remaining lines are only read sequences. When it tried to parse a read sequence from an unrelated (and unparsed) line, it failed. I added a condition to make sure the line in question is in fact a read sequence and not an unprocessed line.
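
The two guards described above could look roughly like this. It is a minimal sketch, not the actual change: the helper names are hypothetical, and it assumes a read sequence line consists solely of nucleotide characters (A, C, G, T, N), which may not match the real log format.

```python
import re
from typing import Iterable, List, Optional

# Assumption: a read sequence line contains only nucleotide characters.
_READ_SEQUENCE = re.compile(r"[ACGTNacgtn]+")


def is_read_sequence(line: str) -> bool:
    """True if the stripped line looks like a read sequence."""
    return _READ_SEQUENCE.fullmatch(line.strip()) is not None


def collect_read_sequences(lines: Iterable[str]) -> List[str]:
    """Keep only lines that look like read sequences; skip unrelated log lines."""
    return [line.strip() for line in lines if is_read_sequence(line)]


def usable_sequences(sequences: Iterable[Optional[str]]) -> List[str]:
    """Drop reads whose sequence is None instead of failing on them."""
    return [seq for seq in sequences if seq is not None]
```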