Closed by andreas-wilm 3 years ago
Our workaround is to simply filter these lines out at the end. The issue lies in the reference genome itself having non-ATCGN characters in a few places.
It looks like this issue can be handled by modifying plp.c to set any reference position that is not A, C, G, or T to N, i.e. just after this line:
ref_base = (ref && pos < ref_len)? ref[pos] : 'N';
Adding this immediately below that line fixes the issue. Is there any way this could be applied as a patch?
if (!(ref_base == 'A' || ref_base == 'C' || ref_base == 'T' || ref_base == 'G' || ref_base == 'N')) {
    ref_base = 'N';
}
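For anyone who wants to try the idea outside of plp.c, here is a minimal standalone sketch of the same check. The function names `sanitize_ref_base` and `ref_base_at` are hypothetical and do not exist in LoFreq; the second one only mirrors the lookup line quoted above, with `ref`, `ref_len` and `pos` assumed to be a plain character buffer, its length, and a 0-based index.

```c
#include <stddef.h>

/* Hypothetical helper (not part of LoFreq): keep A, C, G, T and N,
 * map every other character to 'N'. */
static char sanitize_ref_base(char base)
{
    switch (base) {
    case 'A': case 'C': case 'G': case 'T': case 'N':
        return base;
    default:
        return 'N';
    }
}

/* Hypothetical wrapper mirroring the quoted lookup: fall back to 'N'
 * when there is no reference or pos is out of range, then sanitize. */
static char ref_base_at(const char *ref, size_t ref_len, size_t pos)
{
    char base = (ref && pos < ref_len) ? ref[pos] : 'N';
    return sanitize_ref_base(base);
}
```

Like the patch as written, this only accepts uppercase bases, so a soft-masked lowercase base would also become N. I have not checked whether plp.c uppercases the reference before this point.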
Thank you for the PR
Doesn't make sense and furthermore produces non-ASCII output.
See bug reported by Kostiantyn Dreval and Ryan Morin in LoFreq 2 somatic