Open Martinsos opened 6 years ago
For degenerate nucleotide alignment, I'm doing the following in Python, which seems to work. Thanks for implementing the feature. I'm posting this in case it helps with tests, etc.
import edlib
# Create a list of (degenerate code, base) tuples that should count as matches
degenNuc = [("R", "A"), ("R", "G"),
            ("M", "A"), ("M", "C"),
            ("W", "A"), ("W", "T"),
            ("S", "C"), ("S", "G"),
            ("Y", "C"), ("Y", "T"),
            ("K", "G"), ("K", "T"),
            ("V", "A"), ("V", "C"), ("V", "G"),
            ("H", "A"), ("H", "C"), ("H", "T"),
            ("D", "A"), ("D", "G"), ("D", "T"),
            ("B", "C"), ("B", "G"), ("B", "T"),
            ("N", "G"), ("N", "A"), ("N", "T"), ("N", "C"),
            ("X", "G"), ("X", "A"), ("X", "T"), ("X", "C")]
FwdPrimer = 'AGTGARTCATCGAATCTTTG'
Seq1 = 'AGTGAGTCATCGAATCTTTG'
Seq2 = 'AGTGAATCATCGAATCTTTG'
Seq3 = 'AGTGACTCATCGAATCTTTG'
seq1_align = edlib.align(FwdPrimer, Seq1, mode="HW", k=2, additionalEqualities=degenNuc)
seq2_align = edlib.align(FwdPrimer, Seq2, mode="HW", k=2, additionalEqualities=degenNuc)
seq3_align = edlib.align(FwdPrimer, Seq3, mode="HW", k=2, additionalEqualities=degenNuc)
>>> print(seq1_align)
{'editDistance': 0, 'cigar': None, 'locations': [(None, 19)], 'alphabetLength': 5}
>>> print(seq2_align)
{'editDistance': 0, 'cigar': None, 'locations': [(None, 19)], 'alphabetLength': 5}
>>> print(seq3_align)
{'editDistance': 1, 'cigar': None, 'locations': [(None, 19)], 'alphabetLength': 5}
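As a possible shortcut, the same pair list could be generated from a code-to-bases mapping instead of being typed out by hand. This is only a sketch: the names `IUPAC` and `build_equalities` are made up for illustration, and "X" is included solely to mirror the list above.

```python
# Degenerate codes mapped to the bases they stand for.
# "X" is kept only because the hand-written list above includes it.
IUPAC = {
    "R": "AG", "M": "AC", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT",
    "N": "ACGT", "X": "ACGT",
}


def build_equalities(codes=IUPAC):
    """Expand {code: bases} into a flat list of (code, base) tuples."""
    return [(code, base) for code, bases in codes.items() for base in bases]


degen_nuc = build_equalities()
```

The resulting list has the same shape as `degenNuc`, so it should be usable as the `additionalEqualities` argument in the calls above.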
Awesome, thanks :).
I implemented custom equality but haven't had time to write full-blown random tests for it. I should upgrade the brute-force implementation so that it also supports custom equality, and then generate many random tests to check that custom equality works correctly.
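For illustration, a brute-force reference with custom equality might look roughly like the sketch below. It is not the project's actual brute-force code: the function names are hypothetical, it covers only the HW (infix) mode used above, and it assumes each equality pair is symmetric and not transitive.

```python
def make_equal(pairs):
    """Build a comparison function from extra (a, b) equality pairs.

    Assumption: each pair is treated as symmetric, and no transitive
    closure is applied.
    """
    extra = set()
    for a, b in pairs:
        extra.add((a, b))
        extra.add((b, a))
    return lambda a, b: a == b or (a, b) in extra


def brute_force_hw(query, target, pairs=()):
    """Infix (HW) edit distance via plain O(len(query) * len(target)) DP.

    Returns (distance, end_positions), with 0-based inclusive end
    positions in the target.
    """
    equal = make_equal(pairs)
    # Row 0 is all zeros: the alignment may start anywhere in the target.
    prev = [0] * (len(target) + 1)
    for i in range(1, len(query) + 1):
        curr = [i] + [0] * len(target)
        for j in range(1, len(target) + 1):
            cost = 0 if equal(query[i - 1], target[j - 1]) else 1
            curr[j] = min(prev[j - 1] + cost,  # match / mismatch
                          prev[j] + 1,         # query char left unmatched
                          curr[j - 1] + 1)     # target char left unmatched
        prev = curr
    # The alignment may also end anywhere in the target.
    best = min(prev)
    ends = [j - 1 for j in range(1, len(target) + 1) if prev[j] == best]
    return best, ends


# Small usage example with a single degenerate code.
distance, ends = brute_force_hw("ARG", "TTAGGTT", [("R", "A"), ("R", "G")])
```

A random test could then generate short query/target strings and a random pair list, and compare the distance from this function against the `editDistance` returned by `edlib.align(..., mode="HW", additionalEqualities=...)`.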