Closed tseemann closed 2 years ago
Hi Torsten,
IUPAC codes are treated as the set of nucleotides that they represent; the raw distance is calculated as the number of nucleotide differences per site, like so:
certainly different
-----------------------------------------
(certainly different + certainly the same)
A vs S: counts +1 in both the numerator and the denominator
A vs W: doesn't count anywhere
A vs A: +1 in the denominator only, obviously
W vs W: doesn't count anywhere
Deletions are ignored, as I believe is standard when calculating nucleotide distances. (In the code they are treated just like Ns.)
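For anyone who wants to check the counting rules above by hand, here is a minimal Python sketch. It is not the tool's actual code: the function names `site_counts` and `raw_distance` are made up for illustration, and it assumes that "certainly different" means the two base sets share nothing, that "certainly the same" means both sites are the same unambiguous base, and that a gap (`-`) behaves exactly like `N`.

```python
# Standard IUPAC nucleotide codes mapped to the bases they can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
    "-": "ACGT",  # assumption: deletions are handled like N
}
BASE_SETS = {code: frozenset(bases) for code, bases in IUPAC.items()}


def site_counts(seq1, seq2):
    """Return (certainly_different, certainly_same) for two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    different = same = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        # Unknown characters raise KeyError rather than being guessed at.
        a, b = BASE_SETS[x], BASE_SETS[y]
        if a.isdisjoint(b):
            different += 1      # no shared base, e.g. A vs S
        elif len(a) == 1 and a == b:
            same += 1           # identical unambiguous base, e.g. A vs A
        # anything else is ambiguous (A vs W, W vs W, N vs A) and is skipped
    return different, same


def raw_distance(seq1, seq2):
    """Differences per comparable site, or None if no site is comparable."""
    different, same = site_counts(seq1, seq2)
    comparable = different + same
    return different / comparable if comparable else None


# The four cases from this thread, one per column: A/S, A/W, A/A, W/W.
print(site_counts("AAAW", "SWAW"))
print(raw_distance("AAAW", "SWAW"))
```

Under these assumptions, N vs A, N vs R and A vs R all overlap without being certainly the same, so none of them would add to either the numerator or the denominator.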
I will add better documentation and probably some vignettes for usage, starting this week. This thing is being actively developed and I am very open to feature suggestions or pull requests, etc., too.
The closest command measures "raw genetic distance". I assume this ignores deletions?
How does it count IUPAC codes? e.g. N vs A, or N vs R, or A vs R?
Any help appreciated.