The k-mer Weighted Inner Product.
This software implements a de novo, alignment-free measure of genetic dissimilarity between samples, which operates directly on raw sequencing reads. It calculates the genetic dissimilarity between samples without a reference genome and without assembling one.
Full documentation is available at https://kwip.readthedocs.org
kWIP works by decomposing sequencing reads into short k-mers, hashing these k-mers, and performing pairwise distance calculations between the k-mer hashes of each sample. We use khmer from the DIB lab, UC Davis, to hash sequencing reads. kWIP calculates the distance between samples in a computationally efficient manner, and generates a distance matrix which may be used by downstream tools. The power of kWIP comes from the weighting applied across different hash values, which decreases the effect of erroneous, rare or over-abundant k-mers while focusing on k-mers which give the most insight into the similarity of samples.
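The steps described above (k-mer decomposition, hashing, weighting, and pairwise distance) can be illustrated with a minimal Python sketch. This is not kWIP's implementation: it uses `zlib.crc32` as a stand-in for khmer's hashing, and the weighting shown (the binary entropy of the fraction of samples in which a hash bin is occupied) is an assumption chosen for illustration because it gives zero weight to k-mers found in every sample or in none. All function names and the `table_size` parameter are hypothetical.

```python
import math
import zlib
from collections import Counter


def kmer_counts(reads, k, table_size=2**20):
    """Count k-mers of each read in a fixed number of hash bins."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # crc32 is only a stand-in hash for this sketch.
            counts[zlib.crc32(kmer.encode()) % table_size] += 1
    return counts


def entropy_weights(samples):
    """Assumed weighting: binary entropy of each bin's presence frequency."""
    n = len(samples)
    present = Counter()
    for counts in samples:
        present.update(counts.keys())  # one increment per sample per bin
    weights = {}
    for bin_id, n_present in present.items():
        p = n_present / n
        if 0 < p < 1:
            weights[bin_id] = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
        else:
            weights[bin_id] = 0.0  # present in every sample: uninformative
    return weights


def weighted_inner_product(a, b, weights):
    """Sum of count products across bins, scaled by each bin's weight."""
    return sum(a[bin_id] * b[bin_id] * w for bin_id, w in weights.items())


def distance_matrix(samples):
    """Pairwise distances from the normalised weighted inner product."""
    weights = entropy_weights(samples)
    n = len(samples)
    kernel = [[weighted_inner_product(samples[i], samples[j], weights)
               for j in range(n)] for i in range(n)]
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            denom = math.sqrt(kernel[i][i] * kernel[j][j])
            similarity = kernel[i][j] / denom if denom > 0 else 0.0
            dist[i][j] = math.sqrt(max(0.0, 2.0 - 2.0 * similarity))
    return dist


# Toy input: samples 0 and 1 have identical reads, sample 2 differs.
toy_samples = [
    kmer_counts(["ACGTACGT"], k=4),
    kmer_counts(["ACGTACGT"], k=4),
    kmer_counts(["TTTTGGGG"], k=4),
]
toy_dist = distance_matrix(toy_samples)
```

In this toy input the two identical samples should be at (near) zero distance from each other, while the third sample should be farther from both. A real analysis would involve millions of reads per sample, which is why fixed-size hash tables are used rather than exact k-mer dictionaries.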
kWIP is Copyright 2015 Kevin Murray, and released under the GNU General Public License version 3 (or any later version).
A paper describing kWIP has been published in PLOS Computational Biology.