De novo estimates of genetic relatedness from next-gen sequencing data
https://kwip.readthedocs.org
GNU General Public License v3.0

kWIP

The k-mer Weighted Inner Product.

This software implements a de novo, alignment-free measure of genetic dissimilarity between samples, which operates directly on raw sequencing reads. It can calculate the genetic dissimilarity between samples without a reference genome, and without assembling one.

Full documentation is available at https://kwip.readthedocs.org

Installation

See the kWIP documentation at https://kwip.readthedocs.org for installation instructions.

How it works

kWIP works by decomposing sequencing reads into short k-mers, hashing these k-mers, and calculating pairwise distances between the resulting per-sample k-mer hashes. We use khmer from the DIB lab, UC Davis to hash sequencing reads. kWIP calculates the distance between samples in a computationally efficient manner and generates a distance matrix which may be used by downstream tools. The power of kWIP comes from the weighting applied across different hash values, which decreases the effect of erroneous, rare or over-abundant k-mers while focusing on k-mers which give the most insight into the similarity of samples.
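
The steps above can be illustrated with a minimal Python sketch. It is not kWIP's implementation and does not use khmer: the function names are hypothetical, `zlib.crc32` stands in for the real k-mer hash, and the weight (the Shannon entropy of the fraction of samples in which a hash bin is occupied) is an assumption chosen because it gives zero weight to bins present in every sample or in none, and low weight to bins present in very few or nearly all samples.

```python
import math
import zlib
from collections import Counter


def kmer_hash_counts(reads, k=4, n_bins=1024):
    """Count the k-mers of every read into a fixed number of hash bins.

    crc32 is only a stand-in hash for illustration.
    """
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            counts[zlib.crc32(kmer.encode()) % n_bins] += 1
    return counts


def bin_weights(samples, n_bins):
    """Assumed weighting: entropy of each bin's occupancy across samples."""
    n = len(samples)
    weights = [0.0] * n_bins
    for b in range(n_bins):
        f = sum(1 for s in samples if s[b] > 0) / n
        if 0.0 < f < 1.0:  # bins in all or no samples carry no information
            weights[b] = -(f * math.log2(f) + (1 - f) * math.log2(1 - f))
    return weights


def weighted_inner_product(a, b, weights):
    """Sum of weight * count_a * count_b over bins occupied in both samples."""
    return sum(weights[h] * a[h] * b[h] for h in a.keys() & b.keys())


def distance_matrix(samples, n_bins):
    """Pairwise distances derived from the normalised weighted inner product."""
    w = bin_weights(samples, n_bins)
    n = len(samples)
    self_k = [weighted_inner_product(s, s, w) for s in samples]
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            denom = math.sqrt(self_k[i] * self_k[j])
            k_ij = weighted_inner_product(samples[i], samples[j], w)
            sim = k_ij / denom if denom > 0 else 0.0
            # Clamp at zero to absorb floating-point rounding.
            d = math.sqrt(max(0.0, 2.0 - 2.0 * sim))
            dist[i][j] = dist[j][i] = d
    return dist


if __name__ == "__main__":
    # Toy reads, invented for illustration only.
    reads_per_sample = [
        ["ACGTACGTACGT"],
        ["ACGTACGTACGA"],
        ["TTTTGGGGCCCC"],
    ]
    samples = [kmer_hash_counts(r) for r in reads_per_sample]
    for row in distance_matrix(samples, n_bins=1024):
        print(" ".join(f"{d:.3f}" for d in row))
```

The result is a symmetric matrix with a zero diagonal, in which samples sharing more informative hash bins sit closer together. In this sketch the distance between two samples ranges from 0 (identical weighted profiles) to the square root of 2 (no informative bins in common).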

License

kWIP is Copyright 2015 Kevin Murray, and released under the GNU General Public License version 3 (or any later version).

Publication

A paper describing kWIP has been published in PLOS Computational Biology.