kWIP

The k-mer Weighted Inner Product.

This software implements a de novo, alignment-free measure of genetic dissimilarity between samples that operates directly on raw sequencing reads. It calculates the dissimilarity between samples without requiring a reference genome and without assembling one.

Full documentation is available at https://kwip.readthedocs.org

Installation

See the kWIP Documentation

How it works

kWIP works by decomposing sequencing reads into short k-mers, hashing these k-mers, and calculating pairwise distances between the resulting per-sample k-mer hash counts. We use khmer from the DIB lab, UC Davis to hash sequencing reads. kWIP calculates the distance between samples in a computationally efficient manner and generates a distance matrix which may be used by downstream tools. The power of kWIP comes from the weighting applied across different hash values, which decreases the effect of erroneous, rare or over-abundant k-mers while focusing on the k-mers which give the most insight into the similarity of samples.
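
The steps above can be illustrated with a minimal Python sketch. It is not kWIP's implementation: it uses `zlib.crc32` in place of khmer for hashing, a small fixed number of hash bins, and assumes an entropy-based weight computed from the fraction of samples in which each bin is occupied. The function names (`count_kmers`, `entropy_weights`, `distance_matrix`) and the normalisation to a distance are illustrative assumptions only.

```python
import math
import zlib


def count_kmers(reads, k, n_bins=1024):
    """Hash every k-mer of every read into a fixed-size count vector."""
    counts = [0] * n_bins
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # crc32 stands in for khmer's hashing (assumption for this sketch).
            counts[zlib.crc32(kmer.encode("ascii")) % n_bins] += 1
    return counts


def entropy_weights(sample_counts):
    """Assumed weighting: Shannon entropy of each bin's presence frequency.

    Bins seen in no sample or in every sample get weight 0; bins seen in
    about half of the samples get the highest weight.
    """
    n_samples = len(sample_counts)
    weights = []
    for b in range(len(sample_counts[0])):
        p = sum(1 for c in sample_counts if c[b] > 0) / n_samples
        if p == 0.0 or p == 1.0:
            weights.append(0.0)
        else:
            weights.append(-(p * math.log2(p) + (1 - p) * math.log2(1 - p)))
    return weights


def weighted_inner_product(a, b, w):
    """Inner product of two count vectors, weighted per hash bin."""
    return sum(wi * ai * bi for wi, ai, bi in zip(w, a, b))


def distance_matrix(sample_counts):
    """Pairwise distances from the normalised weighted inner product."""
    w = entropy_weights(sample_counts)
    n = len(sample_counts)
    self_k = [weighted_inner_product(c, c, w) for c in sample_counts]
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            denom = math.sqrt(self_k[i] * self_k[j])
            k_ij = weighted_inner_product(sample_counts[i], sample_counts[j], w)
            sim = k_ij / denom if denom > 0 else 0.0
            # Clamp at 0 to guard against floating-point round-off.
            dist[i][j] = dist[j][i] = math.sqrt(max(0.0, 2.0 - 2.0 * sim))
    return dist


# Toy input: two identical samples and one unrelated sample.
samples = [["ACGTACGT"], ["ACGTACGT"], ["TTTTGGGG"]]
counts = [count_kmers(reads, k=4) for reads in samples]
matrix = distance_matrix(counts)
```

In this toy input the first two samples are identical, so their distance is approximately zero, while the third sample shares essentially no k-mers with them and sits much further away. Real data would use far longer k-mers and far larger hash tables than this sketch does.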

License

Publication

A paper describing kWIP has been published in PLOS Computational Biology.
