kevlar-dev / kevlar

Reference-free variant discovery in large eukaryotic genomes
https://kevlar.readthedocs.io
MIT License
40 stars, 9 forks

Dynamic error rate fix #363

Closed: closed by standage 5 years ago

standage commented 5 years ago

While evaluating whether kevlar correctly calculates the probability of a k-mer abundance conditioned on a 0/0 genotype, we reformulated the calculation. We now have the following.

[Equation image: probability of k-mer abundance conditioned on a 0/0 genotype]

Here, mu is the mean coverage, epsilon is the sequencing error rate, ka is the observed k-mer abundance (alternate allele), and r is the copy number of the corresponding reference allele in the reference genome. In log space the calculation is:

[Equation image: log of probability of k-mer abundance conditioned on a 0/0 genotype]
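
To illustrate why the log-space form is convenient, here is a minimal Python sketch. The equations themselves are not reproduced in this text, so the functional form used here, P(ka | 0/0) = (mu * epsilon * r)^ka, is an assumption made purely for illustration and may not match kevlar's actual calculation. The function name `log_prob_abund_homref` and the default values for `mu` and `epsilon` are likewise hypothetical.

```python
import math


def log_prob_abund_homref(ka, mu=30.0, epsilon=0.001, r=1):
    """Hypothetical helper: log P(ka | 0/0).

    ASSUMPTION: P(ka | 0/0) = (mu * epsilon * r) ** ka. This form is
    illustrative only and is not confirmed as kevlar's formula.
    """
    if ka < 0:
        raise ValueError("k-mer abundance must be non-negative")
    # Assumed expected rate of erroneous copies of the k-mer.
    rate = mu * epsilon * r
    if rate <= 0:
        raise ValueError("mu, epsilon, and r must all be positive")
    # log(rate ** ka) == ka * log(rate); the product form stays finite
    # even when rate ** ka underflows to 0.0 in floating point.
    return ka * math.log(rate)


if __name__ == "__main__":
    for ka in (0, 1, 5, 300):
        direct = (30.0 * 0.001 * 1) ** ka  # underflows for large ka
        print(ka, direct, log_prob_abund_homref(ka))
```

Under this assumed form, a larger reference copy number `r` raises the log probability of seeing the k-mer by error alone, and working in log space keeps the value usable when per-k-mer terms are summed into an overall likelihood score.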

This formulation changed many of the probabilities in the test suite, though most of the changes were slight. Only one change was substantial, and it did not affect the overall likelihood score. Tests on simulated data are pending.

codecov-io commented 5 years ago

Codecov Report

Merging #363 into master will increase coverage by 0.04%. The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #363      +/-   ##
==========================================
+ Coverage   96.79%   96.83%   +0.04%     
==========================================
  Files          48       48              
  Lines        2899     2902       +3     
  Branches      537      537              
==========================================
+ Hits         2806     2810       +4     
  Misses         58       58              
+ Partials       35       34       -1
Impacted Files          Coverage Δ
kevlar/cli/simlike.py   100% <ø> (ø) :arrow_up:
kevlar/simlike.py       99.12% <100%> (+0.46%) :arrow_up:

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 0efd11c...e6b5f9c.