tlnagy / Crispulator.jl

✂️ Pooled CRISPR screen optimization tool
Other
19 stars 6 forks source link

Add additional quality metrics #25

Closed tlnagy closed 8 years ago

tlnagy commented 8 years ago

Evaluating model performance solely based on AUROCs might not be ideal because they aren't well known outside of the biostats/ML community and our simulated data (like real life examples) are skewed towards having more negatives.

martinkampmann commented 8 years ago

Hm, I haven't thought about precision-recall that way - that sounds interesting to try. However between the two the more intuitive "venn" overlap I think may be the higher priority.

tlnagy commented 8 years ago

Relevant literature on PR curves:

tlnagy commented 8 years ago

The second paper discusses several estimators for the AUPRC and identify the following point estimators as being the most robust

I decided to implement the first one and hacked up a version of their Figure 3:

include("load.jl")
using Distributions
using Gadfly
π = 0.1
test_dists = Array[
     [Normal(0, 1), Normal(1,1)],
     [Beta(2, 5), Beta(5, 2)],
     [Uniform(0, 1), Uniform(0.5, 1.5)]
]
# true precision, recall functions
recall(xs, Y) = 1-cdf(Y, xs)
precision(xs, π, X, Y) = π*recall(xs, Y)./(π*recall(xs, Y) + (1-π)*(1-cdf(X, xs)))

xs = linspace(-10, 10, 1000)
names = ["binormal", "bibeta", "offset uniform"]

plots = Plot[]
for (name, dists) in zip(names, test_dists)
    X = dists[1]
    Y = dists[2]
    classes = [:b, :a]

    layers = []
    push!(layers, layer(x=recall(xs, Y), y=precision(xs, π, X, Y), 
    Geom.line, Theme(line_width=2pt)))
    for i in 1:10
        cat = rand(Categorical([1-π, π]), 500)
        scores = map(rand, dists[cat])    
        _auprc, p, r = auprc(scores, classes[cat], Set([:a]))
        push!(layers, layer(x=r, y=p,Geom.line, 
        Theme(default_color=colorant"#cccccc", highlight_width=0pt)))
    end
    push!(plots, plot(layers..., Coord.cartesian(fixed=true), 
    Guide.ylabel("precision"), Guide.xlabel("recall"),
    Guide.title(name), Guide.yticks(ticks=[0.0, 0.5, 1.0])))
end
draw(SVG(30cm, 10cm), hstack(plots))

test_prc

this shows my sampled PR curves (gray) and the true PR curve (blue). This doesn't test my AUPRC implementation, but I'm going to measure the bias in my implementation of the lower trapezoid estimator like in Figure 4 from the paper