JuliaTrustworthyAI / ConformalPrediction.jl

Predictive Uncertainty Quantification through Conformal Prediction for Machine Learning models trained in MLJ.
https://juliatrustworthyai.github.io/ConformalPrediction.jl/
MIT License

[Feature request] Conformal Predictive Distributions #2

Open azev77 opened 1 year ago

azev77 commented 1 year ago

Hi & thanks for this package. I've been waiting for a package for conformal prediction...

Here is some sample code from my test drive which may or may not be useful for docs:

using Pkg
Pkg.add(["MLJ", "EvoTrees", "Plots"])
Pkg.add(url="https://github.com/pat-alt/ConformalPrediction.jl")
using MLJ, EvoTrees, ConformalPrediction, Plots, Random;
########################################
rng=MersenneTwister(49); #rng=Random.GLOBAL_RNG;
n= 100_000; p=7; σ=0.1;
X = [ones(n) randn(rng, n, p-1)]
θ = randn(rng, p)
y = X * θ .+ σ .* randn(rng, n)
train, calibration, test = partition(eachindex(y), 0.4, 0.4)
########################################
EvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees
model = EvoTreeRegressor() 
mach = machine(model, X, y)
fit!(mach, rows=train)
pr_y = predict(mach, rows=test)
########################################
conf_mach = conformal_machine(mach)
calibrate!(conf_mach, selectrows(X, calibration), y[calibration])
pr = predict(conf_mach, X[test,:]; coverage=0.95)

pr_lower = [pr[j][1][2][] for j in 1:length(test)]
pr_upper = [pr[j][2][2][] for j in 1:length(test)]

###########################################
plot()
plot!(y[test], lab="y test")
plot!(pr_y, lab="y prediction")
plot!(pr_lower, lab = "y 95% prediction lower bound")
plot!(pr_upper, lab = "y 95% prediction upper bound")

mean(pr_lower .<= y[test] .<= pr_upper)

azev77 commented 1 year ago

Question: how do I plot the predicted distribution of y at a given x?

#AZ: recover the predicted distribution? 
xt = [X[test[1],:] ;;]'
c_grid = .01:.01:0.99 
LB = Float64[]; UB = Float64[];   # typed vectors instead of Vector{Any}
for j in eachindex(c_grid)
    pr = predict(conf_mach, xt; coverage=c_grid[j] )
    push!(LB, pr[1][1][2][])
    push!(UB, pr[1][2][2][])
end
plot(legend=:topleft)
plot!(LB, 1.0 .- c_grid, lab="LB at %-ile")
plot!(UB, 1.0 .- c_grid, lab="UB at %-ile")
plot!([pr_y[1]], seriestype = :vline, lab="y prediction point estimate", color="red") 

[image]

To be clear, I'm fairly confident that what I plotted above is not the predicted density of y given x. My question is how to recover it...

pat-alt commented 1 year ago

Hi @azev77! Great to see you've already played around with the package. I understand what you have in mind, and that would certainly be a nice feature to add. It can apparently be done, as demonstrated in this paper by @valeman and co-authors, but the package does not support it yet. For now, all you can really produce is prediction intervals. Adding support for this would be nice, but it looks too involved for me to do any time soon. Here's a corresponding tutorial if you want to have a go at it yourself. Or perhaps I'm overthinking this and others know of a straightforward way to do what you have in mind.

What you have plotted is the user-chosen error rate $\alpha$ as a function of $\hat{y}$ as far as I can tell. I'm not quite sure what to make of this right now, but it is definitely not the predictive posterior $\hat{f}(y|x)$.
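
To illustrate why sweeping `coverage` only traces out nested intervals, here is a minimal sketch in plain Julia. It assumes a split conformal regressor with absolute-residual scores, which may not match what the package does internally. The names `split_conformal_interval`, `scores` and `yhat` are hypothetical, and the residuals are simulated stand-ins rather than output from a fitted model.

```julia
using Random

# Hypothetical helper (not part of ConformalPrediction.jl): symmetric split
# conformal interval around a point prediction `yhat`, given absolute
# calibration residuals `scores` = |y_i - yhat_i|.
function split_conformal_interval(yhat::Real, scores::AbstractVector{<:Real}, coverage::Real)
    m = length(scores)
    # index of the order statistic with the usual finite-sample correction
    k = ceil(Int, (m + 1) * coverage)
    # too few calibration points for this coverage: the interval is unbounded
    k > m && return (-Inf, Inf)
    q = sort(scores)[k]
    return (yhat - q, yhat + q)
end

rng = MersenneTwister(49)
scores = abs.(0.1 .* randn(rng, 1_000))   # stand-in calibration residuals
yhat = 2.0                                 # stand-in point prediction

for c in (0.5, 0.9, 0.95)
    lower, upper = split_conformal_interval(yhat, scores, c)
    println("coverage = $c: [$(round(lower, digits=3)), $(round(upper, digits=3))]")
end
```

Under these assumptions, each coverage level only moves the two bounds symmetrically around the point prediction, so plotting the bounds against `1 - coverage` shows how the interval widens, not how probability mass is distributed over y.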

azev77 commented 1 year ago

I had a look at the slides:

using MLJ, EvoTrees, ConformalPrediction, Plots, Random, MLJLinearModels, Tables;
########################################
n= 100_000; p=70; σ=100.10;
rng = MersenneTwister(49)   # one shared RNG, so X, θ and the noise are independent draws
X = [ones(n) randn(rng, n, p-1)]
θ = randn(rng, p)
CEF   = X*θ 
Noise = σ*randn(rng, n)
y = CEF + Noise
train, calibration, test = partition(eachindex(y), 0.4, 0.4)
########################################
LinearRegressor = @load LinearRegressor pkg=MLJLinearModels
model = LinearRegressor(fit_intercept = false) 
mach = machine(model, Tables.table(X), y)
fit!(mach, rows=train)
pr_y = predict(mach, rows=test)
########################################
conf_mach = conformal_machine(mach)
calibrate!(conf_mach, selectrows(X, calibration), y[calibration])
pr = predict(conf_mach, X[test,:]; coverage=0.95)
pr_lower = [pr[j][1][2][] for j in 1:length(test)]
pr_upper = [pr[j][2][2][] for j in 1:length(test)]
mean(pr_lower .<= y[test] .<= pr_upper)   # 0.94975
###########################################
# recover the predicted distribution
xt = [X[test[1],:] ;;]'
c_grid = .01:.001:0.99 
LB = Float64[]; UB = Float64[];   # typed vectors instead of Vector{Any}
for j in eachindex(c_grid)
    pr = predict(conf_mach, xt; coverage=c_grid[j] )
    push!(LB, pr[1][1][2][])
    push!(UB, pr[1][2][2][])
end
plot(legend=:topleft)
plot!(LB, (1.0 .- c_grid)/2.0, lab="LB, quantile")
plot!(UB, (1.0 .+ c_grid) ./ 2.0, lab="UB, quantile")   # upper bound at coverage c is the (1+c)/2 quantile
plot!([pr_y[1]], seriestype = :vline, lab="y prediction, median", color="red") 

This gives the ECDF (centered at the median):

[image]

Shouldn't the "density" be the derivative of the ECDF?
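
As a rough check of that idea, a density can be approximated by differencing a CDF evaluated on a grid. The sketch below is plain Julia with a hypothetical helper `density_from_cdf`. It uses a smooth logistic CDF as a stand-in, because a raw ECDF is a step function and would typically need smoothing or a coarser grid before differencing.

```julia
# Hypothetical helper: finite-difference approximation of a density from
# CDF values `Fs` evaluated at strictly increasing grid points `ys`.
function density_from_cdf(ys::AbstractVector{<:Real}, Fs::AbstractVector{<:Real})
    length(ys) == length(Fs) || throw(ArgumentError("ys and Fs must have equal length"))
    mids = (ys[1:end-1] .+ ys[2:end]) ./ 2   # midpoints where the slope is evaluated
    dens = diff(Fs) ./ diff(ys)              # slope of the CDF between grid points
    return mids, dens
end

# Stand-in CDF: standard logistic, F(y) = 1 / (1 + exp(-y))
ys = collect(-6.0:0.01:6.0)
Fs = 1 ./ (1 .+ exp.(-ys))
mids, dens = density_from_cdf(ys, Fs)

# Exact logistic density at the midpoints, f = F * (1 - F), for comparison
Fmid = 1 ./ (1 .+ exp.(-mids))
exact = Fmid .* (1 .- Fmid)
println("max abs error: ", maximum(abs.(dens .- exact)))
```

With the quantile curve from the plot above, `ys` would be the sorted bounds and `Fs` the matching quantile levels. Repeated bounds make `diff(ys)` zero, so ties would have to be merged first.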

pat-alt commented 1 year ago

Thanks @azev77 - just linking the related thread on discourse here for info.

valeman commented 1 year ago

Here is a more relevant paper that deals with any underlying regressor: https://proceedings.mlr.press/v91/vovk18a.html There is also a toy Python package that implements it: https://pypi.org/project/pysloth/
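
For readers who want to experiment before the package supports this, here is a simplified sketch of a split conformal predictive distribution in plain Julia. It assumes the conformity score is the signed residual y - ŷ(x) from a held-out calibration set. `split_cpd` and `tau` are hypothetical names, the residuals are simulated, and this is not the API of ConformalPrediction.jl or pysloth.

```julia
using Random

# Hypothetical helper: returns a function Q(y) approximating P(Y <= y | x)
# from a point prediction `yhat` and signed calibration residuals
# r_i = y_i - yhat_i. `tau` in [0, 1] is the tie-breaking/randomisation term.
function split_cpd(yhat::Real, residuals::AbstractVector{<:Real}; tau::Real = 0.5)
    m = length(residuals)
    c = sort(yhat .+ residuals)   # calibration residuals shifted to the new prediction
    function Q(y::Real)
        below = searchsortedfirst(c, y) - 1       # number of c_i strictly below y
        ties = searchsortedlast(c, y) - below     # number of c_i equal to y
        return (below + tau * (ties + 1)) / (m + 1)
    end
    return Q
end

rng = MersenneTwister(49)
residuals = 0.1 .* randn(rng, 2_000)   # stand-in for y[calibration] .- predictions
Q = split_cpd(2.0, residuals)

for y in (1.8, 2.0, 2.2)
    println("Q($y) = ", round(Q(y), digits=3))
end
```

In the setup from this thread, `residuals` would presumably be computed on the calibration rows, for example as `y[calibration] .- predict(mach, rows=calibration)`, and `yhat` would be the point prediction for the test row of interest.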