nicholas-denis / ppi-testing


Experiment 1: data drift between model training distribution and PPI inference distribution #2

Open nicholas-denis opened 1 month ago

nicholas-denis commented 1 month ago

Please see the repo wiki for notation and other information.

In this experiment the labeled and unlabeled PPI distributions will be equal (P_ell = P_u), but NOT equal to the initial distribution used to generate the training data for f.

Train the ML model on data drawn from one distribution (e.g., one gamma distribution), then do PPI with P_u = P_ell set to some other gamma distribution.

Look at how PPI performs as a function of how "far" the two distributions differ.

We want several initial distributions (e.g., different gamma distributions) and, for each initial distribution, on the order of 100 different PPI distributions P_ell = P_u.

Run three sets of experiments (see the sketch below for the first):

1. x is univariate and y = f(x) for some simple linear function f.
2. x is a vector, say 10-dimensional, and y = f(x) for some simple linear function f.
3. x is a vector and y = f(x) for a non-linear function f.
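As a concrete starting point, here is a minimal sketch of the univariate linear case, assuming the classical PPI mean estimator theta_PP = mean(f_hat(X_u)) + mean(Y_ell - f_hat(X_ell)). The gamma parameters, sample sizes, and the specific linear f below are placeholder choices, not values fixed by this issue:

```python
import numpy as np

rng = np.random.default_rng(0)

# 1. Training distribution: X_train ~ Gamma(shape=2, scale=1), y = f(x) = 3x + 1.
def f_true(x):
    return 3.0 * x + 1.0

x_train = rng.gamma(shape=2.0, scale=1.0, size=1_000)
y_train = f_true(x_train)

# 2. Fit the ML model f_hat on the training data (ordinary least squares).
coef = np.polyfit(x_train, y_train, deg=1)
def f_hat(x):
    return np.polyval(coef, x)

# 3. PPI data: P_ell = P_u = a *different* gamma distribution.
x_ell = rng.gamma(shape=5.0, scale=0.5, size=100)    # small labeled set
y_ell = f_true(x_ell)                                # gold labels
x_u = rng.gamma(shape=5.0, scale=0.5, size=10_000)   # large unlabeled set

# 4. PPI point estimate of E[Y] under P_u: model mean plus rectifier.
theta_pp = f_hat(x_u).mean() + (y_ell - f_hat(x_ell)).mean()
theta_classical = y_ell.mean()  # labeled-data-only baseline
print(f"PPI: {theta_pp:.3f}, classical: {theta_classical:.3f}")
```

Sweeping the PPI gamma parameters away from the training parameters then traces out PPI error as a function of distributional distance, matching the experiment design above.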

Aspiire commented 1 month ago

Informative plots have mostly been completed; however, I have not yet settled on the right measure of difference between the data distributions. The current candidates are TV distance and Wasserstein distance. Computing either should not be too numerically demanding.
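For reference, both candidates are cheap in the univariate gamma case. A sketch assuming SciPy is available; the gamma parameters and the integration grid are placeholders:

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.stats import gamma, wasserstein_distance

p = gamma(a=2.0, scale=1.0)   # e.g., the training distribution
q = gamma(a=5.0, scale=0.5)   # e.g., the PPI distribution (P_ell = P_u)

# 1-Wasserstein distance, estimated from large samples.
w1 = wasserstein_distance(p.rvs(size=100_000, random_state=0),
                          q.rvs(size=100_000, random_state=1))

# Total variation distance: (1/2) * integral of |p(x) - q(x)| dx,
# approximated by quadrature on a grid covering both supports.
xs = np.linspace(0.0, 30.0, 100_000)
tv = 0.5 * trapezoid(np.abs(p.pdf(xs) - q.pdf(xs)), xs)

print(f"W1 ~ {w1:.4f}, TV ~ {tv:.4f}")
```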