xKDR / CRRao.jl

MIT License

Testing strategy, some thoughts #62

Closed smishr closed 1 year ago

smishr commented 2 years ago
  1. We need a strategy for testing.
  2. There are 4 models (OLS, logistic regression, Poisson regression, negative binomial regression). Each has exactly one frequentist version, plus multiple possible Bayesian versions, one per choice of prior.
  3. Let's start at the simplest place: for the OLS frequentist case, we will just run all 11 tests that are under development in the NISTtests package.
  4. For the 3 other frequentist models, we should make at least four test cases each, run them in R, and record the parameter estimates, which should be good to 8 places.
  5. That leaves the Bayesian estimation. Imagine that for each of the four models we write four distinct Bayesian priors. This gives 16 cases. We write each of these in R-stan, running 100k MCMC steps with ample burn-in. Once this is done, we should end up with answers that match to 3 or 4 places. We will manually copy the parameter estimates out of R-stan, and these become the Julia test cases.
  6. Putting all this together, we get (11 + (3 x 4)) + (4 x 4) = 39 test cases.
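
To make the tolerances and the count above concrete, here is a minimal Python sketch. It assumes "good to N places" means agreement to N decimal places, which is one possible reading. The helper name `agrees_to_places` and the numbers passed to it are hypothetical and are not taken from any actual R or Stan run.

```python
def agrees_to_places(estimate: float, reference: float, places: int) -> bool:
    """Return True if the two values differ by less than half a unit
    in the given decimal place (assumed meaning of 'good to N places')."""
    return abs(estimate - reference) < 0.5 * 10.0 ** (-places)


# Test-case count from the plan above.
ols_frequentist = 11        # NIST tests for the OLS frequentist case
other_frequentist = 3 * 4   # 3 remaining models x 4 R-based cases each
bayesian = 4 * 4            # 4 models x 4 priors
total_cases = ols_frequentist + other_frequentist + bayesian

# Hypothetical values, for illustration only.
frequentist_ok = agrees_to_places(1.234567891, 1.234567893, places=8)
bayesian_ok = agrees_to_places(0.5012, 0.5010, places=3)

print(total_cases, frequentist_ok, bayesian_ok)
```

The looser Bayesian tolerance reflects Monte Carlo error: two MCMC runs of the same model will not agree to 8 places even when both are correct.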
sourish-cmi commented 1 year ago

Frequentist Test

All the tests from GLM.jl should be adopted here for CRRao.jl.

| Models | Completion |
|---|---|
| Linear Regression | 100% |
| Logistic Regression | 100% |
| Poisson Regression | 100% |
| NegBinom Regression | 100% |

Bayesian Test

| Models | Gauss Prior | Ridge Prior | Laplace Prior | T Prior | Cauchy Prior |
|---|---|---|---|---|---|
| Linear Regression | | | | | |
| Logistic Regression | | | | | |
| Poisson Regression | | | | | |
| NegBinom Regression | | | | | |
sourish-cmi commented 1 year ago

Bayesian Testing Strategy

The purpose of Bayesian MCMC-based inference is to simulate from the posterior distribution.

Suppose we want to simulate from the posterior of a linear regression with a Gaussian prior.

1) We implement the model in Stan, simulate 10,000 samples using rstan or pystan, and save them as a CSV file.

2) We implement the same model in Turing and simulate 10,000 samples using CRRao and Turing.

3) We run a two-sample Kolmogorov-Smirnov test between the CRRao sample and the rstan sample. If the test fails to reject the null hypothesis that both samples come from the same distribution, we take that as evidence that CRRao is doing the correct job.
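
The comparison in step 3 can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function `ks_two_sample` is a hypothetical helper written here, not part of CRRao, Turing or Stan; it uses the asymptotic Kolmogorov distribution for the p-value; and the two arrays are synthetic Gaussian draws standing in for the posterior draws that would be read from the Stan CSV file and from CRRao.

```python
import math
import random
from bisect import bisect_right


def ks_two_sample(x, y):
    """Two-sample Kolmogorov-Smirnov test.

    Returns (D, p), where D is the largest gap between the two empirical
    CDFs and p is the asymptotic p-value from the Kolmogorov distribution.
    """
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    d = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # is attained at one of the observed points.
    for v in xs + ys:
        gap = abs(bisect_right(xs, v) / n - bisect_right(ys, v) / m)
        d = max(d, gap)
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        # The series below converges slowly here; the true value is ~1.
        return d, 1.0
    # Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lam^2)
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


# Synthetic stand-ins for one parameter's posterior draws (assumption:
# in practice these come from the Stan CSV file and from CRRao).
rng = random.Random(2023)
stan_draws = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
crrao_draws = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

d_stat, p_value = ks_two_sample(stan_draws, crrao_draws)
print(f"D = {d_stat:.4f}, p = {p_value:.4f}")
```

The test would be run once per parameter. The Kolmogorov-Smirnov test assumes independent draws, while MCMC draws are usually autocorrelated, so thinning the chains before the comparison may be needed to keep the p-values meaningful.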

@ajaynshah @mousum-github