Closed: ngoodman closed this issue 7 years ago
I can implement and test the lift
operation, and someone else could do the example models.
I can do this (both, or just the example); it's in line with the other things I'm doing.
For Bayesian regression, the benchmark will be from the Stan ADVI paper: https://arxiv.org/abs/1506.03431
They simulate their own data for the regression example, which isn't in the paper or the repo. I'm going to try to get my hands on it; otherwise I'll use a different paper.
Update: finding a replicable benchmark is turning out to be less trivial than I foresaw. Essentially I need something that has all three things: data, model, and results.
0) Stan paper: no data; I contacted them for the data (synthesized by them).
1) Drugowitsch's paper has data and a model, but no real results; they run it on a toy example to show that it "works".
2) Edward has a regression example with a dataset I can access, but they don't report posterior results (they use it as a speed-test comparison).
3) Gelman's Bayesian Data Analysis cites a dataset that I was able to get my hands on, but uses an improper distribution for alpha. After fiddling and discussing with @martinjankowiak, I'll need to sample alpha from a truncated distribution to run VI on it.
4) [edit] There's a hierarchical logistic regression in the Stan paper that uses the same data as above.
I will continue hunting for a published result that has all three of what I need, but in the meantime I was thinking of doing (2) by modifying their code to use klqp and comparing that with ours. It would give us confidence that we have something correct, though not officially in a paper. [edit:] For a published result I can do (4) and compare log predictives against theirs. @ngoodman thoughts?
never simple, eh? :)
I think it's acceptable for the performance benchmark to be an available system, rather than a result reported in a paper. So this would suggest taking a model+data example (or a couple) and comparing the log-predictive and run time to Stan and/or Edward. Using simple examples from the book (e.g. rats) seems like a good idea?
(In some sense I am suggesting extending the scope of this "anchor model" to comparison tests against Stan and/or Edward for a single model. It would be nice to put that together in such a way that it's easy to then use the pipeline to do the comparison for other models....)
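To make the log-predictive comparison concrete, here is a small sketch of how it could be computed from posterior samples of a regression model; the function name, the fixed noise scale, and the tensor shapes are assumptions for illustration, not code from this issue:

```python
# Hypothetical helper: log pointwise predictive density on held-out data,
# averaged over posterior samples (usable for Pyro, Stan, or Edward output).
import math
import torch
import torch.distributions as dist

def log_pointwise_predictive(weight_samples, bias_samples, sigma, x_test, y_test):
    """weight_samples: (S, p) posterior samples of the regression weights
    bias_samples:   (S,)   posterior samples of the bias
    sigma:          observation noise scale (assumed fixed here)
    x_test, y_test: (N, p) and (N,) held-out data
    """
    S = weight_samples.shape[0]
    # predictive mean for every posterior sample and test point: (S, N)
    mu = weight_samples @ x_test.t() + bias_samples.unsqueeze(-1)
    # log p(y_i | theta_s) for each sample s and point i: (S, N)
    log_p = dist.Normal(mu, sigma).log_prob(y_test)
    # log (1/S) sum_s p(y_i | theta_s), summed over test points
    return (torch.logsumexp(log_p, dim=0) - math.log(S)).sum()
```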
Status update: running logistic regression against Edward's model (rough sketch of the Pyro side below).
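As a rough illustration only (not the actual benchmark code), this is what the "sample the weights directly" version of Bayesian logistic regression looks like in Pyro; the dimensions and priors are placeholders:

```python
# Bayesian logistic regression with weights sampled directly via pyro.sample.
import torch
import pyro
import pyro.distributions as dist

def logistic_model(x, y=None):
    p = x.shape[1]
    # standard-normal priors over the weights and bias
    w = pyro.sample("w", dist.Normal(torch.zeros(p), torch.ones(p)).to_event(1))
    b = pyro.sample("b", dist.Normal(0., 1.))
    logits = x @ w + b
    with pyro.plate("data", x.shape[0]):
        return pyro.sample("obs", dist.Bernoulli(logits=logits), obs=y)
```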
Update: now using random_module with an nn.Module to define the prior, rather than sampling the parameters directly. I have both versions (see the sketch below).
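A minimal sketch of the random_module version, assuming the pyro.random_module API; the module, prior choices, and noise scale below are illustrative, not the final example code:

```python
# Define the regression family as an nn.Module, then "lift" its parameters
# into random variables by placing priors on them with pyro.random_module.
import torch
import torch.nn as nn
import pyro
import pyro.distributions as dist

class RegressionModel(nn.Module):
    def __init__(self, p):
        super().__init__()
        self.linear = nn.Linear(p, 1)

    def forward(self, x):
        return self.linear(x)

regression_model = RegressionModel(p=1)

def model(x, y):
    # standard-normal priors over the module's weight and bias
    priors = {
        "linear.weight": dist.Normal(torch.zeros(1, 1), torch.ones(1, 1)).to_event(2),
        "linear.bias": dist.Normal(torch.zeros(1), torch.ones(1)).to_event(1),
    }
    # lift: returns a module whose parameters are samples from the priors
    lifted_module = pyro.random_module("regression", regression_model, priors)
    lifted_reg_model = lifted_module()
    with pyro.plate("data", x.shape[0]):
        prediction_mean = lifted_reg_model(x).squeeze(-1)
        pyro.sample("obs", dist.Normal(prediction_mean, 0.1), obs=y)
```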
We want to use some combination of Bayesian regression and Bayesian NNs (to be determined) as anchor models for the first Pyro release. These can be implemented in a way that shares a lot of work and is extensible. The approach is to define a parametrized family of classifiers as a PyTorch Module; usually one would do MLE on this family to train a classifier, but instead we will define a "lift" operation that upgrades the parameters to random variables, and then do VI (or other inference) for the posterior over the lifted parameters. See the extensive discussion of this approach in #40.
We should implement the lifting operation and then test it on Bayesian regression examples. Then we can try Bayesian NNs.
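As a sketch of that workflow, lifting plus VI could look roughly like the following, assuming a Pyro model like the lifted regression sketch in the comment above; the guide choice, optimizer settings, and data here are placeholders:

```python
# Fit the lifted model with stochastic variational inference.
import torch
import pyro
from pyro.infer import SVI, Trace_ELBO
from pyro.infer.autoguide import AutoDiagonalNormal
from pyro.optim import Adam

# `model` is a lifted regression model like the earlier random_module sketch
guide = AutoDiagonalNormal(model)
svi = SVI(model, guide, Adam({"lr": 0.01}), loss=Trace_ELBO())

x, y = torch.randn(100, 1), torch.randn(100)   # placeholder data
for step in range(2000):
    svi.step(x, y)
```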