marcdotson / conjoint-ensembles

Using clever randomization and ensembling strategies to accommodate multiple data pathologies in conjoint studies.
MIT License

Evaluate the ensemble of hierarchical MNL models #38

Closed marcdotson closed 3 years ago

marcdotson commented 5 years ago
dgmiller commented 5 years ago

@marcdotson Done. See the dev/aws branch. The base model is named base_hmnl.stan. Initial results show that the ensemble of hierarchical models is effectively identical to the standard mnl_vanilla.stan model on all simulated data sets. The performance difference between the two models is as follows:

| Data set | Difference | Verdict |
| --- | --- | --- |
| 01 | +.00 | same |
| 02 | -.02 | worse |
| 03 | +.01 | better |
| 04 | +.01 | better |

I think what's going on is that the ensemble of HMNL models is just too conservative. An ensemble benefits from (over)confident, diverse base models rather than safe, "accurate" ones. We could try randomizing over something other than the feature levels, but I seriously think our next step is trying neural networks, such as hierarchical Bayesian neural networks. To me, everything we have been doing points in this direction.
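
A tiny numeric illustration of that intuition (the probabilities below are made up): averaging near-identical, conservative base models just reproduces any single one of them, while averaging diverse, confident models yields something genuinely different from every base model.

```python
import numpy as np

# Hypothetical choice probabilities for one 3-alternative task,
# from three "conservative" base models vs. three diverse ones.
conservative = np.array([
    [0.40, 0.35, 0.25],
    [0.41, 0.34, 0.25],
    [0.39, 0.36, 0.25],
])
diverse = np.array([
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.15, 0.05, 0.80],
])

avg_conservative = conservative.mean(axis=0)  # ~ any single base model
avg_diverse = diverse.mean(axis=0)            # differs from every base model

print(avg_conservative)
print(avg_diverse)
```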

marcdotson commented 5 years ago

@dgmiller be careful not to jump to the next thing too quickly. We have preliminary evidence that we should run an ensemble of HMNLs. We can't say we've given it a fair shake until we've actually induced "clever" randomization -- which we haven't, not beyond something akin to ANA.

dgmiller commented 5 years ago

@marcdotson

After some preliminary experimentation, I concede your point. Just for fun, I built a vanilla feedforward neural net, played around with various architectures, and ran it on different data sets. I just wanted to see if it could brute-force predict Y given X. The good news is that, in general, it didn't outperform either the standard HMNL model or the ensemble. (It only outperformed on a few runs with certain data sets, and by a small amount.) Still, thinking about the ensemble as a neural net might help point us toward effective "clever" randomization strategies. I can explain when we meet next.

marcdotson commented 5 years ago

@dgmiller is porting the ensemble code into R.

marcdotson commented 5 years ago

Next steps in evaluating the ensemble of hierarchical MNL models. Note that conjoint.py is the first attempt at randomization, running the ensemble, and saving model output.

  1. @jeff-dotson to simulate data with one or more of our first two data pathologies: ANA and screening.
  2. @jeff-dotson to implement clever randomization strategies.
  3. @marcdotson to confirm estimation for the conjoint ensemble.
  4. @marcdotson to start with loo as the fit statistic for model comparison, considering other forms of predictive fit as needed.

marcdotson commented 4 years ago

@jeff-dotson pull the latest changes and you're good to work in the randomization section of 03_conjoint-ensemble.R. Remember that so far we've only tried leaving out a single variable. Whatever you do, just make sure the result is a modified X to feed into the ensembles section next.

marcdotson commented 4 years ago

@RogerOverNOut when specifying the dimensions of the log likelihood saved for the screening (and eventually ANA) model, create an array with dimensions: number of post-warm-up iterations × number of chains × number of observations.

Since our custom MCMC is probably a single chain, that'll be a post-warm-up iterations × observations matrix.
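
In numpy terms (with made-up sizes), the two shapes look like:

```python
import numpy as np

# Hypothetical dimensions for the saved log-likelihood array.
n_iterations = 1000   # post-warm-up iterations
n_chains = 4
n_observations = 500

# Multi-chain case: iterations x chains x observations.
log_lik = np.empty((n_iterations, n_chains, n_observations))

# Single-chain custom MCMC: iterations x observations matrix.
log_lik_single = np.empty((n_iterations, n_observations))

print(log_lik.shape)         # (1000, 4, 500)
print(log_lik_single.shape)  # (1000, 500)
```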

RogerOverNOut commented 4 years ago

@marcdotson Thanks, Marc. Will do.

RogerOverNOut commented 4 years ago

@jeff-dotson I'm working on the code for the competing models. Screening is easy since we have the code and it's just a matter of writing out the data transitions. If you have ANA code to share, let me know and I'll work it in; otherwise, I can work on it early next week. I have significant chunks of time Monday and Tuesday.

marcdotson commented 4 years ago

@jeff-dotson @RogerOverNOut if you have questions on how to get your randomization and alternative model changes onto GitHub, let me know.

marcdotson commented 4 years ago

@dgmiller where is the code used to create the simulated data sets with the various pathologies?

dgmiller commented 4 years ago

@marcdotson It's in the python code folder, in utils.py.

The function is generate_simulated_data(), which lets you specify which pathologies from the pathology() function are present in the data.
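
For anyone skimming the thread, here's a minimal sketch of what inducing one such pathology can look like when simulating part-worths. This is not the project's utils.py; the shapes, probabilities, and approach are illustrative assumptions, showing ANA (attribute non-attendance) as zeroing out betas for ignored attributes.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical respondent-level part-worths: respondents x attributes.
n_respondents, n_attributes = 100, 8
beta = rng.normal(0, 1, (n_respondents, n_attributes))

# ANA sketch: each respondent ignores each attribute with probability 0.3,
# so non-attended attributes contribute nothing to utility.
attend = rng.random((n_respondents, n_attributes)) > 0.3
beta_ana = beta * attend

print(beta_ana.shape)  # (100, 8)
```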

marcdotson commented 4 years ago

Updated next steps in evaluating the ensemble of hierarchical MNL models (with conjoint.py as a reference).

  1. @marcdotson simulating data without any pathologies and with one or more pathologies present, beginning with ANA and screening in 01_simulate_data.R (with generate_simulated_data() from utils.py as a reference).
  2. @marcdotson cleaning real data to match how we've simulated data in 02_clean-data.R.
  3. @jeff-dotson and @marcdotson implementing clever randomization strategies in 03_conjoint-ensemble.R.
  4. @marcdotson confirming estimation for the conjoint ensemble in 03_conjoint-ensemble.R.
  5. @RogerOverNOut confirming the compatibility of loo for model comparison with the competing models in 04_competing-models.R.
  6. @RogerOverNOut starting with loo as the fit statistic for model comparison in 05_model-comparison.R.

marcdotson commented 4 years ago

Updated next steps:

  1. @marcdotson and @jeff-dotson simulating data without any pathologies and with one or more pathologies present, beginning with ANA and screening in 01_simulate_data.R (@RogerOverNOut has some code to share with pathologies present).
  2. @jeff-dotson prepping real data and inducing randomization in 02_clean-data.R.
  3. @marcdotson implementing randomization through parameter constraints (for single pathologies and pathologies jointly) and running the ensemble in 03_conjoint-ensemble.R.
  4. @RogerOverNOut using loo as the meta-learner (i.e., Bayesian stacking) in 04_meta-learner.R.
  5. @RogerOverNOut using loo as the fit statistic for model comparison after running competing models in 05_competing-models.R and comparing predictive fit in 06_model-comparison.R.
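
On item 4: in R, the stacking weights come from the loo package (loo_model_weights()). Purely as an illustration of what stacking optimizes, here is a self-contained numpy sketch (toy data, EM-style update rather than the package's optimizer) that finds the weights maximizing the summed log score of the mixture of LOO predictive densities.

```python
import numpy as np

def stacking_weights(lpd, n_iter=2000):
    """Bayesian stacking weights from a K x N matrix of pointwise
    LOO log predictive densities (one row per candidate model).

    Sketch only: uses an EM-style fixed-point update for mixture
    weights, maximizing sum_n log(sum_k w_k p_k(y_n))."""
    K, N = lpd.shape
    # Rescale per observation for numerical stability; this multiplies
    # each inner sum by a constant and doesn't change the argmax.
    p = np.exp(lpd - lpd.max(axis=0))  # K x N densities
    w = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        r = w[:, None] * p             # unnormalized responsibilities
        r /= r.sum(axis=0)
        w = r.mean(axis=1)             # updated simplex weights
    return w

# Toy example: model 0 fits almost every observation better than
# model 1, so nearly all the stacking weight lands on it.
rng = np.random.default_rng(0)
lpd = np.vstack([rng.normal(-1.0, 0.1, 200),
                 rng.normal(-2.0, 0.1, 200)])
print(stacking_weights(lpd))  # weight concentrates on model 0
```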
marcdotson commented 3 years ago

After a lot of work, we are close to an initial evaluation. Some final to-dos for this far-too-large meta-issue:

marcdotson commented 3 years ago

@jeff-dotson @RogerOverNOut a swing and a miss for simulated data with ANA present, at least for predictive fit:

| Model | LOO | Hit Rate | Hit Prob |
| --- | --- | --- | --- |
| HMNL | -2732 | 0.566 | 0.446 |
| Ensemble | -13.4 | 0.567 | 0.402 |
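
For reference, the two predictive-fit metrics above can be computed from posterior predictive choice probabilities roughly as follows (toy numbers, not our simulated data): hit rate is the share of tasks where the highest-probability alternative matches the observed choice, and hit probability is the mean predicted probability of the observed choice.

```python
import numpy as np

# Hypothetical posterior predictive choice probabilities for
# 5 tasks with 3 alternatives each, plus the observed choices.
probs = np.array([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.2, 0.7],
    [0.4, 0.4, 0.2],
    [0.3, 0.3, 0.4],
])
choices = np.array([0, 1, 2, 2, 0])

# Share of tasks where the modal prediction matches the choice.
hit_rate = np.mean(probs.argmax(axis=1) == choices)
# Mean predicted probability assigned to the observed choice.
hit_prob = np.mean(probs[np.arange(len(choices)), choices])

print(hit_rate, hit_prob)  # 0.6 0.46
```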

We're still running the ANA-specific model on the simulated data, but we're obviously missing something. Before we dive in and start trying things, I think it would be wise to identify everything we could modify and discuss what we should attempt first. Here are the potential changes I see:

Personally and from experience, I think the final three things should be tried first -- essentially everything but simulation changes. This branch is becoming a bit of a monster, so after we have the output from the ANA-specific model, I'd like a minute to finish cleaning up the code so iterating will be a bit easier, merge this branch, and then create separate branches for each of these attempts (i.e., ensemble-tuning and predictive-fit branches).

Thoughts?

marcdotson commented 3 years ago

Closed out the initial evaluation with PR #46.

marcdotson commented 3 years ago

Split tasks across three new branches and issues.