pymc-devs / pymc-examples

Examples of PyMC models, including a library of Jupyter notebooks.
https://www.pymc.io/projects/examples/en/latest/
MIT License

Call for notebooks demonstrating how to handle missing data #461

Closed: drbenvincent closed this issue 1 year ago

drbenvincent commented 1 year ago

While we do have a lot of example notebooks, we have a distinct lack of examples covering how to deal with missing data. There is also no missing data tag.

The only ones I can think of are the notebooks on censored and truncated data, which are a form of missing data.

So this is a kind of meta-issue. We would be very grateful to anyone willing to contribute notebooks demonstrating how to handle missing data. Feel free to create specific notebook proposal issues, referencing this issue.

NathanielF commented 1 year ago

I think this is an interesting issue, but not one I know a tonne about... I've heard good things about "Applied Missing Data" by Chris Enders though. Might be able to look into this a bit more after I finish the Bayesian VAR model thing.

NathanielF commented 1 year ago

Ok, I've ordered the Enders book, arriving on Friday. I will look into this topic in a bit more detail over Christmas and report back in January if I think I can add anything of interest.

NathanielF commented 1 year ago

Ok, I think this is definitely something I want to pursue. I think there is a really nice example of workplace empowerment estimation I want to work through... Will outline a full proposal after I've finished the reliability and prediction pull request, if that's alright?

NathanielF commented 1 year ago

Started some work on this and was able to get FIML and Bayesian imputation working for the multivariate normal. But for the Bayesian MV imputation I had to use a Potential rather than a likelihood, as per the discussion here: https://discourse.pymc.io/t/automatic-imputation-of-multivariate-models/11029/3

I'm also going to try the chained equation imputation approach, which shouldn't need this workaround.
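
For readers unfamiliar with FIML (full information maximum likelihood), here is a minimal sketch of the idea for a multivariate normal, assuming missing entries are coded as `NaN`. Each row contributes the marginal normal log density of only its observed entries. The helper names `mvn_logpdf` and `fiml_loglik` and the toy numbers are made up for illustration and are not taken from the notebook discussed in this thread. In PyMC, a hand-written log-likelihood term of this kind is typically what would be added to a model through `pm.Potential`, though the exact construction used in the notebook may well differ.

```python
import numpy as np


def mvn_logpdf(x, mean, cov):
    """Log density of a multivariate normal evaluated at a single point x."""
    k = x.shape[0]
    diff = x - mean
    # The Cholesky factor gives both the log-determinant and the quadratic form.
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, diff)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (k * np.log(2.0 * np.pi) + log_det + z @ z)


def fiml_loglik(data, mean, cov):
    """Sum, over rows, of the marginal MVN log density of the observed entries."""
    total = 0.0
    for row in data:
        obs = ~np.isnan(row)  # boolean mask of observed entries in this row
        if not obs.any():
            continue  # a fully missing row carries no information
        # Marginalising an MVN just selects the matching sub-mean and sub-covariance.
        total += mvn_logpdf(row[obs], mean[obs], cov[np.ix_(obs, obs)])
    return total


# Hypothetical toy data: two variables, NaN marks a missing value.
mean = np.array([0.0, 1.0])
cov = np.array([[1.0, 0.5], [0.5, 2.0]])
data = np.array([[0.2, 1.1], [np.nan, 0.7], [-0.4, np.nan]])
print(fiml_loglik(data, mean, cov))
```

Maximising this quantity over `mean` and `cov` gives the FIML estimates. The Bayesian imputation approach instead treats the missing entries as parameters to be sampled.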

juanitorduz commented 1 year ago

Cool! By the way, have you seen this video? https://www.youtube.com/watch?v=nJ3XefApED0

NathanielF commented 1 year ago

I'm about 1/3 of the way through that video.

reshamas commented 1 year ago

@NathanielF

That video (https://www.youtube.com/watch?v=nJ3XefApED0) needs timestamps, in case you are interested. More info here: https://github.com/pymc-devs/video-timestamps/issues/11

NathanielF commented 1 year ago

Thanks @reshamas, will have a look tomorrow.

NathanielF commented 1 year ago

I think this is close to done. Really impressed by those JAX samplers!! The speed is so much better!

NathanielF commented 1 year ago

Woop!! Thanks so much @drbenvincent. This one was a real fun one!!