leios / SoME_Topics

Collaboration / Topic requests for SoME

Probabilistic graphical models and regression #115

Open stephenhchen opened 2 years ago

stephenhchen commented 2 years ago

About the author

Hi there! I'm Stephen, a practicing data person and aspiring researcher. I really enjoy great explanations for unintuitive concepts in all things math. I love teaching (though I'm not great at it) and have always wanted to create explanatory content in statistics/probability theory. I thought this would be a great place to connect with some passionate and like-minded folks to work together on topics in statistics, specifically related to graphical models and regression.

Quick Summary

For this project, I'd love to collaborate with other domain experts to develop ideas and scope the project, and with animators or data viz enthusiasts for production. I'd like to illuminate some theoretical concepts in probabilistic graphical models (PGMs), such as d-separation, and demonstrate how biases arise in various regression scenarios as a result of confounders and conditioning.

It'd be great to lead the audience through exploratory real-world and/or synthetic data examples so they can uncover some of the theory themselves, for example by showing how regression estimates of a causal effect change when certain variables are included as covariates. I'm flexible on the content and hope that through collaborating we can home in on specifics.
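As a rough sketch of the kind of synthetic example I have in mind (the variable names, coefficients, and sample size below are illustrative assumptions, not a settled design): a confounder Z drives both X and Y, so regressing Y on X alone overstates the effect of X, while adjusting for Z recovers it. The adjustment is done by residualizing X and Y on Z (the Frisch-Waugh-Lovell approach), which for this setup matches the X coefficient from regressing Y on both X and Z.

```python
import random


def slope(x, y):
    """OLS slope of y on x, with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after regressing it on x, with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - mean_y) - b * (xi - mean_x) for xi, yi in zip(x, y)]


def simulate_confounder(n=20000, effect=1.0, z_to_x=1.0, z_to_y=2.0, seed=0):
    """Simulate Z -> X, Z -> Y, X -> Y and return (naive, adjusted) slopes.

    All parameter values are hypothetical choices for illustration.
    """
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [z_to_x * zi + rng.gauss(0, 1) for zi in z]
    y = [effect * xi + z_to_y * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

    # Naive: regress Y on X only; the open path X <- Z -> Y biases the slope.
    naive = slope(x, y)
    # Adjusted: remove Z from both X and Y, then regress residual on residual.
    adjusted = slope(residuals(z, x), residuals(z, y))
    return naive, adjusted


if __name__ == "__main__":
    naive, adjusted = simulate_confounder()
    print(f"true effect: 1.0  naive: {naive:.3f}  adjusted for Z: {adjusted:.3f}")
```

With these particular coefficients the naive slope should land near `effect + z_to_y * z_to_x / (z_to_x**2 + 1)`, i.e. about 2, while the adjusted slope should sit near the true effect of 1.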

Ideally, the material would be accessible to students in introductory statistics courses, since these ideas are important for building models and interpreting coefficients. I hope viewers will walk away understanding the importance of thinking carefully about data-generating processes, and with a strong conceptual understanding of the why behind some of the theory in PGMs.

Target medium

I'm pretty open to the medium, but maybe it'd be helpful to set one upfront. I'm thinking an interactive article could work well, like a snazzy NYTimes Upshot piece, which may require D3 and related front-end work, but it really depends on what folks are comfortable with and what skill sets you all bring. I'm getting some inspiration from this project: https://www.microsoft.com/en-us/research/project/datamations/ though I don't want to constrain anyone's creative ideas. Very happy to be flexible here :)

More details

I haven't found material on PGMs + regression that's really beginner friendly, i.e. aimed at the beginner-to-intermediate level. I'd appreciate pointers to any relevant material you've come across!

Contact details

Post here or DM me via Discord: Stephenhc#5618

Graham853 commented 2 years ago

This sounds interesting, and I hope you're able to make something. Unfortunately, I am no expert in this area, and certainly can't help with producing content. A few vague suggestions...

From what I remember about Bayesian networks, the 'multiple diagnosis in bipartite graphs' problem is one of the simplest.

Instead of trying to estimate parameters (which is how I interpret regression) it might be easier to illustrate counter-intuitive behaviour by running a bunch of simulations for each of a set of fixed parameter values.
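A minimal sketch of what such a sweep could look like, assuming a collider setup picked purely for illustration: X and Y are independent, and C = strength * (X + Y) + noise. For each fixed `strength`, the simulation compares the slope of Y on X with and without conditioning on C. The function names and parameter grid are hypothetical.

```python
import random


def slope(x, y):
    """OLS slope of y on x, with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after regressing it on x, with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - mean_y) - b * (xi - mean_x) for xi, yi in zip(x, y)]


def sweep_collider(strengths, n=20000, seed=0):
    """For each collider strength, return (strength, unadjusted, adjusted)."""
    results = []
    for i, s in enumerate(strengths):
        rng = random.Random(seed + i)  # separate, reproducible stream per setting
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]  # independent of x by construction
        c = [s * (xi + yi) + rng.gauss(0, 1) for xi, yi in zip(x, y)]

        unadjusted = slope(x, y)  # should stay close to 0
        # Conditioning on the collider C opens the path X -> C <- Y.
        adjusted = slope(residuals(c, x), residuals(c, y))
        results.append((s, unadjusted, adjusted))
    return results


if __name__ == "__main__":
    for s, unadjusted, adjusted in sweep_collider([0.0, 0.5, 1.0, 2.0]):
        print(f"strength={s:.1f}  Y~X: {unadjusted:+.3f}  Y~X|C: {adjusted:+.3f}")
```

In this setup the population value of the adjusted slope is `-s**2 / (s**2 + 1)`, so the spurious negative association grows with the collider strength even though X has no effect on Y.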

I follow John Baez's blog, and he has a lot of posts on networks, some of which might provide inspiration. See https://math.ucr.edu/home/baez/networks/ for an overview.

stephenhchen commented 2 years ago

Thanks for sharing the resource! Will take your advice on simulating simple scenarios first to build some intuition.