cnellington / Contextualized

An SKLearn-style toolbox for estimating and analyzing models, distributions, and functions with context-specific parameters.
http://contextualized.ml/
GNU General Public License v3.0

Easy demo #230

Closed holl- closed 5 months ago

holl- commented 7 months ago

I'm trying to understand your easy demo but am having some trouble. Could you please add more documentation to explain what the code is doing? For example, clearly state your problem in the beginning. What variables are we actually interested in?

Could you also highlight why using your library is easier than doing this with Scikit learn or PyTorch or similar libraries? Couldn't I just pass (C,X) to a regular model and train it in supervised fashion? What does your library do that I couldn't easily achieve otherwise?

You introduce ε in the equation but I don't see this implemented in the code. The other variables are not explained either.

cnellington commented 7 months ago

Hi @holl- thanks for the comment. The goal of Contextualized is to recover context-dependent models from data. The "Death to Cluster Models" demo digs a little deeper than the demo you mention, and directly compares Contextualized regression against SKLearn regression and simple PyTorch neural networks applied to the same problem.

In general though, traditional statistical models do not account for context-dependent parameters and cannot generalize to new contexts. Neural networks are also insufficient on their own, and do not explicitly reveal a context-specific data distribution. Contextualized models recover context-specific models and generalize to new contexts.
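To make the question of "just passing (C, X) to a regular model" concrete, here is a minimal sketch in plain NumPy. It does not use the Contextualized API, and the data-generating process (a coefficient `beta = 2 * C`) is a hypothetical toy chosen only for illustration. It compares one global linear model on the concatenated features against a model whose coefficient on `X` is itself a linear function of the context.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
C = rng.uniform(-1.0, 1.0, size=n)   # context variable
X = rng.normal(size=n)               # predictor
eps = rng.normal(scale=0.1, size=n)  # zero-mean noise
beta = 2.0 * C                       # hypothetical context-dependent coefficient
Y = beta * X + eps


def ols_fit(features, target):
    """Ordinary least squares via NumPy's least-squares solver."""
    coef, *_ = np.linalg.lstsq(features, target, rcond=None)
    return coef


# Baseline: a single global linear model on [1, C, X].
F_global = np.column_stack([np.ones(n), C, X])
w_global = ols_fit(F_global, Y)
mse_global = np.mean((F_global @ w_global - Y) ** 2)

# Varying-coefficient model: assume beta(C) = a + b * C,
# so Y is approximately a * X + b * (C * X).
F_vary = np.column_stack([X, C * X])
w_vary = ols_fit(F_vary, Y)
a, b = w_vary
mse_vary = np.mean((F_vary @ w_vary - Y) ** 2)

print(f"global model MSE:        {mse_global:.3f}")
print(f"varying-coefficient MSE: {mse_vary:.3f}")
print(f"recovered beta(C) = {a:.3f} + {b:.3f} * C")
```

In this toy setup the global model cannot represent the interaction between `C` and `X`, while the varying-coefficient fit recovers the slope's dependence on the context directly. A flexible neural network on (C, X) could fit the predictions, but the coefficient function would then have to be extracted after the fact.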

Thanks for the catch about $\epsilon$, we should define this. I will update the demo soon to clarify. $\epsilon$ is a latent centered noise variable in linear regression and can be ignored when predicting the expected value of Y given X. We take our notation from https://en.wikipedia.org/wiki/Linear_regression
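Written out, and assuming the demo's model takes the usual varying-coefficient form (the symbol $\beta(C)$ is an assumption here, not copied from the notebook), the reason $\epsilon$ drops out of the prediction is:

$$
Y = X\beta(C) + \epsilon, \qquad \mathbb{E}[\epsilon \mid X, C] = 0
\quad\Longrightarrow\quad
\mathbb{E}[Y \mid X, C] = X\beta(C)
$$

The noise term only affects the spread of $Y$ around this expected value, which is why it does not appear in the prediction code.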

holl- commented 7 months ago

@cnellington Right, the other notebook is more detailed. However, the way your documentation is set up, most people will first look at the easy demo. It would be very helpful if there were some more comments explaining what is going on and linking to further documentation.

cnellington commented 5 months ago

@holl- I believe the documentation ordering and the descriptions for motivation and use are addressed by the recent PR above. Let us know if you have any other comments.

holl- commented 5 months ago

@cnellington Thanks for the update! I think it's a big step in the right direction.

Before I close the issue, could you add the appropriate documentation links to the easy_regression.ipynb notebook? Here are my suggestions:

pescap commented 5 months ago

> Neural networks are also insufficient on their own, and do not explicitly reveal a context-specific data distribution. Contextualized models recover context-specific models and generalize to new contexts.

+1 from me, as a beginner, when I started trying the tutorials and reading the documentation.

Actually, one of my first questions was: why not use a neural network with the context as features (+shap). I hope this gives more importance to these kinds of questions.

cnellington commented 5 months ago

@holl- @pescap We recently made changes to two pages:

  1. The regression tutorial: https://contextualized.ml/docs/models/easy_regression.html
  2. The under-the-hood page, explaining similarities and differences with post-hoc interpretability methods like SHAP and linking to an exploration of these similarities: https://contextualized.ml/docs/under-the-hood.html

Let me know if there is anything else you'd like us to address, but I believe this addresses all outstanding comments.

holl- commented 5 months ago

Looks good to me 👍