jasp-stats / jasp-issues

This repository is solely meant for reporting of bugs, feature requests and other issues in JASP.

JASP does not report Posterior Parameter Distribution of Intercept in a fitted linear regression model #1240

Closed mspezio closed 1 year ago

mspezio commented 3 years ago

Data file is attached, along with the figures. Also, the correctly estimated Posterior Parameter Distribution for the fitted model is attached.

In order to use JASP to teach Introductory Statistics using Bayesian approaches, JASP needs to return the correct Posterior Parameter Distributions for the fitted models, along with the specified credible intervals. Otherwise there will be confusion as to why the Classical and Bayesian values are so different. In this case, the "marginal" posterior distribution of the Intercept, assuming that the predictor variable is not included, is correctly estimated and shown. But there is no display of the posterior distribution of the Intercept when it is obtained from the model that includes the predictor variable.

All posterior parameter distributions need to be accessible and displayed in JASP Bayesian linear regression.

[Attachments: four screenshots dated 2021-04-06; parenthood.csv.zip]

boutinb commented 3 years ago

@TimKDJ Can you have a look at this issue?

vandenman commented 3 years ago

> Otherwise there will be confusion as to why the Classical and Bayesian values are so different. In this case, the "marginal" posterior distribution of the Intercept, assuming that the predictor variable is not included, is correctly estimated and shown. But there is no display of the posterior distribution of the Intercept when it is obtained from the model that includes the predictor variable.

The reason why the results may differ a lot is that the Bayesian linear regression shows model-averaged posterior distributions.

> In this case, the "marginal" posterior distribution of the Intercept, assuming that the predictor variable is not included, is correctly estimated and shown. But there is no display of the posterior distribution of the Intercept when it is obtained from the model that includes the predictor variable.

So we take the posterior for the intercept of the model that includes the predictor and the posterior for the intercept of the model that excludes the predictor. Next, we average these distributions weighted by the posterior probability of the respective models.
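A minimal sketch of the averaging step described above, not JASP's or BAS's actual implementation. It assumes, purely for illustration, that each model's posterior for the intercept is Normal; the function name `model_averaged_density` and all numbers are hypothetical.

```python
import numpy as np

def model_averaged_density(x, means, sds, post_probs):
    """Mixture density: sum over models of P(M_k | data) * p(theta | M_k, data).

    Assumption: each per-model posterior is Normal(mean, sd). Real per-model
    posteriors may have a different shape.
    """
    x = np.asarray(x, dtype=float)
    weights = np.asarray(post_probs, dtype=float)
    weights = weights / weights.sum()  # make sure the model probabilities sum to 1
    density = np.zeros_like(x)
    for mean, sd, weight in zip(means, sds, weights):
        normal_pdf = np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
        density += weight * normal_pdf
    return density

# Hypothetical example: model without the predictor (posterior probability 0.3)
# and model with the predictor (posterior probability 0.7).
grid = np.linspace(-10.0, 10.0, 4001)
dx = grid[1] - grid[0]
dens = model_averaged_density(grid, means=[0.0, 2.0], sds=[1.0, 0.5], post_probs=[0.3, 0.7])

total_mass = np.sum(dens) * dx            # numerical integral of the mixture
averaged_mean = np.sum(grid * dens) * dx  # mean of the model-averaged posterior
```

The averaged mean is the probability-weighted mean of the per-model means, which is why a model-averaged summary can sit between the single-model estimates.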

Plots of posterior distributions for individual models are not possible at the moment. If that is what you would like to see, we can probably add something like this (copied from the Bayesian ANOVAs):

image

where you can use the terms under Components (all non-nuisance variables considered) to do inference for a single model. Would that help?

github-actions[bot] commented 1 year ago

This issue will be automatically closed in 42 days due to inactivity. Feel free to leave a comment if you believe this is still relevant.

github-actions[bot] commented 1 year ago

Automatically closed due to inactivity.

CGMoreh commented 1 month ago

I may have misunderstood the original question, but I've just realised that I have a similar, related question if posed not as a plotting issue but as a coefficient-summary one. The coefficients tables from a classical and a Bayesian regression seem to differ in that the posterior summary of the intercept coefficient is not included in the results table. In the case of a classical regression output, both the M_0 and M_1 intercepts are shown, separately (which is great). But in the posterior summary of a Bayesian regression, only the M_0 Intercept coefficient is shown. This may be confusing for comparative pedagogical purposes (unless I'm completely misinterpreting something...). The screenshot below, comparing outputs from two parallel classical/Bayesian models, hopefully highlights what I mean. I have a feeling that the original poster had something similar in mind, maybe?

image

vandenman commented 1 month ago

There are some subtle but important differences between how the frequentist linear regression model and Bayesian linear regression models are designed. In the Bayesian linear regression (done via the R package BAS) the design matrix is always centered, i.e., the mean of each column is set to 0 (except for the intercept column). Thus, the intercept will always model the raw mean of the dependent variable. If you do not center the design matrix, then the interpretation of the intercept depends on the remaining predictors in the design matrix. This is not a problem for frequentist linear regression when only one model is considered. However, in Bayesian linear regression, we average across all models, and then a centered design matrix becomes necessary (otherwise the averaged parameters become hard to interpret).
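The effect of centering can be seen with ordinary least squares alone. The sketch below uses simulated, hypothetical data and plain NumPy rather than BAS or JASP, so it only illustrates the point about the intercept, not the Bayesian fit itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(loc=5.0, scale=2.0, size=n)         # hypothetical predictor
y = 3.0 + 1.5 * x + rng.normal(scale=1.0, size=n)  # hypothetical outcome

def ols(design, response):
    """Least-squares coefficients for the given design matrix."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef

X_raw = np.column_stack([np.ones(n), x])                  # uncentered predictor
X_centered = np.column_stack([np.ones(n), x - x.mean()])  # predictor column has mean 0

b_raw = ols(X_raw, y)
b_centered = ols(X_centered, y)
b_null = ols(np.ones((n, 1)), y)  # intercept-only model

print("mean of y:               ", y.mean())
print("intercept, null model:   ", b_null[0])
print("intercept, centered fit: ", b_centered[0])
print("intercept, raw fit:      ", b_raw[0])
print("slopes (raw, centered):  ", b_raw[1], b_centered[1])
```

With a centered predictor, the intercept equals the mean of `y` in both the null model and the model with the predictor, so averaging the intercept across models combines like with like. With the raw predictor, the intercept is the predicted value at `x = 0` and changes meaning from model to model. The slope is the same either way.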

CGMoreh commented 1 month ago

Thank you for the prompt clarification!