adw96 / DivNet

diversity estimation under ecological networks

X covariate table and the intercept #140

Closed AngusBall closed 10 months ago

AngusBall commented 10 months ago

Hello!

Firstly, excellent package and documentation! Secondly, I'm still rather new at coding and more complicated statistics such as this so I hope this isn't a silly question, but I was wondering why the X table has 1's for every value in the intercept column.

Following getting-started.Rmd, on line 173 we make the table "X" and on line 177 it says the intercept term is our altered basalts; however, when I inspect the X table, the intercept column has a 1 in every row (as opposed to the other covariate columns, which only have 1's in their respective samples). Is this expected behavior and just a consequence of the intercept being used as a baseline?

Thank you in advance! Angus

svteichman commented 10 months ago

Hi Angus,

I'm glad that you are having a positive experience with DivNet.

The covariate table has 1's for every value in the intercept column because this is the convention when fitting statistical models. Models are usually fit and interpreted as $E[Y_i] = \beta_0 + \beta_1X_i$, where $\beta_0$ is the expected value of $Y_i$ when $X_i$ is equal to $0$, and $\beta_1$ is the expected change in $Y_i$ associated with a one-unit increase in $X_i$. In the vignette we have a categorical covariate, which we represent as a series of dummy variables that are set to $1$ if observation $i$ is in that category and $0$ otherwise. One category has to be chosen as the baseline, and it gets no dummy column of its own. The intercept $\beta_0$ is then interpreted as the expected value of $Y_i$ when observation $i$ is in the baseline category, and $\beta_1$ is the expected change in $Y_i$ when the observation is in the category associated with $\beta_1$ instead of the baseline category. So the intercept column is all 1's because every observation starts off with an expected value of $1\times \beta_0$; for observations where the next covariate is non-zero we then add on $X_i\times\beta_1$, and so on for the rest of the covariates in the model.
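To make that concrete, here is a minimal sketch in plain Python (for illustration only; in R, `model.matrix()` builds this kind of table from a formula). The sample labels and coefficient values are made up and are not taken from the vignette or from any DivNet output; the only assumption carried over is that "altered" is the baseline category.

```python
# Hypothetical sample labels; "altered" plays the role of the baseline category.
samples = ["altered", "altered", "glassy", "water", "glassy"]
baseline = "altered"

# Every non-baseline category gets its own dummy column.
others = sorted(set(samples) - {baseline})  # ["glassy", "water"]


def design_row(category):
    """Build one row of X: the intercept is always 1, and each dummy
    column is 1 only when the sample belongs to that column's category."""
    return [1] + [1 if category == c else 0 for c in others]


X = [design_row(s) for s in samples]

# Hypothetical coefficients: beta_0 (expected value in the baseline category),
# followed by one shift per non-baseline category, in the order of `others`.
beta = [2.0, 0.5, -1.0]


def expected_value(row):
    """E[Y_i] = 1 * beta_0 + X_i1 * beta_1 + X_i2 * beta_2 + ..."""
    return sum(x * b for x, b in zip(row, beta))


for s, row in zip(samples, X):
    print(s, row, expected_value(row))
```

A baseline sample gets the row `[1, 0, 0]`, so its expected value is just $\beta_0$. A sample from any other category keeps the 1 in the intercept column and switches on exactly one dummy, so its expected value is $\beta_0$ plus that category's coefficient.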

Best, Sarah

AngusBall commented 10 months ago

Ah, that makes sense, thank you!