thekingofkings commented 8 years ago

In standard regression

How to derive the confidence interval and p-value of coefficients?

A Gaussian assumption is made. But more precisely, which quantity is assumed to follow the Gaussian distribution, and why?

In spatial auto-correlation

Spatial auto-correlation introduces collinearity when the predictor variables are constructed. This violates some of the assumptions behind the standard tests, so the p-value and confidence interval are harder to calculate.

thekingofkings commented 8 years ago

How to derive the p-value of coefficients in standard LR?

The t-test is used in linear regression (LR) to get the p-value of each coefficient.

According to the Wikipedia page on the t-test, the statistic has the form t = sqrt(p) Z / s, where Z and s are functions of the data. For example, for a sample mean, Z = (Xbar - μ) / (σ / sqrt(n)).

Assumptions:

The following assumptions are made in the t-test:

- X follows a normal distribution with mean μ and variance σ²
- s² follows a χ² distribution with p degrees of freedom under the null hypothesis, where p is a positive constant
- Z and s are independent
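
The three assumptions can be checked numerically. The snippet below is a minimal sketch, not part of the original discussion: it draws Z from a standard normal and s² from a χ² distribution with p degrees of freedom, independently, and compares a quantile of sqrt(p) Z / s with the same quantile of Student's t. The function name `simulate_t`, the choice p = 5, and the number of draws are illustrative assumptions.

```python
import numpy as np
from scipy import stats


def simulate_t(p, n_draws=200_000, seed=0):
    """Draw samples of sqrt(p) * Z / s with Z ~ N(0, 1) and s^2 ~ chi^2_p, independent."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_draws)        # Z: standard normal
    s2 = rng.chisquare(df=p, size=n_draws)  # s^2: chi-square with p degrees of freedom
    return np.sqrt(p) * z / np.sqrt(s2)


p = 5  # illustrative choice of degrees of freedom
t_samples = simulate_t(p)

# If the assumptions hold, the ratio follows Student's t with p degrees of freedom,
# so the empirical and theoretical quantiles should be close.
empirical_q = np.quantile(t_samples, 0.975)
theoretical_q = stats.t.ppf(0.975, df=p)
print(empirical_q, theoretical_q)
```

If Z and s were not independent, or s² were not χ², the two quantiles would generally disagree, which is why the t-test p-value is only valid under these assumptions.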

Test slope of a regression line

t_score = ( beta^hat - beta_0 ) / SE_beta, where beta^hat is the estimated coefficient, beta_0 is its value under the null hypothesis (here beta_0 = 0), and SE_beta is the standard error of the least-squares estimate.

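To apply the t-test, SE_beta^2 should follow a (scaled) chi^2 distribution. SE_beta^2 is proportional to the sum of squared residuals, and a sum of squares follows a scaled chi^2 distribution when the terms are independent zero-mean Gaussian variables with equal variance.

Therefore, the residuals should be independent Gaussian variables.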
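
As a rough sketch of the slope test for simple linear regression (one predictor), the code below computes beta^hat, SE_beta, t_score, the two-sided p-value and a confidence interval by hand, then cross-checks them against `scipy.stats.linregress`. The function name `slope_t_test` and the synthetic data are assumptions made for illustration; the degrees of freedom are n - 2 because an intercept and a slope are estimated.

```python
import numpy as np
from scipy import stats


def slope_t_test(x, y, beta_0=0.0, alpha=0.05):
    """t-test and confidence interval for the slope of y = a + beta * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    sxx = np.sum((x - x.mean()) ** 2)

    # Least-squares estimates of slope and intercept.
    beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    intercept = y.mean() - beta_hat * x.mean()

    # SE_beta^2 = (residual sum of squares / (n - 2)) / sum((x - xbar)^2).
    residuals = y - (intercept + beta_hat * x)
    dof = n - 2
    se_beta = np.sqrt(np.sum(residuals ** 2) / dof / sxx)

    t_score = (beta_hat - beta_0) / se_beta
    p_value = 2.0 * stats.t.sf(abs(t_score), df=dof)  # two-sided p-value
    half_width = stats.t.ppf(1.0 - alpha / 2.0, df=dof) * se_beta
    ci = (beta_hat - half_width, beta_hat + half_width)
    return beta_hat, se_beta, t_score, p_value, ci


# Synthetic data with independent Gaussian noise (illustrative only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 1.0 + 0.5 * x + rng.normal(scale=1.0, size=x.size)

beta_hat, se_beta, t_score, p_value, ci = slope_t_test(x, y)
ref = stats.linregress(x, y)  # reference implementation for comparison
print(beta_hat, se_beta, t_score, p_value, ci)
print(ref.slope, ref.stderr, ref.pvalue)
```

The confidence interval uses the same t distribution as the p-value, so both rest on the Gaussian residual assumption.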

More references

An explicit list of requirements for the t-test on linear regression is available here.

- The dependent variable Y has a linear relationship to the independent variable X.
- For each value of X, the probability distribution of Y has the same standard deviation σ.
- For any given value of X:
  - The Y values are independent.
  - The Y values are roughly normally distributed (i.e., symmetric and unimodal). A little skewness is OK if the sample size is large.

thekingofkings commented 8 years ago

How to test significance if the t-test does not apply?

Use a Monte Carlo test: permute the data and compare cross-validation scores.

Refer to the comments on #2.
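
One possible reading of "permutation + cross validation score" is sketched below. It is an assumption about the procedure, not the method from #2: the response is shuffled many times, the cross-validated error of a least-squares fit is recomputed each time, and the p-value is the fraction of shuffles that score at least as well as the real data. The helper names `cv_mse` and `permutation_p_value` and the synthetic data are hypothetical. Shuffling y tests the model as a whole; a per-coefficient variant would shuffle a single predictor column instead.

```python
import numpy as np


def cv_mse(X, y, n_folds=5):
    """Cross-validated mean squared error of ordinary least squares with an intercept."""
    n = len(y)
    folds = np.array_split(np.arange(n), n_folds)
    errors = []
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        # Design matrices with a leading column of ones for the intercept.
        A_train = np.column_stack([np.ones(train_mask.sum()), X[train_mask]])
        A_test = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
        coef, *_ = np.linalg.lstsq(A_train, y[train_mask], rcond=None)
        errors.append(np.mean((y[test_idx] - A_test @ coef) ** 2))
    return float(np.mean(errors))


def permutation_p_value(X, y, n_permutations=500, seed=0):
    """Monte Carlo p-value: how often shuffled responses predict as well as the real ones."""
    rng = np.random.default_rng(seed)
    observed = cv_mse(X, y)
    null_scores = np.array(
        [cv_mse(X, rng.permutation(y)) for _ in range(n_permutations)]
    )
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (1 + np.sum(null_scores <= observed)) / (1 + n_permutations)
    return observed, float(p_value)


# Synthetic example (illustrative only): y depends on the first predictor.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = 1.0 + 2.0 * X[:, 0] + rng.normal(size=100)

observed, p_value = permutation_p_value(X, y)
print(observed, p_value)
```

This test makes no Gaussian assumption, but plain shuffling does assume the observations are exchangeable, which spatially auto-correlated data may not satisfy.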

thekingofkings / dissertation

How to derive confidence interval and p-value of coefficients from spatial regression? #1
