edeno / Jadhav-2016-Data-Analysis

Code for analysis on data from Jadhav et al. 2016
GNU General Public License v3.0

Handle perfect predictors in GLM fitting #66

Closed edeno closed 7 years ago

edeno commented 7 years ago

The SVD fails to converge in the GLM fitting code of the statsmodels package.

The problem is (from here):

Perfect or total multicollinearity occurs when a predictor of the design matrix is a linear function of one or more other predictors, i.e. when predictors are linearly dependent on each other. While in this case solutions for the GLM system of equations still exist, there is no unique solution for the beta values. From a mathematical perspective of the GLM, the square matrix X'X becomes singular, i.e. it loses (at least) one dimension, and is no longer invertible in case that X exhibits perfect multicollinearity.
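To make the quoted point concrete, here is a minimal sketch in plain Python. The design matrix is a made-up toy (not data from this repository) whose third column is the sum of the first two, and the helper names `gram_matrix` and `det3` are purely illustrative. It only shows that X'X ends up with a zero determinant in that situation.

```python
# Toy design matrix (hypothetical): the third column is exactly the sum of
# the first two, so the predictors are perfectly multicollinear.
X = [
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 2.0],
    [1.0, 2.0, 3.0],
    [1.0, 3.0, 4.0],
]


def gram_matrix(rows):
    """Return X'X for a design matrix given as a list of rows."""
    n_cols = len(rows[0])
    return [
        [sum(row[i] * row[j] for row in rows) for j in range(n_cols)]
        for i in range(n_cols)
    ]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


xtx = gram_matrix(X)
determinant = det3(xtx)
print("X'X =", xtx)
print("det(X'X) =", determinant)  # zero means X'X cannot be inverted
```

With real, noisy data the determinant is rarely exactly zero, so checking the rank or condition number of the design matrix is usually a more robust diagnostic than the determinant itself.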

The maximum likelihood estimate tries to push the coefficient toward either positive or negative infinity, but it will never reach infinity in the IRLS procedure. So when the SVD fails in this case, we can just set the coefficients for those predictors to zero and keep the other coefficients.
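A rough illustration of the coefficient running off toward infinity, assuming a single-coefficient logistic model on made-up, perfectly separated data (the actual model and data in this repository may differ). For the canonical logit link, Newton-Raphson updates coincide with IRLS, so the loop below stands in for the IRLS procedure. The names `sigmoid`, `newton_step`, and `history` are illustrative only.

```python
import math

# Perfectly separated toy data (hypothetical): y is 1 exactly when x > 0.
x = [-2.0, -1.0, 1.0, 2.0]
y = [0.0, 0.0, 1.0, 1.0]


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def newton_step(beta):
    """One Newton-Raphson (IRLS) update for a single logistic coefficient."""
    p = [sigmoid(beta * xi) for xi in x]
    # Gradient of the log-likelihood with respect to beta.
    gradient = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
    # Negative second derivative of the log-likelihood (Fisher information).
    curvature = sum(xi * xi * pi * (1.0 - pi) for xi, pi in zip(x, p))
    return beta + gradient / curvature


beta = 0.0
history = [beta]
for _ in range(8):  # kept short so the curvature does not underflow
    beta = newton_step(beta)
    history.append(beta)

for iteration, value in enumerate(history):
    print(f"iteration {iteration}: beta = {value:.3f}")
```

The coefficient grows at every iteration and never settles, because the likelihood keeps improving as beta increases. In a full fit, the weights shrink toward zero at the same time, which is one plausible route to the ill-conditioned matrices that make the SVD fail.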

The problem is that the statsmodels package we are currently using doesn't return the coefficients when the SVD fails to converge (unlike Matlab). So the solutions are:

A final consideration is that we may soon switch to the clusterless version of the ripple decoding, in which case this won't matter. Maybe this wouldn't be worth the effort of implementing.

Right now, I'm going to proceed with ignoring these neurons until the rest of the parts of the spectral analysis are completed.
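For reference, ignoring the affected neurons could look roughly like the sketch below. Everything here is hypothetical: `fit_all_neurons`, `FitDidNotConverge`, and `toy_fit` are stand-ins and not functions from this repository or from statsmodels. In the real code the exception to catch would presumably be `numpy.linalg.LinAlgError` (which `numpy.linalg.svd` raises when it does not converge), but that is an assumption about where the failure surfaces.

```python
class FitDidNotConverge(Exception):
    """Hypothetical stand-in for the error raised when the SVD fails."""


def fit_all_neurons(neuron_data, fit_neuron, fit_errors=(FitDidNotConverge,)):
    """Fit each neuron, skipping those whose fit raises one of `fit_errors`.

    Returns a dict of coefficients keyed by neuron id and a list of the
    neuron ids that were skipped.
    """
    coefficients = {}
    skipped = []
    for neuron_id, data in neuron_data.items():
        try:
            coefficients[neuron_id] = fit_neuron(data)
        except fit_errors:
            # Record the neuron so the omission is visible downstream.
            skipped.append(neuron_id)
    return coefficients, skipped


def toy_fit(data):
    """Fake fitting function: fails when flagged as a perfect predictor."""
    if data["perfect_predictor"]:
        raise FitDidNotConverge("SVD did not converge")
    return [0.0, 1.0]


toy_data = {
    "neuron_1": {"perfect_predictor": False},
    "neuron_2": {"perfect_predictor": True},
    "neuron_3": {"perfect_predictor": False},
}
coefficients, skipped = fit_all_neurons(toy_data, toy_fit)
print("fitted:", sorted(coefficients))
print("skipped:", skipped)
```

Returning the skipped ids alongside the coefficients makes it easy to report how many neurons were dropped, and to revisit them later if the fitting problem gets fixed.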

edeno commented 7 years ago

Using clusterless decoding so not an issue currently. Closing for now.