Open KensingtonSka opened 3 years ago
Does it work with our existing code if you omit the intercept column from the design matrix? IIUC, when including separate predictors for both levels of a binary categorical variable (here, "face" A vs B), it is customary (mandatory?) to remove the intercept term to compensate.
That does work, but it leaves all the parameters referenced to the intercept, which is really just moving the issue around. It's fine if you've standardised your data, because then your intercept is 0 anyway. But if, for whatever reason, you can't or don't want to standardise it, then you will lose information when you run the regression. Using the pseudoinverse skirts this issue. To quote the LIMO paper:
> Although the design matrices made up by LIMO EEG are almost always rank deficient (each condition is coded in one column of X), F or T tests are exact, that is, they give identical results to that obtained by applying a standard inverse to a full rank matrix.
So, should I take it that the reason for using `linalg.lstsq()` over the Moore-Penrose pseudoinverse is just a matter of preference?
I am inclined to switch to a `pinv` call instead of `lstsq`, following what nilearn does:
https://github.com/nilearn/nilearn/blob/main/nilearn/glm/regression.py#L117
It's centering the data that makes the intercept 0. I am not sure I follow. You don't want to center the data?
`pinv` allows you to work with rank-deficient design matrices, and apparently it's the default in standard GLM code, so I think we should switch to `pinv`.
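For concreteness, here is a minimal sketch (plain NumPy/SciPy on a made-up rank-deficient design with `face_a`, `face_b`, and intercept columns, not MNE code) of what the two calls being compared look like:

```python
import numpy as np
from scipy import linalg

# Toy rank-deficient design: intercept = face_a + face_b
face_a = np.repeat([1.0, 0.0], 3)
face_b = 1.0 - face_a
X = np.column_stack([face_a, face_b, np.ones(6)])
y = np.array([2.9, 3.1, 3.0, 5.9, 6.1, 6.0])

print(np.linalg.matrix_rank(X))  # 2 -> the 3-column design is rank deficient

# current MNE-style call: least squares via scipy
beta_lstsq = linalg.lstsq(X, y)[0]

# nilearn/LIMO-style call: Moore-Penrose pseudoinverse
beta_pinv = linalg.pinv(X) @ y

print(beta_lstsq)
print(beta_pinv)
```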
> That does work, but it leaves all the parameters referenced to the intercept, which is really just moving the issue around. It's fine if you've standardised your data, because then your intercept is 0 anyway. But if, for whatever reason, you can't or don't want to standardise it, then you will lose information when you run the regression.
I don't think this is correct. If you have this design matrix:
```
face_a  face_b  intercept
  0       1        1
  0       1        1
  0       1        1
  1       0        1
  1       0        1
  1       0        1
```
then you can only estimate 2 parameters, because the third will be a linear combination of the other two. If you discard `intercept` and estimate `face_a` and `face_b`, then what you get is not "referenced to the intercept" as you say (that happens when you include the intercept but drop either `face_a` or `face_b`, in which case the remaining `face_*` term will be an offset for that condition relative to the intercept). I don't think this has anything to do with whether the data are standardised, and I don't think any information is lost.
@agramfort
> You don't want to center the data?
I'm using the linear regression as part of a pipeline of sorts that I've been asked to make. It's not that I don't want to center the data, it's more that I don't want to make any assumptions about the data, so I'm opting to leave the centering to the user.
@drammock That's a very good point. I think I'm doing a bad job explaining 😕, so I'll try again more formally. Also, apologies for the PNGs; I'm not super savvy with markdown yet, so I opted to use LaTeX and just screenshot my equations.
So, we have a linear regression model $y = X\beta + \varepsilon$ where

$$X = \begin{bmatrix} x_{1,1} & x_{1,2} & x_{1,3} \\ \vdots & \vdots & \vdots \\ x_{n,1} & x_{n,2} & x_{n,3} \end{bmatrix}, \qquad \beta = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \qquad (1)$$
I want to know the values of $b_1$, $b_2$, and $b_3$ such that

$$y_i = b_1 x_{i,1} + b_2 x_{i,2} + b_3 x_{i,3} + \varepsilon_i \qquad (2)$$
However, this can't be solved by `linalg.lstsq()` because $X$ has no inverse. This is a consequence of $x_{i,3} = x_{i,1} + x_{i,2}$ (i.e. the intercept column is a linear combination of the other two). To avoid this, one can take the route where one removes a column from $X$, giving $y = X'\beta' + \varepsilon'$ where

$$X' = \begin{bmatrix} x'_{1,1} & x'_{1,2} \\ \vdots & \vdots \\ x'_{n,1} & x'_{n,2} \end{bmatrix}, \qquad \beta' = \begin{bmatrix} b'_1 \\ b'_2 \end{bmatrix} \qquad (3)$$
Here, $X'$ is full rank and can be used to obtain values for $b'_1$ and $b'_2$ via

$$y_i = b'_1 x'_{i,1} + b'_2 x'_{i,2} + \varepsilon'_i \qquad (4)$$
But I want $b_1$, $b_2$, and $b_3$, so I'll map equation 2 to our new space using $x_{i,3} = x'_{i,1} + x'_{i,2}$, $x_{i,1} = x'_{i,1}$, and $x_{i,2} = x'_{i,2}$. This yields:

$$y_i = (b_1 + b_3)\, x'_{i,1} + (b_2 + b_3)\, x'_{i,2} + \varepsilon_i \qquad (5)$$
Comparing this with equation 4, and making the assumption that both models fit the data equally well such that $\varepsilon = \varepsilon'$, we can infer that $b'_1 = b_1 + b_3$ and $b'_2 = b_2 + b_3$. This addition of $b_3$ to our fitting parameters is what I mean by "referenced to the intercept." To my understanding, it's not possible to retrieve $b_1$, $b_2$, and $b_3$ from $b'_1$ and $b'_2$, which is what I mean by information being lost. I think I was being too loose with my jargon 😀.
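A quick numerical check of the relationship above (a toy sketch with made-up values, not MNE code), comparing the reduced full-rank fit with a pseudoinverse fit of the full rank-deficient design:

```python
import numpy as np

face_a = np.repeat([1.0, 0.0], 3)
face_b = 1.0 - face_a
y = np.array([2.9, 3.1, 3.0, 5.9, 6.1, 6.0])

# Reduced, full-rank model: X' = [face_a, face_b]
b_prime = np.linalg.lstsq(np.column_stack([face_a, face_b]), y, rcond=None)[0]

# Full, rank-deficient model solved with the pseudoinverse:
# X = [face_a, face_b, intercept]; pinv picks the minimum-norm solution
X = np.column_stack([face_a, face_b, np.ones_like(y)])
b = np.linalg.pinv(X) @ y

# b'_1 == b_1 + b_3 and b'_2 == b_2 + b_3 (up to numerical precision)
print(b_prime)
print(b[0] + b[2], b[1] + b[2])
```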
OK, well, no objection to using `pinv` anyway, if nilearn does it and @agramfort is happy.
@JoseAlanis please see
Hello,

My colleagues and I were trying to recreate the results of the LIMO dataset using MNE. Currently there is a tutorial on this on the MNE website. However, the tutorial doesn't separately generate beta parameters for `face a` and `face b` as LIMO does in their example. Instead, the beta reflecting `face a - face b` is used in the MNE tutorial. This is because the `linear_regression()` function (here) used in the MNE implementation of LIMO computes the solution to the least-squares equation, y = XB + e, via scipy's linear algebra least-squares function `linalg.lstsq()`, which inverts X in order to compute B. This is different to LIMO's implementation, which uses a Moore-Penrose pseudoinverse to compute the beta parameters in order to avoid the singular-matrix issues which `linalg.lstsq()` cannot handle. At least, this is our understanding. You can see this if you try to produce beta parameters for `face a` and `face b` individually via (this part of the tutorial), resulting in:
To skirt this issue we created our own versions of the MNE functions `linear_regression()` and `_fit_lm()` (again, here), shown below:

Our question is this: is there a reason for using `linalg.lstsq()` over the Moore-Penrose pseudoinverse? If not, we're happy to make a pull request. But I'm not sure what would be the best implementation. I'd imagine something like adding an optional input to `linear_regression()` which chooses one method over the other.
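For readers of the thread, here is a minimal, self-contained sketch of the pinv-based idea being proposed (this is not MNE's actual `_fit_lm()` nor the replacement code referenced above; the function name and toy data are hypothetical):

```python
import numpy as np

def fit_glm_pinv(design_matrix, data):
    """Least-squares betas via the Moore-Penrose pseudoinverse.

    Works even when ``design_matrix`` is rank deficient, e.g. one column
    per condition plus an intercept column.
    """
    betas = np.linalg.pinv(design_matrix) @ data
    residuals = data - design_matrix @ betas
    return betas, residuals

# Toy usage: face_a, face_b and an intercept column (rank-deficient design)
face_a = np.repeat([1.0, 0.0], 3)
design = np.column_stack([face_a, 1.0 - face_a, np.ones(6)])
y = np.array([2.9, 3.1, 3.0, 5.9, 6.1, 6.0])

betas, residuals = fit_glm_pinv(design, y)
print(betas)  # one beta per column: face_a, face_b, intercept
```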