PrincetonUniversity / princeton-mvpa-toolbox

Automatically exported from code.google.com/p/princeton-mvpa-toolbox
GNU General Public License v3.0

BUG: test_logreg.m provides incorrect class probabilities? #7

Closed: mekman closed this issue 7 years ago

mekman commented 7 years ago

Hi,

I think the test_logreg.m function in core/learn/test_logreg.m provides incorrect probabilities for class membership.

The offending line (L40):

acts(c,:)=exp(w'*testpats)./(1+exp(w'*testpats));

However, I believe the probabilities should be calculated as:

p(c | x) = exp(w_c' * x) / sum_k exp(w_k' * x)

(i.e., the softmax over the decision values of all classes)

Here is a fix that calculates correct probabilities:

X = scratchpad.logreg.betas' * testpats; % distance to hyperplane / decision function

% output predictions go into "prob"
prob = zeros(size(testtargs));

% softmax: normalise each class's exponentiated decision value by the sum over classes
eX = exp(X);
for c = 1:nConds
    prob(c,:) = eX(c,:) ./ sum(eX, 1);
end
scratchpad.prob = prob;
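
As a quick sanity check (a hypothetical snippet, not part of the fix itself), each column of prob should now sum to 1:

% every test pattern's class probabilities should sum to 1 (up to rounding error)
assert(all(abs(sum(prob, 1) - 1) < 1e-10));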

I'm happy to submit a pull request, if you like.

Kind regards, Matthias

Wildcarde commented 7 years ago

Matthias, I don't seem to have gotten a notification about this issue; sorry about that. A pull request would be greatly appreciated. In the meantime, I'll see if there's anybody here who can verify what the expected behavior of this tool is versus what the proposed update would do, so we know it's doing the right thing. -Garrett

Wildcarde commented 7 years ago

Matthias, it's my understanding that you discussed this in detail with the project owner directly, and it was resolved that there should be some updates to the documentation but no alterations to the code, as the intentions of the two approaches are actually quite different. I'm going to close this issue out; if you would like to contribute any documentation changes, feel free to issue a pull request with the related updates for review. -Garrett

Wildcarde commented 7 years ago

Details of the decision from the PI for anybody that stumbles on this issue entry in the future:

After having had a little while to contemplate this, I think the issue is more that we haven't clearly described what the function is meant to do, as opposed to a bug per se.

Let's say that we have n classes. There are two separate ways that one could approach the logistic regression problem.

Approach #1: one could compute n separate binary logistic regressions, i.e., for each class, ask what the probability is that this class is present or absent. For this computation, the probabilities of "is class A present" and "is class A absent" should add to 1, but the probabilities of "is class A present", "is class B present", ..., "is class n present" need not add up to 1.

Approach #2: one could instead take a multinomial approach and compute, for each class, the probability that this class, rather than one of the other classes, is present. In this multinomial framing, the probabilities of "is class A present", "is class B present", ..., "is class n present" should add up to 1.
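
To make the contrast concrete, here is a minimal sketch (not from the toolbox itself; it assumes a hypothetical weight matrix W of size voxels-by-nConds and a testpats matrix of size voxels-by-timepoints, mirroring the variable names above):

% Approach #1: n independent binary logistic regressions.
% acts1(c,:) is the per-class sigmoid; columns need not sum to 1 across classes.
acts1 = 1 ./ (1 + exp(-(W' * testpats)));

% Approach #2: one multinomial (softmax) regression.
% Each column of acts2 sums to 1 across classes.
eX    = exp(W' * testpats);
acts2 = bsxfun(@rdivide, eX, sum(eX, 1));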

Based on your bug report, Matthias, I think you were expecting the code to implement Approach #2, when in fact it implements Approach #1.

We have deliberately opted for Approach #1 in our studies because we don't want to assume (at test) that the categories are mutually exclusive. E.g., in Jarrod's 2014 Nature Communications paper, where subjects are mentally juggling faces and scenes in working memory, we don't want to take the presence of scene information as necessarily being indicative of the absence of face information (i.e., we want to allow for the possibility that the two categories will be co-present in the participant's mind).

In practice, there are many reasons why it is difficult to get truly independent readouts of category information. For example, if the face and scene regressors are anti-correlated during training (as they usually are, to some extent), then this will induce some anti-correlation during classifier testing. However, the relevant point here is that the use of multinomial logistic regression will exacerbate this problem: in Approach #1, if the class-specific likelihoods are independent, then the class probabilities will be independent; in Approach #2, even if the class-specific likelihoods are independent, the class probabilities will be anti-correlated insofar as they have to sum to 1.
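
To illustrate that last point with made-up numbers (a hypothetical two-class sketch, not from the thread):

% Class A's decision values rise across two test patterns; class B's stay fixed.
zA = [1.0 2.0];
zB = [1.0 1.0];

% Approach #1 (independent sigmoids): P(B) is unaffected by A's evidence.
pB_binary  = 1 ./ (1 + exp(-zB));              % approx [0.73 0.73]

% Approach #2 (softmax over {A, B}): P(B) falls as A's evidence rises,
% because the two probabilities are forced to sum to 1.
pB_softmax = exp(zB) ./ (exp(zA) + exp(zB));   % approx [0.50 0.27]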

We have sought to indicate that we are using Approach #1 in our papers; e.g., here is a quote from the methods section of Jarrod Lewis-Peacock's Nature Communications paper:

"For the selected scans, one classifier was trained to distinguish between scans corresponding to the active retention of a face and other scans; another classifier was trained to distinguish between scans corresponding to the active retention of a scene and other scans; and a third classifier was trained to distinguish between periods of rest and other scans."

In summary: while I don't think this is a bug (i.e., the function works as we intended it to work), I agree that it is a legitimately confusing issue, and I am grateful to you for bringing it up. I certainly wouldn't want someone thinking that they were implementing Approach #2 when they were actually implementing Approach #1. We would welcome any documentation updates that clarify this point.