gchers closed this issue 6 years ago
Hey Giovanni, thanks for your input (and the patch)!
I agree that your implementation makes a lot more sense—mine was simply the laziest alternative I could think of at the time... :)
Regarding your third point, you are right that the current solution does not compute nonconformity scores in a leave-one-out fashion. It is not obvious to me which would be the most reasonable default behavior; on the one hand, computing scores using leave-one-out is strictly better in terms of informational efficiency (though this does depend on the underlying model—an unstable model, e.g., a decision tree, will benefit more than something like a support vector machine), but on the other hand the computational complexity suffers greatly, of course. I suppose that the current implementation is mostly suited to a very particular case: when the data set is too small for an ICP but too large for a leave-one-out TCP, which should already be handled reasonably well by a well-configured CCP/BCP/ACP...
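For concreteness, the leave-one-out scoring I have in mind would look roughly like the sketch below (a sketch only; `nc_score` is a placeholder callable, not nonconformist's actual API). The O(n) model fits per score computation are exactly where the complexity blows up:

```python
import numpy as np

def loo_scores(x, y, nc_score):
    """Leave-one-out nonconformity scores: each z_i is scored by a
    model trained on Z \\ {z_i}.  Requires n model fits, hence the
    computational cost mentioned above."""
    n = len(y)
    scores = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i          # exclude z_i from training
        # nc_score(train_x, train_y, test_x, test_y) -> scalar score
        scores[i] = nc_score(x[mask], y[mask], x[i], y[i])
    return scores
```

The current implementation effectively drops the `mask`, i.e., scores each example against a model trained on all n examples.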
So, yes, it probably makes more sense to compute the TCP nonconformity scores using leave-one-out by default; although, it might also be reasonable to be able to optionally select the current behavior through a constructor parameter. If you'd like to submit another patch addressing this, I'd be happy to take a look, otherwise I'll look into it as soon as I have the time.
As an aside, your suggestion regarding a `return_stats` type of behavior might still be beneficial, even if it is not used for TCP; the current implementation of CCP and BCP is slightly wonky (in the same way as the current TCP implementation, i.e., with regard to the smoothing parameter). It would probably make sense to consolidate the p-value computations into a single implementation (another thing I'll have to look into)—after all, as long as we know `n_gt` and `n_eq`, they're all effectively the same. A single implementation would also make extensions, e.g. interpolation, simpler to implement.
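To make the consolidation concrete, the shared routine could be as small as the sketch below. The signature and the explicit `denom` argument are assumptions on my part (not existing nonconformist API); the point is that TCP, ICP, CCP, and BCP would only differ in how they count `n_gt`/`n_eq` and what denominator they pass:

```python
import random

def p_value(n_gt, n_eq, denom, smoothing=True, rng=random):
    """Generic (smoothed) conformal p-value.

    n_gt  : number of nonconformity scores strictly greater than the
            test object's score
    n_eq  : number of scores equal to it (whether the test object
            itself is counted differs between variants, hence the
            explicit denominator)
    """
    tau = rng.random() if smoothing else 1.0   # ONE tie-breaking draw
    return (n_gt + tau * n_eq) / denom
```

Crucially, the single `tau` draw lives in exactly one place, which would rule out the double-randomization bug you describe by construction.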
Hi Henrik. Thanks (again) for the library!
I noticed negative p-values when using smoothed TCP. I am currently not able to provide a compact reproducible example of this, but I found the following errors in the implementation.
`TCPClassifier` calls `ICPClassifier.predict()` to compute p-values, but:

1) If `self.smoothing=True`, `self.base_icp` is also instantiated with `self.smoothing`. This is bad, because ICP multiplies `n_eq` by a random number in [0,1], but then TCP subtracts a new random number in [0,1]; this clearly does not produce the desired result. To fix this particular issue, I would suggest giving `ICPClassifier.predict()` a `return_stats` boolean option which, if True, returns `n_eq` and `n_gt` instead of the p-values, so that the caller can compute the p-values as it pleases. However, this is probably not the best solution (see below).

2) `ICPClassifier.predict()` computes p-values for all classes, which are then immediately discarded by `TCPClassifier.predict()`: https://github.com/donlnz/nonconformist/blob/9a7004e098d81b54b75e3c4808d704abafa4bd49/nonconformist/cp.py#L137 This is clearly wasteful computationally, although I don't think it causes any misbehaviour in the algorithm.

3) It seems to me that `TCPClassifier.predict()` appends `x[i,:]` to `train_x`, but then also makes a prediction including `x[i,:]`, effectively obtaining two nonconformity scores for that object. The actual algorithm should compute the nonconformity score for `z_i` as `A(Z \ z_i, z_i)` (i.e., excluding `z_i` from the training set). This should matter for small sample sizes.

IMHO, it would be better to implement TCP on its own, rather than calling `ICPClassifier.calibrate()` and `ICPClassifier.predict()`. I'll send you a PR with a fix suggestion.