quantombone / exemplarsvm

Ensemble of Exemplar-SVMs for Object Detection and Beyond
http://www.cs.cmu.edu/~tmalisie/projects/iccv11/index.html
MIT License

Calibration part in exemplar-svm #57

Open madhulikajain opened 12 years ago

madhulikajain commented 12 years ago

I am unable to understand how the boundary of a single exemplar classifier is shifted and scaled during calibration. Could you please elaborate?

Thanks

quantombone commented 12 years ago

There are two different types of calibration which can be used:

1) Per-exemplar calibration by fitting a sigmoid. This is a minor variant of "Platt's method" for converting SVM outputs to probabilities. The basic idea is to look at how an exemplar responds to in-class examples. A sigmoid function is fit to the detection scores of the exemplar when applied to a held-out set of negatives as well as all other in-class instances (which were not used in the SVM training). This sigmoid maps the raw SVM output scores to a number in the [0,1] range, and while it doesn't change the ordering induced by any single exemplar, it produces numbers which are more amenable to comparison. It is better to compare scores such as 0.1 and 0.4 produced by the sigmoid than the raw exemplar-SVM scores. Fitting a sigmoid to the responses amounts to finding the two parameters alpha and beta in 1/(1+exp(-alpha*(w'*x)+beta)), and can be interpreted as "shifting and scaling" the decision boundary of the Exemplar-SVM (a minimal fitting sketch follows after this list).

2) Learning an exemplar co-occurrence matrix. This is motivated by the observation that for test objects in a common "canonical" view, many different exemplars will fire. So instead of taking the maximum score, we should combine the scores of multiple exemplars. A #exemplars x #exemplars matrix is learned (by simply counting how often two different exemplars co-fire on the training set while both being correct) which tells the algorithm how to combine the scores of different exemplars (a counting sketch also follows below). While this improves object detection performance, it breaks the semantics of "single exemplar association", because now the detection is the result of multiple exemplars firing. If you want a better object category detector, then this matrix is useful, but if you want to perform label transfer, then Platt's calibration might be better.
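To make item 1 concrete, here is a minimal MATLAB sketch of the sigmoid fit. The variable names and toy data are illustrative, not from the exemplarsvm codebase; it simply minimizes the logistic loss over held-out scores with `fminsearch`.

```matlab
% Toy held-out data: raw w'*x detection scores and their labels
% (1 = in-class instance, 0 = negative). Illustrative values only.
scores = [-1.2; -0.8; -1.5; -0.3; -0.1];
labels = [ 0;    0;    0;    1;    1  ];

% Sigmoid 1/(1+exp(-alpha*s+beta)) and its negative log-likelihood.
sigmoid = @(p, s) 1 ./ (1 + exp(-p(1)*s + p(2)));
nll = @(p) -sum(labels .* log(sigmoid(p, scores) + eps) + ...
                (1 - labels) .* log(1 - sigmoid(p, scores) + eps));

% Fit [alpha; beta] by minimizing the logistic loss.
p = fminsearch(nll, [1; 0]);
alpha = p(1); beta = p(2);

calibrated = sigmoid(p, scores);   % raw scores mapped into [0,1]
```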
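And a minimal sketch of the co-occurrence count from item 2, again with made-up names: `fired(i,j)` records whether exemplar j fired on detection window i, and `correct` marks windows that actually overlap a ground-truth object.

```matlab
% Toy firing record: 3 detection windows x 3 exemplars.
fired   = logical([1 1 0; 1 0 1; 0 1 1]);
correct = logical([1; 1; 0]);      % which windows are true positives

good = double(fired(correct, :));  % keep only the correct detections
M = good' * good;                  % M(a,b) = # windows where exemplars
                                   % a and b co-fire and are both correct

% At test time, a detection from exemplar e can pool evidence from the
% exemplars that historically co-fired with it, e.g. weighted by M(e,:).
```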

madhulikajain commented 12 years ago

Thanks a lot

Madhulika


asundrajas commented 10 years ago

I hope I'm not too late to ask. Quantombone, you state that to calibrate an SVM, you have to fit a sigmoid function to the responses of the negatives and positives of that specific SVM. Did I understand correctly? Then you say you have to find two parameters, alpha and beta, in the function 1/(1+exp(-alpha*(w'*x)+beta)). So, to do that, I set up the following equations:

Let S_pos be the set of responses of the positive instances, and S_neg the set of responses of the negative instances.

For all x_p in S_pos: 1 = 1/(1+exp(-alpha*(w'*x_p)+beta))
For all x_n in S_neg: 0 = 1/(1+exp(-alpha*(w'*x_n)+beta))

Solve this over-determined system and get alpha and beta. Is this correct so far?

Once I have alpha and beta, how do I change w to shift the decision boundary?

quantombone commented 10 years ago

@asundrajas, you have the correct intuition for obtaining alpha and beta. To shift the decision boundary, just create a new classifier with weights w_new and offset b_new, such that applying the classifier to x is done as follows: f(x) = w_new'*x + b_new

So simply put, w_new = alpha*w, b_new = b.
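A tiny sketch of that step, with toy values standing in for the trained w, b and the fitted alpha, beta:

```matlab
% Toy values; in practice w, b come from the exemplar-SVM and
% alpha, beta from the sigmoid fit.
w = [0.5; -0.2]; b = -1.0;
alpha = 2.0; beta = 0.3;
x = [1.0; 0.5];                    % a feature vector (toy)

w_new = alpha * w;                 % scale the hyperplane normal
b_new = b;                         % offset kept, as stated above
score = w_new' * x + b_new;        % score under the shifted classifier
prob  = 1 / (1 + exp(-alpha*(w'*x) + beta));  % sigmoid-calibrated [0,1] score
```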

onzone commented 9 years ago

Sir, I am trying to implement the exemplar SVM and to modify it. I am stuck in the calibration portion. What I have done is: for each positive instance, train an exemplar using that positive as the positive, with negatives chosen by hard negative mining from all the negative instances. I used Cp = 0.5 and Cn = 0.01 as suggested by your paper. But when I check how well it performs on the validation set, the scores are mostly negative (even for the positive instances), due to training with so many negative examples. So after the calibration method that you suggested, I am getting an alpha and beta that increase my exemplar scores very much. As a result, on the test set it also classifies the negative ones as positive. Did I do something wrong?

For calibration, I first computed scores for the in-class as well as out-of-class instances and learned a sigmoid from them. I have also tried using only the in-class examples and learned a sigmoid based on the overlap score. But in both cases, the classifiers classify negative instances as positive, i.e., they give positive responses for negative instances as well. Can you please tell me why I am not getting the desired output?

Thanks and regards, Anjan Banerjee

quantombone commented 9 years ago

Hi Anjan,

Before calibration you should expect the scores of the classifier to be negative, even on the positives. What you are observing matches my intuition.

When testing, you shouldn't worry about a single detection's score in isolation. The only thing that's important is having the good detections score higher than the bad detections. In other words, the evaluation should be performed on a medium/large dataset, and a precision-recall curve should be computed.
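For what it's worth, a precision-recall curve over ranked detections can be computed along these lines; the variables here are made up for illustration (`is_tp` marks correct detections, `n_pos` is the number of ground-truth objects in the test set):

```matlab
% Toy ranked detections over a whole test set.
scores = [0.9; 0.8; 0.7; 0.6; 0.5];
is_tp  = logical([1; 1; 0; 1; 0]);
n_pos  = 4;                        % total ground-truth positives

[~, order] = sort(scores, 'descend');
tp = cumsum(is_tp(order));         % running true-positive count
fp = cumsum(~is_tp(order));        % running false-positive count
precision = tp ./ (tp + fp);
recall    = tp ./ n_pos;
plot(recall, precision);           % the precision-recall curve
```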

When using Platt's calibration method to adjust the Exemplar-SVMs, you'll need to make sure that the raw Exemplar-SVMs were trained long enough. During my PhD I observed some funky behavior when training on small datasets: if the number of negatives is too small, then the raw Exemplar-SVMs are just too weak.

I won't be able to go into any more specific details. Good luck!

onzone commented 9 years ago

Thanks a lot, Sir.

Anjan Banerjee