Wrong probability estimation when learning from a dataset with two samples

wyykak commented 4 years ago

from libsvm.svmutil import *
y, x = [1, -1], [[1,0,0], [-1,0,0]]
prob = svm_problem(y, x)
param = svm_parameter('-t 0 -b 1')
model = svm_train(prob, param)
p_label, p_acc, p_val = svm_predict(y, x, model, '-b 1')
print(p_val)

Output:

Accuracy = 0% (0/2) (classification)
[[0.3333333338127693, 0.6666666661872307], [0.6666666661872306, 0.3333333338127694]]

This happens with all the interfaces.

cjlin1 commented 4 years ago

Internally, libsvm runs 5-fold cross-validation to obtain the decision values used to fit the probability model. So if the number of training instances is too small, it may not be able to give proper probability estimates.
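
To make the point above concrete, here is a rough sketch of what a 5-fold split looks like on the two-sample dataset from the report. It does not call libsvm. The function name `fold_report` is made up for illustration, and the fold boundaries (`i*l//k` to `(i+1)*l//k` over a shuffled index list) are an assumption based on my reading of svm.cpp, not a documented interface.

```python
import random

def fold_report(labels, nr_fold=5, seed=0):
    """Describe each fold of a k-fold split: held-out indices and the
    class counts left in the training subset.

    Assumption: fold i holds out positions [i*l//k, (i+1)*l//k) of a
    shuffled index list. This is a sketch, not libsvm's actual code.
    """
    l = len(labels)
    perm = list(range(l))
    random.Random(seed).shuffle(perm)
    report = []
    for i in range(nr_fold):
        begin = i * l // nr_fold
        end = (i + 1) * l // nr_fold
        held_out = perm[begin:end]
        train = perm[:begin] + perm[end:]
        pos = sum(1 for j in train if labels[j] > 0)
        neg = len(train) - pos
        report.append((held_out, pos, neg))
    return report

# The dataset from the report: one positive and one negative sample.
for held_out, pos, neg in fold_report([1, -1]):
    print(f"held out {held_out}: training subset has "
          f"{pos} positive, {neg} negative")
```

With two samples, three of the five folds hold out nothing, and each of the other two holds out one sample while the training subset contains only the sample of the opposite class. No two-class model can be trained on such a subset, so the cross-validated decision values say nothing useful about the held-out sample. That seems consistent with the inverted probabilities in the output above.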

wyykak commented 4 years ago

This is very helpful! Thank you for your reply!

cjlin1 / libsvm

Wrong probability estimation when learning from a dataset with two samples #166
