RaviSoji / plda

Probabilistic Linear Discriminant Analysis & classification, written in Python.
https://ravisoji.com
Apache License 2.0

predict_proba (calculate probabilities of each class given a sample) #58

Closed by ashkanmradi 3 years ago

ashkanmradi commented 3 years ago

Hello, I need the probability of each class given a test sample, like what the .predict_proba() method provides in scikit-learn. I'd appreciate your help with this. Does something like .predict_proba() exist in your repository, or do I need to calculate the probabilities myself from the scores returned in the tuple from your .predict() method? Thanks

RaviSoji commented 3 years ago

Check out cell number 7 in the MNIST demo where I give an example of how to classify data points.

I think you are looking for the following method and keyword argument: Classifier.predict(data, normalize_logps=True).

Good luck! Ravi B. Sojitra

ashkanmradi commented 3 years ago

Actually, as I said in my first question, the method Classifier.predict(data, normalize_logps=True) returns the most probable class and the log_p scores in a tuple. I need probabilities for each class that sum to one for each sample, as in scikit-learn's LDA classifier. I hoped you could help me with that. Anyway, I think you didn't implement that part. Thanks for your time

RaviSoji commented 2 years ago

I think I did implement it? See lines 165-171 in classifier.py: https://github.com/RaviSoji/plda/blob/master/plda/classifier.py. If you set normalize_logps=True, it returns the normalized densities in log space, which reduces floating point error issues with larger datasets. Exponentiating those values gives per-class probabilities that sum to one for each sample. Returning them in log space was an intentional decision, so that the user can choose whether to exponentiate.

        if normalize_logps:
            norms = logsumexp(logpps_by_category, axis=-1)
            logps = logpps_by_category - norms[..., None]
        else:
            logps = logpps_by_category

        return logps, np.asarray(K)

Ravi B. Sojitra
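
For readers who want scikit-learn style probabilities, the normalization in the snippet above can be sketched in plain NumPy. This is only an illustration, not code from the plda repository: the helper name logps_to_probs and the example scores are hypothetical, and the log-sum-exp is written out by hand rather than calling the logsumexp used in the library. If you already have log probabilities obtained with normalize_logps=True, np.exp on them should be all that is needed.

```python
import numpy as np


def logps_to_probs(logpps_by_category):
    """Convert unnormalized per-class log scores to probabilities.

    Hypothetical helper (not part of plda). The last axis is assumed to
    index the classes, as in the snippet quoted above.
    """
    logpps = np.asarray(logpps_by_category, dtype=float)
    # Subtract the per-sample maximum so that exp() cannot overflow and
    # at least one term per sample is exactly exp(0) = 1.
    shifted = logpps - logpps.max(axis=-1, keepdims=True)
    # Log-sum-exp over the class axis: the log of the normalizing constant.
    norms = np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Normalized log probabilities, then back to ordinary probabilities.
    logps = shifted - norms
    return np.exp(logps)


# Made-up scores for 2 samples and 3 classes (not real plda output).
scores = np.array([[-1050.0, -1052.0, -1060.0],
                   [-3.0, -1.0, -2.0]])
probs = logps_to_probs(scores)        # each row sums to 1
predictions = probs.argmax(axis=-1)   # most probable class per sample
```

Staying in log space until the last step is what keeps the first sample usable: exponentiating scores near -1050 directly would underflow to zero for every class.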