scikit-learn-contrib / DESlib

A Python library for dynamic classifier and ensemble selection
BSD 3-Clause "New" or "Revised" License

For A posteriori, why is the probability divided by the total distance? #230

Closed jayahm closed 3 years ago

jayahm commented 3 years ago

Hi,

As per the code for A posteriori,

        competences_masked = np.ma.sum(masked_preprocessed,
                                       axis=1) / np.ma.sum(masked_dist, axis=1)

Why is the probability of correct classification divided by the distance only?

In the paper, I saw that it is divided by the total probability multiplied by the distance.

Menelau commented 3 years ago

Hello,

It is done in order to weight the influence of each data point x_j in the region of competence, so that the closer ones have more influence on the competence level estimation.

jayahm commented 3 years ago

I see. But in your paper, I saw that the denominator is the product of probability and distance.

In your code above, it is only distance.

Or am I missing some information somewhere?

Menelau commented 3 years ago

This confusion is just due to the notation used in the original paper by Giacinto et al. (and reused by the review paper), which is quite confusing. The denominator is supposed to keep only the distances of the examples that belong to the reference class (wl). Hence, we used a boolean mask for that in the code.
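To make the masking concrete, here is a minimal sketch of the computation for one query and one classifier. It is not the library's code: the toy values, the variable names, and the use of inverse distance as the weight are all assumptions made for illustration. Only the final masked division mirrors the snippet quoted in the question.

```python
import numpy as np

# Toy region of competence: 1 query, 4 neighbours, 1 classifier.
# All numbers below are invented for illustration.
distances = np.array([[0.5, 1.0, 2.0, 4.0]])       # distance from the query to each neighbour
proba_correct = np.array([[0.9, 0.6, 0.8, 0.3]])   # classifier support for each neighbour's true class
neighbour_labels = np.array([[1, 0, 1, 1]])        # true class of each neighbour
predicted_label = np.array([1])                    # class w_l predicted for the query

# Assumption: closer neighbours get larger weights via inverse distance.
weights = 1.0 / distances

# Boolean mask: True where the neighbour belongs to the predicted class w_l.
in_class = neighbour_labels == predicted_label[:, None]

# np.ma masks entries where mask=True, so pass the negation to hide
# the neighbours that do NOT belong to w_l.
masked_weighted = np.ma.MaskedArray(proba_correct * weights, mask=~in_class)
masked_weights = np.ma.MaskedArray(weights, mask=~in_class)

# Weighted average of the supports, restricted to neighbours of class w_l.
competence_masked = np.ma.sum(masked_weighted, axis=1) / np.ma.sum(masked_weights, axis=1)

# Rows where no neighbour belongs to w_l stay masked; fill them with 0.
competence = np.ma.filled(competence_masked, 0.0)
print(competence)
```

In this sketch the second neighbour (label 0) contributes to neither the numerator nor the denominator, so the result is a distance-weighted average of the supports over the neighbours of class 1 only.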

I suggest you check the explanation of this technique in the book Combining Pattern Classifiers by Ludmila Kuncheva, which also presents some examples of how the competence estimates should behave. The notation there is much clearer in my opinion. The steps and examples shown in the book are exactly the ones used for the implementation of this technique in the library.

jayahm commented 3 years ago

Yes, I double-checked the original paper before asking you.

I see.

Okay, thank you for the clarification.