amiratag / DataShapley

Data Shapley: Equitable Valuation of Data for Machine Learning
MIT License

Performance graph phenomenon #12

Closed by anishprasanna 4 years ago

anishprasanna commented 4 years ago

Hi,

I have a quick question about the results in my performance graph. I am a little confused as to why the accuracy increases so sharply once around 75% of the data has been removed. I saw a similar increase when using logistic regression as well. Any ideas?

[image: mygraph]

tabularML commented 4 years ago

Hi, how many classes are there, and how balanced are they?

anishprasanna commented 4 years ago

Hi,

2 classes, and it’s 75% / 25%.

tabularML commented 4 years ago

So what's happening is that all the data points from the minority class have been removed, so the trained model predicts everything as the majority class. With a 75% / 25% split, that means the classification accuracy jumps to 75%. Does that answer your question?
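To make the arithmetic concrete, here is a minimal sketch in plain Python. It is not code from the DataShapley repository: the helper `majority_class_accuracy` and the label lists are hypothetical, and it assumes the test set has the same 75% / 25% split as the training data.

```python
from collections import Counter


def majority_class_accuracy(train_labels, test_labels):
    """Accuracy of a model that always predicts the most common training label."""
    majority_label = Counter(train_labels).most_common(1)[0][0]
    correct = sum(1 for y in test_labels if y == majority_label)
    return correct / len(test_labels)


# Hypothetical test set with the 75% / 25% class split (label 0 is the majority).
test_labels = [0] * 75 + [1] * 25

# Hypothetical training set after every minority-class (label 1) point was removed.
train_labels_after_removal = [0] * 40

accuracy = majority_class_accuracy(train_labels_after_removal, test_labels)
print(accuracy)  # 0.75
```

A classifier trained only on label 0 can do no better than this constant prediction, so its accuracy matches the majority-class share of the test set.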

anishprasanna commented 4 years ago

Ok yes. Thanks!
