amiratag / DataShapley

Data Shapley: Equitable Valuation of Data for Machine Learning
MIT License

Request for implementation of data shapley for image processing #5

Closed: CHAITANYAI0 closed this issue 5 years ago

CHAITANYAI0 commented 5 years ago

Page 10 of the paper mentions an application of Data Shapley to veterinary data modelling using DeepTag (citation 25). Another application in the paper concerns fairness in gender detection from image data. Could you give me an overview of the text and image processing techniques used?

Also, does Data Shapley hold up for skewed data?

tabularML commented 5 years ago

Hi, for both data modalities, as mentioned in the paper, we feed the inputs through state-of-the-art pretrained classifiers and take the model's pre-logit layer representation. For example, for the skin classification experiment, we use an Inception-V3 model pretrained on ImageNet and pass the skin images through the network. We use Inception-V3's default preprocessing method (and do the same for the other experiments). I didn't understand your final question. Could you please elaborate on it?
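For readers who want something concrete, here is a minimal sketch of that kind of feature extraction. It assumes the Keras `InceptionV3` application from TensorFlow, which may not match the code used for the paper; the helper names `build_feature_extractor` and `extract_prelogit_features` are made up for illustration.

```python
import numpy as np
import tensorflow as tf


def build_feature_extractor(weights="imagenet"):
    """Inception-V3 without its classification head (assumed Keras setup).

    include_top=False drops the final logits layer, and pooling="avg"
    applies global average pooling, so each image maps to a 2048-d vector.
    """
    return tf.keras.applications.InceptionV3(
        weights=weights, include_top=False, pooling="avg"
    )


def extract_prelogit_features(model, images):
    """Return pooled features for images of shape (n, 299, 299, 3).

    `images` holds RGB pixel values in [0, 255]. The default Inception-V3
    preprocessing rescales them to [-1, 1].
    """
    x = np.asarray(images, dtype=np.float32)
    x = tf.keras.applications.inception_v3.preprocess_input(x)
    return model.predict(x, verbose=0)
```

The resulting feature matrix can then be fed to a small classifier (for example logistic regression), and Data Shapley values are computed for the training points of that classifier rather than for the full network.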

CHAITANYAI0 commented 5 years ago

Hi Amirata,

Thanks for the response. By skewed data, I mean imbalanced datasets, for example 200 0's and 10,000 1's in the target variable. Will the Shapley algorithm work as expected for such datasets?

tabularML commented 5 years ago

Hi, if by "working" we mean satisfying the equitability properties, then those properties are guaranteed by the theory (up to the approximation error).
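To make this concrete, here is a toy sketch of permutation-sampling Data Shapley on an imbalanced dataset. Everything in it is an assumption for illustration: the 1-D synthetic data, the nearest-centroid classifier, the score of 0.0 for an empty training set, and the function names. It is plain Monte Carlo sampling without the truncation used in TMC-Shapley, so it is not the repository's implementation.

```python
import random


def utility(train, test):
    """Test accuracy of a 1-D nearest-centroid classifier fit on `train`."""
    if not train:
        return 0.0  # assumed score of the empty training set
    sums, counts = {}, {}
    for x, y in train:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: sums[y] / counts[y] for y in sums}
    correct = 0
    for x, y in test:
        pred = min(centroids, key=lambda c: abs(x - centroids[c]))
        correct += pred == y
    return correct / len(test)


def monte_carlo_shapley(train, test, num_permutations=200, seed=0):
    """Average each point's marginal contribution over random permutations."""
    rng = random.Random(seed)
    n = len(train)
    values = [0.0] * n
    for _ in range(num_permutations):
        order = list(range(n))
        rng.shuffle(order)
        subset, prev = [], utility([], test)
        for i in order:
            subset.append(train[i])
            score = utility(subset, test)
            values[i] += score - prev  # marginal contribution of point i
            prev = score
    return [v / num_permutations for v in values]


def make_data(n_class0, n_class1, rng):
    """Synthetic 1-D points: class 0 centred at -1, class 1 centred at +1."""
    data = [(rng.gauss(-1.0, 1.0), 0) for _ in range(n_class0)]
    data += [(rng.gauss(1.0, 1.0), 1) for _ in range(n_class1)]
    return data


rng = random.Random(42)
train = make_data(4, 36, rng)   # imbalanced training set (toy scale)
test = make_data(50, 50, rng)   # balanced held-out set
values = monte_carlo_shapley(train, test)

# Efficiency: the values sum to V(full set) - V(empty set).
print(sum(values), utility(train, test) - utility([], test))
```

With permutation sampling the marginal contributions telescope within each permutation, so the efficiency property holds regardless of class balance. The individual values are still estimates, and their error shrinks as `num_permutations` grows.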