microsoft / EdgeML

This repository provides code for machine learning algorithms for edge devices developed at Microsoft Research India.

Question about F1 score for Cifar-10 #234

Closed Yuchong-Geng closed 3 years ago

Yuchong-Geng commented 3 years ago

Hi,

I have a question about using the F1 score as the metric for CIFAR-10 instead of AUC. When I train the model with the example parameters, the AUC rises to around 76%, but the F1 score stays around 20 across all epochs.

Can you please provide some insight into this problem?

I appreciate your time and help.

SachinG007 commented 3 years ago

Hi, this depends on the threshold you used when computing the F1 score. Can you please share more details so that we can help you out?

Thanks

Yuchong-Geng commented 3 years ago

Thanks for the response.

So I am trying to run the CIFAR experiment as follows:

python3 main_cifar.py --lamda 1 --radius 8 --lr 0.001 --gamma 1 --ascent_step_size 0.001 --batch_size 256 --epochs 100 --optim 0 --normal_class 0

And instead of using the default "AUC" metric, I am using the F1 score that is defined in the DROCCTrainer.

thresh = np.percentile(scores, 20)
y_pred = np.where(scores >= thresh, 1, 0)
prec, recall, test_metric, _ = precision_recall_fscore_support(
    labels, y_pred, average="binary")

SachinG007 commented 3 years ago

Hi, we kept this threshold for the F1 score mainly for the tabular experiments, since previous work in that domain reports F1 at a specific threshold. For images, we suggest looking at AUC only: it does not depend on one specific threshold but summarizes performance across all thresholds. This is also in line with previous work on images.
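To make the threshold point concrete, here is a rough sketch on synthetic data. It assumes the positive label (1) covers 10% of the test set, as it would if the normal class were labelled 1 in a one-vs-rest CIFAR-10 split; that label convention, the data, and the helper names `f1_at_percentile` and `roc_auc` are assumptions for illustration, not taken from DROCCTrainer. Under that assumption, a 20th-percentile threshold marks 80% of the samples as positive, so even a scorer that ranks every sample correctly cannot exceed an F1 of about 0.22, while its AUC is 1.0.

```python
import numpy as np


def f1_at_percentile(labels, scores, percentile):
    """F1 for label 1 when predicting 1 for scores >= the given percentile."""
    thresh = np.percentile(scores, percentile)
    y_pred = (scores >= thresh).astype(int)
    tp = int(np.sum((y_pred == 1) & (labels == 1)))
    fp = int(np.sum((y_pred == 1) & (labels == 0)))
    fn = int(np.sum((y_pred == 0) & (labels == 1)))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(labels, scores):
    """AUC via the Mann-Whitney U statistic (assumes no tied scores)."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    u = ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Hypothetical test set: 1000 positives and 9000 negatives (assumed 10% positive rate).
labels = np.concatenate([np.ones(1000, dtype=int), np.zeros(9000, dtype=int)])

# Idealized scorer: every positive is scored above every negative.
perfect_scores = np.concatenate(
    [np.linspace(2.0, 3.0, 1000), np.linspace(0.0, 1.0, 9000)]
)

auc = roc_auc(labels, perfect_scores)                  # threshold-free ranking quality
f1_p20 = f1_at_percentile(labels, perfect_scores, 20)  # 80% of samples predicted positive
f1_p90 = f1_at_percentile(labels, perfect_scores, 90)  # 10% of samples predicted positive

print(f"AUC={auc:.3f}  F1@p20={f1_p20:.3f}  F1@p90={f1_p90:.3f}")
```

In this toy setup the F1 score only becomes informative once the percentile roughly matches the class balance (here the 90th), which is one reason a fixed threshold chosen for tabular benchmarks may not transfer to an image split with a different positive rate.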

Thanks

SachinG007 commented 3 years ago

Hi @Yuchong-Geng,

Can I close the issue now?

Yuchong-Geng commented 3 years ago

Sure. Thanks for the help!