Closed xuanfei1000 closed 1 year ago
Hi, that result comes from an ensemble of five different models, evaluated on a test set that is hidden on the CheXpert server and has not been released to the public. The result you have is from the validation data, so the two scores are not directly comparable.
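For readers wondering what "ensemble" can mean here: the thread does not say how the five models were combined, so the following is only a minimal sketch of one common approach, averaging the per-class probabilities of several models and then computing the mean AUROC over classes. The toy labels, the two hypothetical models (`model_a`, `model_b`), and all function names are made up for illustration. AUROC is computed in plain Python so the snippet has no dependencies; in practice `sklearn.metrics.roc_auc_score` is the usual choice.

```python
from statistics import mean


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ensemble_average(model_preds):
    """Average probabilities over models; model_preds[m][i][c] is model m, sample i, class c."""
    n_models = len(model_preds)
    n_samples = len(model_preds[0])
    n_classes = len(model_preds[0][0])
    return [
        [sum(model_preds[m][i][c] for m in range(n_models)) / n_models
         for c in range(n_classes)]
        for i in range(n_samples)
    ]


def mean_auroc(labels, preds):
    """Return (mean AUROC over classes, list of per-class AUROCs)."""
    n_classes = len(labels[0])
    per_class = [
        auroc([row[c] for row in labels], [row[c] for row in preds])
        for c in range(n_classes)
    ]
    return mean(per_class), per_class


# Toy data (assumption): 4 samples, 2 classes, 2 hypothetical models.
labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
model_a = [[0.9, 0.2], [0.4, 0.7], [0.3, 0.8], [0.1, 0.6]]
model_b = [[0.6, 0.5], [0.2, 0.9], [0.7, 0.4], [0.5, 0.3]]

ensemble = ensemble_average([model_a, model_b])

auc_a, _ = mean_auroc(labels, model_a)
auc_b, _ = mean_auroc(labels, model_b)
auc_ens, per_class_ens = mean_auroc(labels, ensemble)

print("model_a mean AUROC:", auc_a)
print("model_b mean AUROC:", auc_b)
print("ensemble mean AUROC:", auc_ens)
```

In this toy case each model makes a different ranking mistake, so the averaged predictions score higher than either model alone. That is the usual motivation for ensembling, though the gain on real data depends on how diverse the models are. Note also that a number computed this way on validation data is still not comparable to a hidden-test-set score.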
Could you please describe how the five models were ensembled?
Could you share some resources on how the ensemble was done? What was its AUC score on the validation dataset?
I ran 05_Optimizing_AUROC_Loss_with_DenseNet121_on_CheXpert.py to test the 5 selected classes, but the mean AUC I get is 90.54. Your method (DeepAUC-v1) is reported on the official CheXpert website with an AUC of 93.05, but I can't reproduce this result. Why is that, and how did you get the 93.05 AUC?