Regarding interpretation of F1 and AUC metrics

anirbala98 commented 7 months ago

Hi, I have a general question about the interpretation of evaluation metrics for IML problems. Are the pixel-wise metrics reported in IML papers calculated only for the 'tampered' class? The IML-ViT paper reports a pixel-wise F1 score of 0.836 on the Columbia dataset. Does this mean that the F1 score for the 'tampered' class of pixels is 0.836?

SunnyHaze commented 7 months ago

Hi! Thanks for your interest! We compute the pixel-level F1 score as follows:

1. We regard the manipulated region as 'positive' (white) and the authentic region as 'negative' (black).
2. Since the F1 score for an authentic image is meaningless (see this comment: https://github.com/SunnyHaze/IML-ViT/issues/4#issuecomment-1863998642), we compute the F1 score only over the manipulated images of each dataset.
3. We compute the pixel-level F1 score for each image separately and take the average of these per-image scores as the final F1 score for the dataset.

You can find the detailed implementation here: https://github.com/SunnyHaze/IML-ViT/blob/3ffd03db8b95824ce0b67c55ee1628ec106a6666/engine_train.py#L113
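As a rough illustration of the three points above (this is not the repository's code; the linked engine_train.py is the reference), here is a minimal NumPy sketch. The names pixel_f1 and dataset_f1 are hypothetical, and the sketch assumes the predicted masks have already been binarized, which the thread does not specify.

```python
import numpy as np

def pixel_f1(pred_mask, gt_mask, eps=1e-8):
    """Pixel-level F1 for one image; manipulated pixels are the positive class."""
    pred = np.asarray(pred_mask).astype(bool)
    gt = np.asarray(gt_mask).astype(bool)
    tp = np.logical_and(pred, gt).sum()    # manipulated pixels predicted as manipulated
    fp = np.logical_and(pred, ~gt).sum()   # authentic pixels predicted as manipulated
    fn = np.logical_and(~pred, gt).sum()   # manipulated pixels that were missed
    # eps only guards against division by zero (assumption, not from the thread)
    return 2 * tp / (2 * tp + fp + fn + eps)

def dataset_f1(pred_masks, gt_masks):
    """Average the per-image F1 over manipulated images only."""
    scores = [
        pixel_f1(pred, gt)
        for pred, gt in zip(pred_masks, gt_masks)
        if np.asarray(gt).any()            # skip authentic images (no positive pixels)
    ]
    return float(np.mean(scores)) if scores else float("nan")

# Toy data: two manipulated images and one authentic image (skipped).
gts = [np.array([[1, 0], [0, 0]]), np.array([[1, 1], [0, 0]]), np.zeros((2, 2))]
preds = [np.array([[1, 1], [0, 0]]), np.array([[1, 1], [0, 0]]), np.zeros((2, 2))]
print(dataset_f1(preds, gts))
```

Averaging per image, rather than pooling all pixels of the dataset into one confusion matrix, gives every manipulated image the same weight regardless of its resolution.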

Hope this solves your problem. If you have further questions, please let me know.

anirbala98 commented 7 months ago

Got it, thanks.

anirbala98 commented 7 months ago

Hi, thanks once again for clearing up my question about the F1 score. May I know how you compute the AUC metric? I was not able to find it in the code.

SunnyHaze commented 7 months ago

Honestly, the AUC script was written by one of my collaborators, who is busy with his next project. For now I can give you a draft example function that implements the AUC metric. We use the roc_auc_score function from sklearn to compute AUC. The AUC script may be officially released later, once we have had time to clean up our code.

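import matplotlib.pyplot as plt
from sklearn.metrics import roc_auc_score, roc_curve

def cal_precise_AUC_with_shape(predict, target, shape):
    # Crop away the zero-padded region so only real image pixels are scored
    # (batch index 0, channel 0; shape[0] gives the unpadded height and width).
    predict2 = predict[0][0][:shape[0][0], :shape[0][1]]
    target2 = target[0][0][:shape[0][0], :shape[0][1]]
    # Flatten to 1-D NumPy arrays, as sklearn expects
    predict3 = predict2.reshape(-1).detach().cpu().numpy()
    target3 = target2.reshape(-1).detach().cpu().numpy()
    # ----- visualize ROC curve -----
    fpr, tpr, thresholds = roc_curve(target3, predict3, pos_label=1)
    plt.figure()  # new figure so curves from earlier calls do not pile up
    plt.plot(fpr, tpr)
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.05])
    plt.savefig("./appro2.png")
    plt.close()
    # -------------------------------
    AUC = roc_auc_score(target3, predict3)
    return AUC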

Hope this solves your issue. If you have further questions, please let me know. If you like our project, you can star it to encourage us.

anirbala98 commented 7 months ago

Thank you for the reply. I will take a look at the sklearn implementation. I have given a star as well.

SunnyHaze commented 7 months ago

Thank you very much! Note that with our IML-ViT implementation you must crop the zero-padded region before calculating the AUC. That is what the shape argument in the example function I gave is for.

    # these two lines crop the tensors to the actual image region (dropping the zero padding)
    predict2 = predict[0][0][:shape[0][0], :shape[0][1]]
    target2 = target[0][0][:shape[0][0], :shape[0][1]]

I mention this because the author of a previous issue made this mistake and got an extremely low AUC score. So please be careful.
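To make the effect of the crop concrete, here is a small NumPy sketch. It is only an illustration under stated assumptions: crop_to_shape is a hypothetical helper, the 4x4 canvas and the scores are contrived, and the padded area is filled with high scores purely to mimic a model that outputs arbitrary values there. What IML-ViT actually predicts in the padded region is not stated in this thread.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def crop_to_shape(padded, height, width):
    """Keep only the top-left height x width region, dropping the padding."""
    return padded[:height, :width]

# Hypothetical 2x2 image placed in the top-left corner of a 4x4 padded canvas.
height, width = 2, 2
target = np.zeros((4, 4), dtype=int)
target[:2, :2] = [[1, 0], [0, 0]]            # one manipulated pixel

# Contrived scores: padded pixels get 1.0 (assumption for illustration only).
predict = np.ones((4, 4), dtype=float)
predict[:2, :2] = [[0.9, 0.2], [0.1, 0.3]]

# AUC over the whole padded canvas versus over the real image region only.
auc_padded = roc_auc_score(target.reshape(-1), predict.reshape(-1))
auc_cropped = roc_auc_score(
    crop_to_shape(target, height, width).reshape(-1),
    crop_to_shape(predict, height, width).reshape(-1),
)
print(auc_padded, auc_cropped)
```

In this toy case the uncropped AUC comes out at about 0.2, while the cropped AUC is 1.0, because the padded pixels count as authentic pixels that outscore the single manipulated one.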

anirbala98 commented 7 months ago

Noted. Thanks for the information.

SunnyHaze / IML-ViT

Regarding interpretation of F1 and AUC metrics #8