IItaly opened this issue 3 years ago
It seems that the metric is calculated right after taking the mean value of the scores, so I have a question about this.
Hey @IItaly ,
if you are referring to the Analyze results notebook you are right: in the compute_metrics function, before computing the ROC curve, we should apply a sigmoid to normalize the scores.
We do this in the Analyze results net fusion notebook, where we report the results presented in the paper.
If you look at cell 23, we compute the ROC curve after applying a sigmoid (the expit function) to normalize the scores between 0 and 1.
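For reference, a minimal sketch of that step (hypothetical variable names; assuming a DataFrame with the raw scores in a 'score' column and binary labels in 'label', not the exact notebook cell):
from scipy.special import expit
from sklearn.metrics import roc_curve, roc_auc_score

# normalize the raw network outputs to [0, 1] with a sigmoid
norm_score = expit(df['score'])

# ROC curve and AUC computed on the normalized scores
fpr, tpr, thr = roc_curve(df['label'], norm_score)
auc = roc_auc_score(df['label'], norm_score)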
Thank you for pointing this out, I'll fix this in the next commit :)
Bests,
Edoardo
P.s. In any case, please be aware that it is not strictly necessary to have the scores normalized between 0 and 1, as the ROC curve from sklearn will automatically find the appropriate thresholds independently of the numeric range of the scores. It is fairer though, as this is the way we presented the results in the paper, so we will fix this as soon as possible :)
Thank you for your reply. Maybe that's why I got a higher score than that reported in your paper?
It seems that when calculating the video-level score, the average value is taken here; should the score be normalized between 0 and 1? Is this calculation of the AUC inaccurate?
df_videos = df_frames[['video', 'label', 'score']].groupby('video').mean()
df_videos['label'] = df_videos['label'].astype(np.bool)
results_video_list.append(compute_metrics(df_videos, train_model_tag))
Is it right to put it between 0 and 1?
Looking forward to your reply
Hey @IItaly ,
Thank you for your reply. Maybe that's why I got a higher score than that reported in your paper?
It might be, did you manage to re-run the pipeline? In #47 we found out that you used a higher number of iterations with respect to those used in the paper, right? Unfortunately, on our side we had a very busy period in the lab and we didn't find the time to re-run the experiments, sorry :(
It seems that when calculating the video-level score, the average value is taken here; should the score be normalized between 0 and 1? Is this calculation of the AUC inaccurate?
df_videos = df_frames[['video', 'label', 'score']].groupby('video').mean()
df_videos['label'] = df_videos['label'].astype(np.bool)
results_video_list.append(compute_metrics(df_videos, train_model_tag))
We compute the average of the non-sigmoid scores over all frames. That will be our raw score for the video, which is then normalized between 0 and 1 for computing the ROC curve. Is that what you were asking?
Is it right to put it between 0 and 1?
Instead of looking for where the score is greater or smaller than 0, you can directly compute the normalized score with something along this line: df_videos['norm_score'] = df_videos['score'].apply(expit). I'm not sure about the syntax; anyway, I suggest using the apply or map function of Pandas.
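For example, a minimal sketch along those lines (assuming df_videos holds the per-video mean of the raw scores, as in the snippet you quoted):
from scipy.special import expit

# element-wise sigmoid through pandas apply ...
df_videos['norm_score'] = df_videos['score'].apply(expit)

# ... or, equivalently, vectorized over the whole column
df_videos['norm_score'] = expit(df_videos['score'])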
Hope these answers clear things up!
Bests,
Edoardo
The score is still higher with 20000 iterations, so I guess maybe that is the reason.
We compute the average of the non-sigmoid scores over all frames. That will be our raw score for the video, which is then normalized between 0 and 1 for computing the ROC curve. Is that what you were asking?
Yes. I want to get the correct AUC value.
Instead of looking for where the score is greater or smaller than 0, you can directly compute the normalized score with something along this line: df_videos['norm_score'] = df_videos['score'].apply(expit). I'm not sure about the syntax; anyway, I suggest using the apply or map function of Pandas. Hope these answers clear things up!
I did it like in this picture, but the final value didn't change.
Yes. I want to get the correct AUC value.
Then yes, you should average the score over all frames for each video, and then normalize it between 0 and 1 with the expit function (or any sigmoid function you prefer). If you look at cell 23 of the notebook I showed you in the previous comment, you could compute the AUC in a way like this:
df = df.groupby('video')
df = df.mean()
results_df['loss'] = log_loss(df['label'], expit(np.mean(df['score'], axis=1)))
results_df['auc'] = M.roc_auc_score(df['label'], expit(np.mean(df['score'], axis=1)))
I did it like in this picture, but the final value didn't change.
I'm sorry, I'm not sure I understood: computing the score the way you did in the picture, did you obtain a similar value to the one you had without normalizing?
I'm sorry, I'm not sure I understood: computing the score the way you did in the picture, did you obtain a similar value to the one you had without normalizing?
Yes, you're right. I did as shown in the picture, but the final value didn't change. I think I can try it your way.
Sorry, I don't quite understand the role of np.mean() here, but it seems that it can't be used, because it shows:
ValueError: No axis named 1 for object type Series
Hey @IItaly ,
you're right, you shouldn't need it.
Just try doing expit(df['score']), as you should already have the mean of the scores for each video thanks to the groupby above.
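Putting the pieces together, a minimal sketch of the corrected computation (column and variable names just follow the snippets above, so take them as illustrative):
from scipy.special import expit
from sklearn.metrics import log_loss, roc_auc_score

# average the per-frame raw scores for each video
df_videos = df_frames[['video', 'label', 'score']].groupby('video').mean()
df_videos['label'] = df_videos['label'].astype(bool)

# normalize the per-video score with a sigmoid, then compute loss and AUC
norm_score = expit(df_videos['score'])
loss = log_loss(df_videos['label'], norm_score)
auc = roc_auc_score(df_videos['label'], norm_score)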
The result without using expit() is the same as with it. Maybe there's no need to map the score between 0 and 1? Thank you for your timely reply, which has solved many of my problems.
Hey, I want to know: what is accbal?
Hey @IItaly ,
sorry for the late reply. I'll try to address each question separately.
The result without using expit() is the same as with it. Maybe there's no need to map the score between 0 and 1?
From a high-level perspective, the results should not be too dissimilar, as the sigmoid function simply maps the scores onto a 0-1 scale; if the network behaves well, we should still see a clear distinction between FAKE and REAL raw scores (i.e., not normalized with the sigmoid). Also, it's OK if with 20000 iterations your results differ from ours, as training a network is not a deterministic process, so repeating the training with the same configuration might still give slightly different results. Could you report the values you found with and without the sigmoid normalization at 20000 iterations? I am curious to see how different they are from ours.
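As a quick sanity check with made-up numbers (not data from the paper): the AUC itself cannot change under the sigmoid, because expit is monotonically increasing and the ROC curve only depends on the ranking of the scores.
import numpy as np
from scipy.special import expit
from sklearn.metrics import roc_auc_score

labels = np.array([0, 0, 1, 1])          # toy ground truth
raw = np.array([-2.3, -0.4, 0.7, 3.1])   # toy raw scores (logits)

print(roc_auc_score(labels, raw))         # 1.0
print(roc_auc_score(labels, expit(raw)))  # 1.0, identical: expit preserves the ranking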
Hey, I want to know: what is accbal?
It's the balanced accuracy from scikit-learn, you can find the explanation here: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.balanced_accuracy_score.html
Bests,
Edoardo
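P.s. for reference, a minimal sketch of how a balanced accuracy could be computed from the scores; the 0.5 threshold on the sigmoid output is only an assumption for illustration, not necessarily what the notebook does:
from scipy.special import expit
from sklearn.metrics import balanced_accuracy_score

# turn the normalized scores into hard predictions with a 0.5 threshold
pred = expit(df_videos['score']) > 0.5
acc_bal = balanced_accuracy_score(df_videos['label'], pred)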
Thank you very much. I'll sort out my results for you to compare~
Are you also doing research on deepfakes?
Hi @CrohnEngineer, I found that the way the AUC value is calculated is a bit strange. The raw score is used here without any transformation, and some of the values are greater than 1. Is this the right way to use this function?
rocauc = M.roc_auc_score(df_res['label'], df_res['score'])