A comparison study has been carried out to evaluate the ability of thresholds based on QoE metrics to serve as binary classifiers of tampered videos. SSIM and PSNR values have been computed for a subset of the current dataset (a sample of 10K assets). In its current state the verifier uses a One-Class Support Vector Machine (OCSVM) trained on five features extracted from the frames of the videos. The accuracy of the OCSVM is studied in depth in another Jupyter notebook in the repository; to evaluate the same accuracy metrics and enable comparisons, another notebook has been created.

For the sake of illustration, lower boundaries have been chosen arbitrarily: 0.9 for SSIM and 35 dB for PSNR. To linearize the PSNR values, the Mean Squared Error (MSE) has been solved for from the PSNR equation, `PSNR = 10 * log10(MAX_I^2 / MSE)`, which gives `MSE = MAX_I^2 / 10^(PSNR / 10)`. MSE values above 15 have been chosen as the limit for classifying a video as tampered.
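As a rough sketch of how these thresholds act as binary classifiers, assuming 8-bit frames (so `MAX_I = 255`); the function and constant names below are illustrative, not taken from the notebooks:

```python
MAX_I = 255.0  # assumed 8-bit video frames

def mse_from_psnr(psnr):
    """Invert PSNR = 10 * log10(MAX_I**2 / MSE) to recover MSE."""
    return MAX_I ** 2 / 10 ** (psnr / 10.0)

# Illustrative thresholds from the study above.
SSIM_MIN, PSNR_MIN, MSE_MAX = 0.9, 35.0, 15.0

def is_tampered(ssim, psnr):
    """Flag an asset as tampered according to each metric's threshold."""
    return {
        "ssim": ssim < SSIM_MIN,
        "psnr": psnr < PSNR_MIN,
        "mse": mse_from_psnr(psnr) > MSE_MAX,
    }
```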
The table below summarizes the results obtained with each method using the aforementioned thresholds:
Model | f20 | Area (AUC) | TNR | TPR (train) | TPR (test) |
---|---|---|---|---|---|
SSIM | 0.689931 | 0.667398 | 0.613865 | 0.698690 | 0.720930 |
PSNR | 0.600417 | 0.654298 | 0.686503 | 0.655022 | 0.622093 |
MSE | 0.481304 | 0.628504 | 0.762822 | 0.509461 | 0.494186 |
OCSVM | 0.982801 | 0.925951 | 0.851902 | 0.991266 | 1.000000 |
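For reference, a minimal sketch of how figures like these could be computed with scikit-learn, assuming `y_true` marks tampered assets as the positive class, `y_pred` is the binary output of a thresholded metric or the OCSVM, and `y_score` is a continuous score; interpreting `f20` as an F-beta score with beta = 20 is an assumption:

```python
from sklearn.metrics import confusion_matrix, fbeta_score, roc_auc_score

def summarize(y_true, y_pred, y_score):
    # y_true: 1 = tampered, 0 = valid (assumed labeling).
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {
        "f20": fbeta_score(y_true, y_pred, beta=20),  # assumed meaning of "f20"
        "area": roc_auc_score(y_true, y_score),       # assumed ROC area (AUC)
        "tnr": tn / (tn + fp),
        "tpr": tp / (tp + fn),
    }
```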
The charts below compare these QoE metrics against the value assigned to each asset by the OCSVM's decision function. Valid assets are pictured in green, whereas tampered videos are shown in red:
Regardless of where the threshold line is placed on the vertical axis, for all three QoE metrics (SSIM, PSNR, MSE) a large number of valid and tampered assets are indistinguishable from each other.
However, the horizontal axis marks a very well-defined boundary: 98.9% of the green assets lie to the right of the origin (decision function = 0), and the tampered assets on that side correspond to a false positive rate of 14.71%. Remarkably, there is another strong lower boundary on the negative side of the x axis, beyond which attacks are assigned very similar values. This lower boundary emerges from the particular characteristics of the dataset's features, and is variable.
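A minimal sketch of how these two figures could be computed, assuming a hypothetical DataFrame `df` with a float `decision` column (OCSVM decision-function values) and a boolean `tampered` column:

```python
import pandas as pd

def boundary_stats(df: pd.DataFrame):
    """Share of assets on the positive side of the decision boundary."""
    valid_right = (df.loc[~df["tampered"], "decision"] > 0).mean()     # ~98.9%
    tampered_right = (df.loc[df["tampered"], "decision"] > 0).mean()   # ~14.71%
    return valid_right, tampered_right
```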
> Creating such a chart would visualize the relationship (or lack thereof) between tamper scores produced by the verifier and QoE metrics
Hopefully this makes it more evident that QoE metrics cannot be safely relied upon for tamper classification, and that there is no evident correlation between them and the output values of the OCSVM.
It could be helpful to create a chart that compares tamper scores from the verifier (the negative distance from the decision boundary) with QoE metric values (e.g. SSIM, PSNR, VMAF) for tampered videos. IIUC we cannot rely on QoE metric values to indicate tampered videos (because most QoE metrics assume well-encoded, non-tampered videos) - the values might not be that low for tampered videos. An example is a video with a giant watermark - the verifier could produce a very negative score indicating tampering/distortion, while the SSIM value might not reflect the poor quality.
Creating such a chart would visualize the relationship (or lack thereof) between tamper scores produced by the verifier and QoE metrics
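One possible sketch of such a chart with matplotlib; the DataFrame columns below are assumptions, matching the hypothetical `df` used earlier:

```python
import matplotlib.pyplot as plt

def plot_metric_vs_score(df, metric="ssim"):
    """Scatter a QoE metric against the OCSVM decision-function value."""
    colors = df["tampered"].map({False: "green", True: "red"})
    plt.scatter(df["decision"], df[metric], c=colors, s=5, alpha=0.5)
    plt.axvline(0, linestyle="--", color="gray")  # decision boundary
    plt.xlabel("OCSVM decision function")
    plt.ylabel(metric.upper())
    plt.title(f"{metric.upper()} vs. verifier decision function")
    plt.show()
```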