Closed yondonfu closed 4 years ago
On the matter of this topic: there was a metric we designed that consisted of computing the color histograms of both the reference and the rendition. The chi-squared distance between the two histograms was computed per pair of frames and then averaged over time, like the rest of the metrics.
I guess it did not make it to production because it did not add much accuracy against the attacks we simulated. I can provide more info/code if necessary. I hope this is useful.
@Sorkanius Ah I think I remember you looking at that metric. If you still have access to some of the analysis or code from that work that you could share that would be much appreciated.
Here is the function in python:
import cv2
import numpy as np


def histogram_distance(ref_fr, asset_fr, eps=1e-15):
    # 3D color histogram with 8 bins per channel
    bins = [8, 8, 8]
    hist_a = cv2.calcHist([ref_fr], [0, 1, 2],
                          None, bins, [0, 256, 0, 256, 0, 256])
    hist_a = cv2.normalize(hist_a, hist_a)
    hist_b = cv2.calcHist([asset_fr], [0, 1, 2],
                          None, bins, [0, 256, 0, 256, 0, 256])
    hist_b = cv2.normalize(hist_b, hist_b)
    hist_a = hist_a.flatten()
    hist_b = hist_b.flatten()
    # Chi-squared distance between the two flattened histograms
    return 0.5 * np.sum(((hist_a - hist_b) ** 2) / (hist_a + hist_b + eps))
The number of bins in the histogram is a parameter that can be tuned depending on the results: more bins give more precision but slower computation. If I remember correctly we used `[8, 8, 8]`. The epsilon (`eps`) protects the division from zeros in the denominator. The inputs to this function must be two images loaded with OpenCV with three channels. This complicates the code, since we now need to keep the three-channel images in memory before converting them to grayscale; we did have this in previous versions of the code, not sure if we have it now. For a proper comparison both images must use the same channel convention: if the attack has, for example, changed RGB to BGR, a larger chi-squared distance will appear.
Let me know if this helps!
Here is a quick example using the previous function to give a clearer idea:
import cv2
import matplotlib.pyplot as plt
import numpy as np
# path to lena image
img = cv2.imread('lena.jpg')
# openCV loads in BGR by default.
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img_downscaled = cv2.resize(img, (0, 0), fx=0.33, fy=0.5)
# cv2.resize expects dsize as (width, height), so reverse the shape order
img_rescaled = cv2.resize(img_downscaled, (img.shape[1], img.shape[0]))
# img represents the original image
# img_rgb represents an attack in color
# img_rescaled represents a rendition from different resolution
plt.figure(figsize=(40,20))
plt.subplot(131), plt.imshow(img)
plt.subplot(132), plt.imshow(img_rgb)
plt.subplot(133), plt.imshow(img_rescaled)
# Distances
orig_vs_attack = histogram_distance(img, img_rgb)
orig_vs_rescaled = histogram_distance(img, img_rescaled)
print(f'The chi-squared distance between the original and the color attack is {orig_vs_attack:.2f}')
print(f'The chi-squared distance between the original and the rescaled version (not an attack) is {orig_vs_rescaled:.2f}')
And this is the result:
The chi-squared distance between the original and the color attack is 4.21
The chi-squared distance between the original and the rescaled version (not an attack) is 0.03
As you can see, images where the colors have changed produce higher chi-squared distances, while images that have merely been rescaled, as could happen with an asset that was at a different resolution, have lower distances.
@Sorkanius thanks, yes, this metric is not used in the current version and it would help to prevent channel manipulations. I'm not sure whether it might still be theoretically possible to generate an attack image that both preserves the original histograms and keeps the V values intact.
I can think of flips and pixel shuffling, but the other metrics will help to easily detect these attacks.
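The flip case above can be demonstrated directly: a flip rearranges pixel positions but leaves the multiset of pixel values unchanged, so the color histogram, and therefore the chi-squared distance, cannot see it. Below is a NumPy-only stand-in for the cv2-based `histogram_distance` (an illustrative re-implementation, not the original code), applied to a randomly generated frame:

```python
import numpy as np


def chi2_hist_distance(a, b, bins=8, eps=1e-15):
    # NumPy-only stand-in for the cv2-based histogram_distance() above,
    # used here just to illustrate the flip case.
    ha, _ = np.histogramdd(a.reshape(-1, 3), bins=bins, range=[(0, 256)] * 3)
    hb, _ = np.histogramdd(b.reshape(-1, 3), bins=bins, range=[(0, 256)] * 3)
    ha, hb = ha.ravel(), hb.ravel()
    return 0.5 * np.sum((ha - hb) ** 2 / (ha + hb + eps))


rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
flipped = img[:, ::-1]  # horizontal flip: same pixel values, new positions
d = chi2_hist_distance(img, flipped)
print(d)  # 0.0 -- the color histogram is position-independent
```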
Closed by #116
Summarizing a finding by @cyberj0g:
At the moment, all metrics are computed on the V component of a HSV frame where V = max(R, G, B). This means that pixel values in a frame can be manipulated in a few ways:
An example.
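For instance, since V = max(R, G, B) is symmetric in the three channels, any channel permutation (such as an RGB-to-BGR swap) leaves V identical at every pixel. A minimal sketch with a hypothetical random frame:

```python
import numpy as np

rng = np.random.default_rng(1)
frame = rng.integers(0, 256, size=(16, 16, 3), dtype=np.uint8)

# Swapping the color channels (e.g. an RGB->BGR attack) changes the image
# but leaves V = max(R, G, B) identical at every pixel, so any metric
# computed only on the V component cannot detect it.
shuffled = frame[:, :, ::-1]
v_orig = frame.max(axis=2)
v_shuf = shuffled.max(axis=2)
print(np.array_equal(v_orig, v_shuf))  # True
```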