Closed yondonfu closed 4 years ago
On the matter of this topic: there was a metric we designed that consisted of computing the color histograms of both the reference and the rendition. The chi-squared distance between the two histograms was computed per pair of frames and then averaged over time, like the rest of the metrics.
I guess it did not make it to production because it did not add much accuracy against the attacks we simulated. I can provide more info/code if necessary. I hope this is useful.
@Sorkanius Ah I think I remember you looking at that metric. If you still have access to some of the analysis or code from that work that you could share that would be much appreciated.
Here is the function in python:
import cv2
import numpy as np


def histogram_distance(ref_fr, asset_fr, eps=1e-15):
    # 3D color histogram with 8 bins per channel
    bins = [8, 8, 8]
    hist_a = cv2.calcHist([ref_fr], [0, 1, 2],
                          None, bins, [0, 256, 0, 256, 0, 256])
    hist_a = cv2.normalize(hist_a, hist_a)
    hist_b = cv2.calcHist([asset_fr], [0, 1, 2],
                          None, bins, [0, 256, 0, 256, 0, 256])
    hist_b = cv2.normalize(hist_b, hist_b)
    hist_a = hist_a.flatten()
    hist_b = hist_b.flatten()
    # Chi-squared distance between the two flattened histograms
    return 0.5 * np.sum(((hist_a - hist_b) ** 2) / (hist_a + hist_b + eps))
The number of bins in the histogram is a parameter that can be tuned depending on the results: more bins give more precision but slower computation. If I remember correctly we used `[8, 8, 8]`. The epsilon (`eps`) protects the division from zeros in the denominator. The inputs to this function must be two images loaded with OpenCV with three channels. This complicates the code, since we now need to keep the three-channel images in memory before converting them to grayscale; we did have this in previous versions of the code, not sure if we have it now. For a proper comparison both images must use the same channel convention: if the attack has, for example, changed RGB to BGR, a larger chi-squared distance will appear.
Let me know if this helps!
Here is a quick example using the previous function to give a clearer idea:
import cv2
import matplotlib.pyplot as plt
import numpy as np
# path to lena image
img = cv2.imread('lena.jpg')
# openCV loads in BGR by default.
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img_downscaled = cv2.resize(img, (0, 0), fx=0.33, fy=0.5)
# cv2.resize expects dsize as (width, height), so reverse the shape order
img_rescaled = cv2.resize(img_downscaled, (img.shape[1], img.shape[0]))
# img represents the original image
# img_rgb represents an attack in color
# img_rescaled represents a rendition from different resolution
plt.figure(figsize=(40,20))
plt.subplot(131), plt.imshow(img)
plt.subplot(132), plt.imshow(img_rgb)
plt.subplot(133), plt.imshow(img_rescaled)
# Distances
orig_vs_attack = histogram_distance(img, img_rgb)
orig_vs_rescaled = histogram_distance(img, img_rescaled)
print(f'The chi-squared distance between the original and the color attack is {orig_vs_attack:.2f}')
print(f'The chi-squared distance between the original and the rescaled version (not an attack) is {orig_vs_rescaled:.2f}')
And this is the result:
The chi-squared distance between the original and the color attack is 4.21
The chi-squared distance between the original and the rescaled version (not an attack) is 0.03
As you can see, images where the colors have changed produce higher chi-squared distances, while images that have merely been rescaled, as could happen with an asset that was at a different resolution, have lower distances.
@Sorkanius thanks, yes, this metric is not used in the current version and it would help to prevent channel manipulations. I'm not sure whether it might still be theoretically possible to generate an attack image that both preserves the original histograms and keeps the V values intact.
I can think of flips and pixel shuffling, but the other metrics will help to easily detect these attacks.
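The flip case above can be demonstrated directly: a flip rearranges pixel positions but leaves the multiset of pixel values unchanged, so the color histogram, and therefore the chi-squared distance, cannot see it. Below is a NumPy-only stand-in for the cv2-based `histogram_distance` (an illustrative re-implementation, not the original code), applied to a randomly generated frame:

```python
import numpy as np


def chi2_hist_distance(a, b, bins=8, eps=1e-15):
    # NumPy-only stand-in for the cv2-based histogram_distance() above,
    # used here just to illustrate the flip case.
    ha, _ = np.histogramdd(a.reshape(-1, 3), bins=bins, range=[(0, 256)] * 3)
    hb, _ = np.histogramdd(b.reshape(-1, 3), bins=bins, range=[(0, 256)] * 3)
    ha, hb = ha.ravel(), hb.ravel()
    return 0.5 * np.sum((ha - hb) ** 2 / (ha + hb + eps))


rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
flipped = img[:, ::-1]  # horizontal flip: same pixel values, new positions
d = chi2_hist_distance(img, flipped)
print(d)  # 0.0 -- the color histogram is position-independent
```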
Closed by #116
Summarizing a finding by @cyberj0g:
At the moment, all metrics are computed on the V component of a HSV frame where V = max(R, G, B). This means that pixel values in a frame can be manipulated in a few ways:
An example.
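For instance, since V = max(R, G, B) is symmetric in the three channels, any channel permutation (such as an RGB-to-BGR swap) leaves V identical at every pixel. A minimal sketch with a hypothetical random frame:

```python
import numpy as np

rng = np.random.default_rng(1)
frame = rng.integers(0, 256, size=(16, 16, 3), dtype=np.uint8)

# Swapping the color channels (e.g. an RGB->BGR attack) changes the image
# but leaves V = max(R, G, B) identical at every pixel, so any metric
# computed only on the V component cannot detect it.
shuffled = frame[:, :, ::-1]
v_orig = frame.max(axis=2)
v_shuf = shuffled.max(axis=2)
print(np.array_equal(v_orig, v_shuf))  # True
```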