grip-unina / TruFor


Question about NoisePrint++ (Editing History) #9

Open Elon-VVV opened 1 year ago

Elon-VVV commented 1 year ago

Hello, your work on manipulation detection is very impressive, and I am highly interested in your paper. However, I am having some difficulty understanding the training steps of the proposed Noiseprint++ and have run into several obstacles. Do you have any plans to publish the dataset and training code specifically for the proposed Noiseprint++?

Elon-VVV commented 1 year ago

In other words, could you please provide a detailed list of the 512 types of editing history?

fabrizioguillaro commented 1 year ago

Hello! We plan to make the training code available, but we don't have a release date for it yet.

As for the dataset, we are still checking the licenses of the images before we can distribute the dataset used for Noiseprint++ training, especially given that the website DPReview is shutting down. As soon as the dataset is ready for distribution, we will provide it.

In the meantime, you may find it useful to explore the code of the previous version of Noiseprint, since it follows a similar methodology: https://grip-unina.github.io/noiseprint/

The 512 editing histories are a combination of the following:

```python
from io import BytesIO

import cv2
import numpy as np

list_scale = [8/8, 2/8, 3/8, 4/8, 5/8, 6/8, 7/8, 9/8]
list_adjust = [
               ( 0.0, 1.0, 1.0),  # identity
               ( 0.0, 1.0, 0.8),  # gamma
               ( 0.3, 1.0, 1.0),  # brightness
               ( 0.0, 0.7, 1.0),  # contrast
               ( 0.0, 1.4, 1.0),  # contrast
               (-0.3, 1.0, 1.0),  # brightness
               ( 0.0, 1.0, 1.2),  # gamma
               ( 0.0, 0.7, 1.2),  # contrast & gamma
               ]
list_jpeg = [0, 90, 85, 80, 75, 70, 65, 60]

def cv2_adjust(img, factors):
    # brightness (beta), contrast (alpha) and gamma applied through a LUT
    beta, alpha, gamma = factors
    lut = np.arange(0, 256) / 255.0
    lut = (alpha * (lut - 0.5) + beta + 0.5) ** gamma
    lut = np.clip(255 * lut, 0, 255).astype(np.uint8)
    return cv2.LUT(np.array(img), lut)

def cv2_scale(img, scale):
    # bicubic rescaling by the given factor
    return cv2.resize(img, None, fx=scale, fy=scale, interpolation=cv2.INTER_CUBIC)

def cv2_jpeg(img, quality):
    # JPEG round trip (encode + decode) at the given quality
    encode_param = [int(cv2.IMWRITE_JPEG_QUALITY), quality]
    is_success, buffer = cv2.imencode(".jpg", img, encode_param)
    io_buf = BytesIO(buffer)
    return cv2.imdecode(np.frombuffer(io_buf.getbuffer(), np.uint8), -1)
```
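For what it's worth, the 512 count is just the Cartesian product of the three lists above (8 scales × 8 adjustments × 8 JPEG qualities), which can be enumerated as:

```python
from itertools import product

list_scale = [8/8, 2/8, 3/8, 4/8, 5/8, 6/8, 7/8, 9/8]
list_adjust = [( 0.0, 1.0, 1.0), ( 0.0, 1.0, 0.8), ( 0.3, 1.0, 1.0),
               ( 0.0, 0.7, 1.0), ( 0.0, 1.4, 1.0), (-0.3, 1.0, 1.0),
               ( 0.0, 1.0, 1.2), ( 0.0, 0.7, 1.2)]
list_jpeg = [0, 90, 85, 80, 75, 70, 65, 60]

# every editing history is one (scale, adjust, jpeg) triple
histories = list(product(list_scale, list_adjust, list_jpeg))
print(len(histories))  # 512
```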
Elon-VVV commented 1 year ago

Thank you very much for your detailed reply. If I understand correctly, the sequence of the three operations above is: first adjustment, then rescaling, and lastly JPEG compression. Is that correct? Your clarification is greatly appreciated.

fabrizioguillaro commented 1 year ago

The order is rescaling, then adjustment, then JPEG compression.

Elon-VVV commented 1 year ago

Thank you once again for your help. I have been quite confused by the following problems:

  1. Why does the proposed Noiseprint++ have the capability to detect copy-move forgeries, as demonstrated in Fig. 4 of your paper?
  2. Why is there a need for the same-patches restriction during the contrastive learning process, as depicted in Fig. 3? Your insights would be immensely valuable.
fabrizioguillaro commented 1 year ago
  1. Both Noiseprint and Noiseprint++ should be able to detect copy-moves, but with Noiseprint++ the result may appear clearer because of the new training strategy. In general, they are able to detect copy-moves because even if the tampered region comes from the same image, it still comes from a different position (we enforce the noiseprint to be different if training patches come from different spatial positions in the image). Moreover, the resampling introduced when the manipulated area is resized or rotated alters the noise pattern and makes the copied region visible.
  2. Can you elaborate more on the second question? What do you mean by patch restrictions? I will try to summarize the process: with contrastive learning you compare patches, and you want the distance (the loss) between them to be 0 when the outputs should be similar, and as large as possible when the outputs should be different. In other words, you push similar things close to each other and different things far from each other. In our case, we want two patches with the same editing history, same position and same camera model to output the same noiseprint (regardless of the semantic content of the image). If one of the three is different, such as the camera model, then the noise fingerprint also has to be different.
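As a rough illustration of the push-together / push-apart idea described above (not the authors' actual loss; the function name, distance-based formulation and margin value are my assumptions), a standard margin-based contrastive loss over patch-embedding distances looks like:

```python
import numpy as np

def contrastive_loss(dist, is_positive, margin=1.0):
    """Margin-based contrastive loss over pairs of patch embeddings.

    dist        : distances between pairs of patch embeddings
    is_positive : 1 where patches share camera model, spatial position
                  AND editing history; 0 otherwise
    """
    dist = np.asarray(dist, dtype=float)
    is_positive = np.asarray(is_positive, dtype=bool)
    # positive pairs are pulled together (loss -> 0 as distance -> 0);
    # negative pairs are pushed at least `margin` apart
    pos = dist ** 2
    neg = np.maximum(margin - dist, 0.0) ** 2
    return np.where(is_positive, pos, neg).mean()
```

A positive pair at distance 0 contributes nothing to the loss, and a negative pair already farther apart than the margin contributes nothing either, which matches the "0 when similar, large when different" description.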
Elon-VVV commented 1 year ago

Your answer to Q.1 has helped me understand Q.2: the same-patches restriction is needed so that Noiseprint++ can detect copy-moves. Regarding Noiseprint++, I understand that you found it might not perform as well on images that have undergone double JPEG compression. In light of this, I wonder whether it might be beneficial to extend the 512 editing histories with additional double-compression histories, rather than simply applying one extra JPEG transform after the 512 editing histories for each image during contrastive learning?