This Colab notebook (adapted from the provided demo) also reproduces this problem.
Hi Jingyang,
Indeed, the whitening changes the output of the extractor. Its goal is to make the output bits more independent and well distributed (see Appendices B.5 and B.6 of the paper); otherwise, some keys would have higher FPRs than others. In our case, since we discard the encoder at the end (we only use the extractor for the fine-tuning in Stable Signature), we can change the extractor and do the whitening. However, if you use the encoder to watermark your images, the message that you end up putting in your images will be completely changed by the whitening layer at extraction time (tell me if I'm not clear enough).
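For intuition, such a layer is essentially a PCA/ZCA whitening of the raw extractor outputs. Below is a minimal sketch of how one could fit it; this is generic ZCA whitening, not necessarily the repo's exact procedure, and the function name and details are illustrative:

```python
# Sketch: fit a linear whitening layer L(x) = W(x - mu) from raw extractor
# outputs, so the whitened bits are decorrelated and zero-centered.
# Generic ZCA whitening; the repo's exact recipe may differ.
import torch

@torch.no_grad()
def fit_whitening(feats: torch.Tensor) -> torch.nn.Linear:
    # feats: (N, k) raw extractor outputs collected on a set of images
    mu = feats.mean(dim=0)                                   # (k,)
    cov = torch.cov(feats.T)                                 # (k, k)
    evals, evecs = torch.linalg.eigh(cov)                    # cov is symmetric
    inv_sqrt = evecs @ torch.diag(evals.clamp_min(1e-8).rsqrt()) @ evecs.T
    lin = torch.nn.Linear(feats.shape[1], feats.shape[1])
    # nn.Linear computes x @ weight.T + bias; ZCA's W is symmetric,
    # so setting weight = W gives exactly L(x) = Wx + b.
    lin.weight.copy_(inv_sqrt)                               # W = cov^{-1/2}
    lin.bias.copy_(-inv_sqrt @ mu)                           # b = -W mu
    return lin
```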
There are two options: (1) use the vanilla extractor without whitening; this is not as much of an issue if you don't need perfect theoretical control over false positive rates. (2) You can try feeding "reversed" messages to the watermark encoder: for instance, if your message is $m$ and the whitening layer does $L(x) = Wx + b$, then you can try feeding $W^{-1}(m - b)$ as the message to the encoder, s.t. when it gets extracted the last layer will give $W(W^{-1}(m - b)) + b = m$ (but I have never tried this, so I can't say whether it will work).
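A minimal sketch of option (2), assuming the whitening is a square, invertible `torch.nn.Linear` (the `whit` and `encoder` handles are hypothetical):

```python
# Sketch of option (2): pre-apply the inverse whitening to the message,
# so the whitening applied at extraction time maps it back to m.
import torch

def reverse_whitening(m: torch.Tensor, whit: torch.nn.Linear) -> torch.Tensor:
    # Compute W^{-1}(m - b) by solving W x = m - b (avoids forming W^{-1}).
    return torch.linalg.solve(whit.weight, m - whit.bias)

# Hypothetical usage: encode the "reversed" message instead of m.
# img_w = encoder(img, reverse_whitening(m, whit_layer))
# At extraction, the whitening layer gives W @ reverse_whitening(m, whit_layer) + b == m.
```

One caveat: the reversed message can fall well outside the value range the encoder was trained on, which may be one reason there is no guarantee this works in practice.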
tl;dr: use the un-whitened extractor if you want to use the encoder of HiDDeN.
Thank you, Pierre! That makes sense, and now I see why this is actually expected.
Hi Pierre,
When I evaluated the whitened watermark decoder `hidden_replicate_whit.torchscript.pth`, the bit accuracy on clean (non-attacked) watermarked images was 49.42% (on 5000 COCO val images, each with a random key), i.e., essentially chance level. I'm not sure what's wrong here and would appreciate any suggestions/thoughts. Below is some additional information that might be helpful:

- The same evaluation with the non-whitened decoder `hidden_replicate.pth` gives the expected bit accuracy, so I think that my evaluation script works okay.
- I also redid the whitening myself following `hidden/main.py` and observed similar results when doing the whitening.

Thank you!
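For completeness, the bit-accuracy check is roughly the following; this is a simplified sketch, where `decoder` stands for the loaded torchscript model, and the 0/1 key convention and tensor shapes are assumptions:

```python
# Simplified sketch of the evaluation described above: decode watermarked
# images and compare the thresholded bits against the per-image random keys.
import torch

@torch.no_grad()
def bit_accuracy(decoder, imgs: torch.Tensor, keys: torch.Tensor) -> float:
    # imgs: (B, 3, H, W) watermarked images; keys: (B, k) bits in {0, 1}
    logits = decoder(imgs)                 # (B, k), one logit per bit
    preds = (logits > 0).float()           # positive logit -> bit 1
    return (preds == keys).float().mean().item()
```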