Open Xieyangxinyu opened 2 weeks ago
Thank you so much for bringing this up! We really appreciate your insights on the masking of repeated k-grams in the context of watermarking algorithms.
Based on our understanding, both Unbiased and Dipmark are single-step distortion-free methods, and they introduced the context mask to extend distortion-freeness across multiple steps. In contrast, a method like KGW is single-step biased, so adding a context mask doesn't provide the same benefit.
We understand your concern about fair comparisons. If Unbiased and Dipmark use a context mask, they won't embed a watermark at every position, which could impact detectability. To address this, we plan to add detection configurations for these methods, with an option to apply the same context masking during detection so that repeated-context positions are excluded from the detection score.
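Roughly, excluding repeated-context positions at detection time could look like the sketch below. The function name and structure are illustrative assumptions for this discussion, not the code we will release:

```python
# Illustrative sketch only: names and structure are assumptions,
# not the actual implementation to be released.
def masked_detection_positions(token_ids, k):
    """Return token positions whose preceding k-gram context occurs for
    the first time; positions with a repeated context are excluded from
    the detection score, mirroring the mask applied at generation."""
    seen = set()
    kept = []
    for i in range(k, len(token_ids)):
        context = tuple(token_ids[i - k:i])
        if context not in seen:
            seen.add(context)
            kept.append(i)
    return kept
```

For example, on `[1, 2, 3, 1, 2, 3]` with `k=2`, positions 2, 3, and 4 are kept and position 5 is dropped, since its context `(1, 2)` already appeared.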
We’ll be releasing this code soon. Thank you again for your valuable feedback! 😊
Hi,
I noticed that for DIP and Unbiased, there is a piece of code inside the `__call__` function of the logit processor that appears to mask out the repeated $k$-grams of context used to generate watermarks. However, this kind of repeated-context masking is independent of the actual logit-reweighting part of the watermarking algorithms, so could it be supported for all algorithms to ensure fair comparisons?
If you could also support this, that would be wonderful. Thank you so much!