THU-BPM / MarkLLM

MarkLLM: An Open-Source Toolkit for LLM Watermarking.(EMNLP 2024 Demo)
https://aclanthology.org/2024.emnlp-demo.7/
Apache License 2.0

Repeated Context Masking #27

Open Xieyangxinyu opened 2 weeks ago

Xieyangxinyu commented 2 weeks ago

Hi,

I noticed that for DIP and Unbiased, there is this piece of code inside __call__ function within the logit processor:

    if input_ids.shape[-1] < self.config.prefix_length:
        return scores

    mask, reweighted_scores = self._apply_watermark(input_ids, scores)

    if self.config.ignore_history:
        return reweighted_scores
    else:
        return torch.where(mask[:, None], scores, reweighted_scores)

It seems that this piece of code is used to mask out repeated $k$-gram contexts used to generate watermarks. However, this kind of repeated context masking is independent of the actual logit-reweighting part of the watermarking algorithms, so could it be supported for all algorithms to ensure fair comparisons?
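For illustration, here is a minimal sketch of what such algorithm-independent tracking could look like. Everything here is hypothetical (the class name, its interface, and the `prefix_length` parameter are my own, not MarkLLM's API); it just shows the idea of remembering which $k$-gram contexts have already been used:

```python
import torch


class ContextHistoryMask:
    """Hypothetical helper: tracks previously seen k-gram contexts
    so any watermarking algorithm could skip repeated contexts."""

    def __init__(self, prefix_length: int):
        self.prefix_length = prefix_length
        self.seen = set()  # context tuples already used for watermarking

    def __call__(self, input_ids: torch.LongTensor) -> torch.BoolTensor:
        # Take the last k tokens of each batch row as the context.
        contexts = input_ids[:, -self.prefix_length:]
        mask = []
        for row in contexts.tolist():
            key = tuple(row)
            mask.append(key in self.seen)  # True -> context repeated
            self.seen.add(key)
        return torch.tensor(mask, dtype=torch.bool, device=input_ids.device)
```

A logit processor could then fall back to the unmodified `scores` wherever the mask is `True`, exactly as the DIP/Unbiased snippet above does with `torch.where`.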

If you could also support this, that'll be wonderful. Thank you so much!

panly2003 commented 2 days ago

Thank you so much for bringing this up! We really appreciate your insights on the masking of repeated k-grams in the context of watermarking algorithms.

Based on our understanding, both Unbiased and Dipmark are single-step distortion-free methods, and they introduced the context mask to extend this distortion-free property across multiple steps. In contrast, methods like KGW are single-step biased methods, so adding a context mask doesn't provide the same benefit.

We understand your concerns regarding fair comparisons. If Unbiased and Dipmark use a context mask, they won't embed the watermark at every position, which could impact detectability. To address this, we plan to implement detection configurations for these methods. This will allow the option to apply context masking during detection as well, ensuring that repeated context positions are excluded from consideration.
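To make the idea concrete, a detection loop with context masking might look roughly like the following. This is only a sketch under my own assumptions: `detect_with_context_mask` and `score_fn` are hypothetical names, and the aggregation (a simple mean over scored positions) stands in for whatever statistic each detector actually uses:

```python
def detect_with_context_mask(token_ids, prefix_length, score_fn):
    """Hypothetical detection loop that scores each token via
    score_fn(context, token) but skips positions whose k-gram
    context has already appeared earlier in the text."""
    seen = set()
    scores = []
    for t in range(prefix_length, len(token_ids)):
        context = tuple(token_ids[t - prefix_length:t])
        if context in seen:
            continue  # repeated context: excluded from the statistic
        seen.add(context)
        scores.append(score_fn(context, token_ids[t]))
    # Average over only the positions that actually carry a watermark.
    return sum(scores) / max(len(scores), 1)
```

The key point is that detection then considers exactly the positions where the generator embedded a watermark, so the masked positions don't dilute the detection statistic.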

We’ll be releasing this code soon. Thank you again for your valuable feedback! 😊