Open kirudang opened 2 weeks ago
Hello there, I want to test this watermark on two models: Llama2 7B and Mistral 7B. Can I use the same reward model for both, say OPT 1.3B? Thank you.

Our implementation does not directly support this, as we use the same tokenizer for the LLM and the reward model. Since the models you mentioned use different tokenizers, you would need some adaptations to the code to get it to work.
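If you do want to try a reward model with a mismatched tokenizer, a minimal sketch of one possible adaptation is below (not part of this repo): decode the LLM's output back to text, then re-encode it with the reward model's own tokenizer before scoring. The model paths and the `score_with_reward_model` helper are illustrative assumptions about a HuggingFace-style setup.

```python
# Sketch only: bridging two different tokenizers between the LLM and the reward model.
# Model paths are placeholders, not actual checkpoints from this repo.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

llm_tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
rm_tokenizer = AutoTokenizer.from_pretrained("path/to/opt-1.3b-reward-model")
reward_model = AutoModelForSequenceClassification.from_pretrained("path/to/opt-1.3b-reward-model")

def score_with_reward_model(llm_token_ids):
    # 1. Decode the LLM's token ids back to plain text.
    text = llm_tokenizer.decode(llm_token_ids, skip_special_tokens=True)
    # 2. Re-encode the text with the reward model's own tokenizer.
    rm_inputs = rm_tokenizer(text, return_tensors="pt", truncation=True)
    # 3. Score the re-tokenized sequence with the reward model.
    with torch.no_grad():
        reward = reward_model(**rm_inputs).logits
    return reward
```

Note that decoding and re-encoding per step adds overhead and can shift token boundaries, so you would still need to check how this interacts with the watermarking logic itself.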