
xiaojunxu / learning-to-watermark-llm

MIT License


Reward model #2

Open kirudang opened 2 weeks ago

kirudang commented 2 weeks ago

Hello there, I want to test this watermark on two models, Llama2 7B and Mistral 7B. Can I use the same reward model for both, say OPT 1.3B? Thank you.

xiaojunxu commented 2 weeks ago

Our implementation does not directly support this, because it uses the same tokenizer for the LLM and the reward model. Since the models you mentioned use different tokenizers, you may need to adapt the code to get it to work.
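
For readers wondering what such an adaptation might look like, here is a minimal sketch of one common approach: decode the LLM's output token ids back to text, then re-encode that text with the reward model's tokenizer. This is not code from the repository. The helper name `retokenize_batch` and the toy tokenizers are hypothetical and exist only to make the example self-contained.

```python
from typing import Callable, List, Sequence


def retokenize_batch(
    llm_token_ids: Sequence[Sequence[int]],
    llm_decode: Callable[[Sequence[int]], str],
    rm_encode: Callable[[str], List[int]],
    rm_pad_id: int,
) -> List[List[int]]:
    """Bridge two tokenizers: decode with the LLM's, re-encode with the reward model's.

    Hypothetical helper, not part of the repository. Sequences are
    right-padded with rm_pad_id to the longest length in the batch.
    """
    texts = [llm_decode(ids) for ids in llm_token_ids]
    encoded = [rm_encode(text) for text in texts]
    max_len = max((len(e) for e in encoded), default=0)
    return [e + [rm_pad_id] * (max_len - len(e)) for e in encoded]


# Toy stand-ins for two incompatible tokenizers (illustration only).
# The "LLM" tokenizer is word-level; the "reward model" tokenizer is
# character-level, with id 0 reserved for padding.
LLM_VOCAB = {0: "hello", 1: "world", 2: "again"}
RM_VOCAB = {c: i + 1 for i, c in enumerate("abcdefghijklmnopqrstuvwxyz ")}


def toy_llm_decode(ids: Sequence[int]) -> str:
    return " ".join(LLM_VOCAB[i] for i in ids)


def toy_rm_encode(text: str) -> List[int]:
    return [RM_VOCAB[c] for c in text]


batch = retokenize_batch([[0, 1], [2]], toy_llm_decode, toy_rm_encode, rm_pad_id=0)
print(batch)
```

With Hugging Face tokenizers, the two callables could plausibly be `lambda ids: llm_tok.decode(ids, skip_special_tokens=True)` and `lambda text: rm_tok(text)["input_ids"]`, where `llm_tok` and `rm_tok` are assumed names for the two loaded tokenizers. Where this would hook into the repository's training loop is not stated in the thread.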