hank0316 opened 1 month ago
Having used the new setup recently (it's all somewhat in flux), I am personally trying to avoid copying or depending on a fork. My first thought would be: does `not_quantized=True` capture this?

Yeah, I mostly agree with @sanderland, but I understand if it's a research project / a lesson in using open source code.
Specifics:

- `wandb`: to be merged into main (@vwxyzjn and I have discussed how I am rebuilding the `wandb` integration with some recent features).
- `load_in_8bit`: converts the model to 8-bit. This is nice for speed, but some models don't quantize well. TLDR: I'm very happy to have help on further improving the datatype handling.

Operationally, it's not that easy to extract changes from a flattened repo. Normally, you want to make a new fork and re-apply them. I don't really have much time to do that, but would love to see the additions. I'm sure Claude/ChatGPT can whip up some quick bash scripts for creating git diffs from a specific commit.
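The bash-script idea above can be sketched roughly as follows. Everything here is simulated with stand-in local repos: in practice, `upstream` would be a clone of reward-bench checked out at the known base commit, and `flattened` would be the copied tree carrying the local edits.

```shell
# Sketch: recover a patch from a flattened (no-.git) copy of a repo,
# given the upstream commit the copy was based on. All paths and file
# contents below are illustrative stand-ins.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Simulate the upstream repo at the known base commit.
git init -q upstream
cd upstream
git config user.email demo@example.com
git config user.name demo
echo "load_in_8bit = True" > run_rm.py
git add . && git commit -qm "base commit (e.g. v0.1.0-dev)"
cd ..

# Simulate the flattened copy carrying a local modification.
cp -r upstream flattened
rm -rf flattened/.git
echo "load_in_8bit = False" > flattened/run_rm.py

# Overlay the flattened tree onto a clean checkout of the base
# commit, then let git diff recover the local changes as a patch.
git clone -q upstream recovered
cp -r flattened/. recovered/
cd recovered
git diff > "$tmp/local_changes.patch"
cat "$tmp/local_changes.patch"
```

The key step is overlaying the flattened tree onto a clean checkout of the base commit; `git diff` then emits the local modifications as an ordinary patch that can be re-applied to a fresh fork.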
Let me know what you think @hank0316
Thank you guys @natolambert and @sanderland for the replies.
My thoughts:
- `wandb`: Currently, I use `wandb` mainly to upload evaluation results for each subset. While integrating `wandb` might not be essential for releasing my code, since users can access evaluation results through stdout or the results folder, I'm open to assisting further if needed. Please let me know how I can contribute to enhancing this aspect.
- `torch_dtype=torch.float16` / `load_in_8bit=False`: I set these when loading the RM, since I didn't use 8-bit quantization. After a quick review of the latest version of reward-bench, it seems we can continue with the existing loading logic without any modifications. However, I'm available to discuss potential refinements to ensure alignment with any new updates.

I appreciate your guidance on these modifications. Please let me know if there are specific procedures or additional insights required for integrating these changes.
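As a rough sketch of the loading behavior discussed here (assuming a Hugging Face `transformers`-style `from_pretrained` call; the helper function and checkpoint name below are illustrative, not part of reward-bench):

```python
# Sketch: choose kwargs for loading a reward model either in full fp16
# or with bitsandbytes 8-bit quantization. Illustrative only.
import torch

def rm_load_kwargs(quantized: bool = False) -> dict:
    """Keyword arguments for an AutoModel*.from_pretrained call."""
    if quantized:
        # 8-bit is faster and smaller, but some models don't quantize well.
        return {"load_in_8bit": True, "device_map": "auto"}
    # Full fp16 weights, no quantization.
    return {"torch_dtype": torch.float16, "load_in_8bit": False}

# Usage (downloads a checkpoint, so left commented out):
# from transformers import AutoModelForSequenceClassification
# model = AutoModelForSequenceClassification.from_pretrained(
#     "my-org/my-reward-model",  # hypothetical checkpoint name
#     **rm_load_kwargs(quantized=False),
# )
```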
Thanks again for your time and help!
Yup @hank0316 opening PR(s) is best. I'll provide feedback from there.
Hi Nathan,
I’m currently preparing to release a new repository that contains the code used in my paper. As part of our experiments, we made some slight modifications to the reward-bench code (we're using the v0.1.0-dev version).
The changes include:

- Adding `wandb` logging to upload evaluation results for each subset.
- Setting `load_in_8bit=False` (I'm not sure what `load_in_8bit` actually does, and I just want to load our RM in `fp16`).

I'm reaching out to ask about the best practice for including reward-bench, with our changes, in our repo. At the moment, we've removed the `.git` directory and committed the entire reward-bench codebase to our repository.
Is there a better approach to incorporate reward-bench while maintaining our local modifications? Any advice would be much appreciated!
Thank you for your time and help!
Best regards, Tzu-Han