lucidrains / PaLM-rlhf-pytorch

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
MIT License

Bug fix: Correct function call in RewardModel->finetune_parameters #10

Closed · QasimWani closed this issue 1 year ago

QasimWani commented 1 year ago

seems like it's missing a `self` call for `finetune_parameters`: https://github.com/lucidrains/PaLM-rlhf-pytorch/blob/795da603f5bde77d028ad05f7d8172189bfb7a2a/palm_rlhf_pytorch/palm_rlhf_pytorch.py#L529
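
(For context, here is a minimal hypothetical sketch of the kind of bug being reported: a method body that refers to `finetune_parameters` as a bare name instead of calling it through `self`, which raises a `NameError` at runtime. The class and attribute names below are illustrative stand-ins, not the repo's actual code.)

```python
from torch import nn

class PaLMStub(nn.Module):
    # illustrative stand-in that exposes a `finetune_parameters` method
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(512, 512)

    def finetune_parameters(self):
        return list(self.net.parameters())

class RewardModelSketch(nn.Module):
    # hypothetical reward model wrapper, not the repo's actual RewardModel
    def __init__(self, palm: PaLMStub):
        super().__init__()
        self.palm = palm
        self.to_pred = nn.Linear(512, 1)

    def finetune_parameters(self):
        # buggy form being reported: a bare `finetune_parameters(...)` call,
        # which resolves to nothing in scope and raises NameError
        # return [*self.to_pred.parameters(), *finetune_parameters()]

        # fixed form: route the call through `self`
        return [*self.to_pred.parameters(), *self.palm.finetune_parameters()]
```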

lucidrains commented 1 year ago

@QasimWani yes indeed, thank you for the pull request!

i've fixed it in https://github.com/lucidrains/PaLM-rlhf-pytorch/commit/bb9d4eb59762bbf07783c3e89d148068e4a762e5 (i also took the opportunity to address another issue there, which is why i didn't merge your PR directly - your PR was fine btw!)

QasimWani commented 1 year ago

awesome! in case you're curious, i found that bug by generating a graphical representation of your code with something i built over the past week: https://www.gctpy.com/graph/79f3f26b86b8ac37350d83307d8ad587d575a03a072b0fa7a77174d371772abf It helped me understand parts of your code faster.

I've open-sourced the repo: https://github.com/QasimWani/gct