vwxyzjn / lm-human-preference-details

RLHF implementation details of OAI's 2019 codebase
MIT License

Add accelerate to poetry dependencies #9

Closed · liutianlin0121 closed this 1 year ago

liutianlin0121 commented 1 year ago

Hey Costa,

It seems that the accelerate package isn't listed in the poetry dependencies, even though the project currently uses it. I can submit a PR that updates the pyproject.toml and poetry.lock files to include accelerate.

In addition, I encountered errors when calling reward_model.module.reward_gain and reward_model.module.reward_bias in the reward learning script, like here, and here. The error message for me is AttributeError: 'AutoModelForCausalLMWithRewardHead' object has no attribute 'module'. However, using reward_model.reward_gain and reward_model.reward_bias directly works just fine. Is there a reason why the module attribute is needed? Should we change reward_model.module.reward_gain to reward_model.reward_gain?

(I see! reward_model.module is needed when using multiple GPUs with accelerate.)

Tianlin

vwxyzjn commented 1 year ago

Hi @liutianlin0121, thanks for this issue. I can't believe that I didn't add accelerate to poetry. Could you make a PR? You can probably run poetry add accelerate@latest.

(I see! reward_model.module is needed when using multiple GPUs with accelerate.)

Yeah, so this is an unfortunate limitation of torch: in distributed mode the model gets wrapped in DistributedDataParallel, so its attributes are only reachable through .module. Torch also more or less forces you to go through the forward function exclusively, since only forward triggers DDP's gradient-synchronization hooks. In comparison, jax doesn't have these two issues.

liutianlin0121 commented 1 year ago

I see, thank you!