allenai / RL4LMs

A modular RL library to fine-tune language models to human preferences
https://rl4lms.apps.allenai.org/
Apache License 2.0
2.18k stars · 191 forks

Problems with models that don't have the parallelize() function #25

Open lovodkin93 opened 1 year ago

lovodkin93 commented 1 year ago

Hey, first of all thank you for this amazing repo! I am trying to use this repo with a model that does not have the parallelize() function (LED, the Longformer encoder-decoder). From what I have observed, such models are simply wrapped in a DataParallel decorator. The problem is that this causes many bugs stemming from the missing parallelize() function. For example, the get_policy_first_device() function is called in many places; it looks up the first_device attribute on the model, which is only set when parallelize() is called (and there are many other issues). I noticed that a similar issue has already been reported, so I was wondering if there are plans to properly support such models. Thanks!
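To make the failure mode concrete, here is a hypothetical, torch-free sketch (not RL4LMs' actual code) of the dispatch described above: models exposing parallelize() get a first_device attribute set, while others are wrapped in a DataParallel-style container that never gets one, so a first_device lookup raises AttributeError. All class and function names below are illustrative stand-ins.

```python
class ParallelizableModel:
    """Stand-in for an HF model (e.g. T5) that implements parallelize()."""
    def parallelize(self):
        # Real parallelize() shards layers across GPUs; here we only
        # mimic the side effect the issue mentions: setting first_device.
        self.first_device = "cuda:0"

class LEDModel:
    """Stand-in for LED, which has no parallelize() method."""

class DataParallelWrapper:
    """Minimal stand-in for torch.nn.DataParallel."""
    def __init__(self, module):
        self.module = module

def setup_model(model):
    """Sketch of the dispatch: parallelize if possible, else wrap."""
    if hasattr(model, "parallelize"):
        model.parallelize()
        return model
    return DataParallelWrapper(model)

def get_policy_first_device(model):
    """Fails for wrapped models: neither the wrapper nor the inner
    module ever had first_device set."""
    return model.first_device

t5_like = setup_model(ParallelizableModel())
get_policy_first_device(t5_like)   # returns "cuda:0"

led_like = setup_model(LEDModel())
# get_policy_first_device(led_like) would raise AttributeError
```

This mirrors the report: the crash is not in DataParallel itself but in any helper that assumes parallelize() has run.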

rajcscw commented 1 year ago

Hey, you can turn off model parallelism by setting this flag: https://github.com/allenai/RL4LMs/blob/aa5d337c4c587049e039d572042bf5c95926c3be/scripts/training/task_configs/synthetic_generate_increasing_numbers/blendorbot_ppo.yml#L41

This would wrap the model with DataParallel instead.
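For reference, a sketch of where such a flag might sit in a task config. The key name apply_model_parallel and the surrounding structure are assumptions inferred from the linked config file, not verified here; check the linked line for the exact name and location:

```yaml
alg:
  policy:
    args:
      model_name: allenai/led-base-16384   # assumed model identifier
      apply_model_parallel: False          # assumed flag name; with it off,
                                           # parallelize() is skipped and the
                                           # model is wrapped in DataParallel
```

Note that disabling this only sidesteps the parallelize() path; as the original report points out, helpers that expect a first_device attribute may still fail under the DataParallel wrapper.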