CarperAI / trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
MIT License

Migrate to `peft` from `opendelta` for parameter efficient tuning methods #434

Open jon-tow opened 1 year ago

jon-tow commented 1 year ago

🚀 The feature, motivation, and pitch

Let's migrate to peft.

Tasks

Doing so will require the following updates:

  1. Replace the opendelta setup in the `AccelerateBaseTrainer` with a peft-backed setup: https://github.com/CarperAI/trlx/blob/92b68e4d8c5d59e6ba25d12fd9acfe10287be689/trlx/trainer/accelerate_base_trainer.py#L145-L155

  2. Handle fine-grained layer capturing so that only the upper trunk layers of hydra architectures are modified, as handled below: https://github.com/CarperAI/trlx/blob/92b68e4d8c5d59e6ba25d12fd9acfe10287be689/trlx/utils/modeling.py#L414-L428
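One possible approach to task 2: since peft's `LoraConfig` accepts a regex string for `target_modules` (matched with `re.fullmatch` against module names), the upper-trunk restriction could be expressed as a generated regex over decoder-block indices. The sketch below is a minimal illustration, not the eventual implementation; `upper_trunk_target_regex`, the `transformer.h` prefix, and the `q_proj`/`v_proj` module names are assumptions that would need to match the actual model architecture.

```python
import re


def upper_trunk_target_regex(num_layers: int, num_unfrozen: int,
                             prefix: str = "transformer.h") -> str:
    """Hypothetical helper: build a regex matching attention projections
    only in the top `num_unfrozen` decoder blocks, mirroring the
    layer-slicing logic in trlx/utils/modeling.py."""
    first_unfrozen = num_layers - num_unfrozen
    idx = "|".join(str(i) for i in range(first_unfrozen, num_layers))
    return rf"{re.escape(prefix)}\.({idx})\.attn\.(q_proj|v_proj)"


# Sketch of how this could feed into peft (assumed usage):
#   from peft import LoraConfig, get_peft_model
#   config = LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM",
#                       target_modules=upper_trunk_target_regex(24, 2))
#   model = get_peft_model(model, config)

pattern = upper_trunk_target_regex(24, 2)
print(bool(re.fullmatch(pattern, "transformer.h.23.attn.q_proj")))  # True: top block
print(bool(re.fullmatch(pattern, "transformer.h.3.attn.q_proj")))   # False: frozen trunk
```

Generating the regex from layer indices keeps the frozen lower trunk untouched while letting peft discover the unfrozen modules by name, which is roughly what the current opendelta path does with explicit module lists.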

Motivation

Citing @ethankim00's concerns with opendelta:

Alternatives

No response

Additional context

No response

jon-tow commented 1 year ago

Assignee: @glerzing will be having a go at this :)

loganlebanoff commented 1 year ago

I'm curious about the status of this issue/PR.

glerzing commented 1 year ago

I'm developing automated tests for it; there should be a PR soon.

akk-123 commented 1 year ago

@glerzing Looking forward to it!

akk-123 commented 1 year ago

@glerzing When will there be a PR?

glerzing commented 1 year ago

It should be ready for PR tomorrow, sorry for the wait.