CarperAI / trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
MIT License

TPU Integration #323

Open steventk-g opened 1 year ago

steventk-g commented 1 year ago

🚀 The feature, motivation, and pitch

trlX uses HuggingFace accelerate under the hood, and accelerate can leverage Google's TPUs for faster training. I'm interested in supporting trlX on TPUs. This could be made user-configurable via the TPU fields in the accelerate config.
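For context, here is a minimal sketch (not trlX's actual implementation) of how a TPU-aware code path could be gated once the user's accelerate config sets `distributed_type: TPU` (e.g. via `accelerate config`, which also exposes TPU-specific fields such as `tpu_name` and `tpu_zone`); the branch and the settings mentioned in the comments are illustrative assumptions:

```python
# Illustrative sketch only, not trlX code: detect a TPU run through Accelerate
# and branch accordingly. Assumes an accelerate config with distributed_type: TPU.
from accelerate import Accelerator
from accelerate.utils import DistributedType

accelerator = Accelerator()

if accelerator.distributed_type == DistributedType.TPU:
    # On TPU/XLA it is common to prefer bf16 and to avoid frequent
    # device-to-host synchronization (e.g. .item() calls inside the step loop).
    print("Running on TPU via torch_xla; enabling TPU-friendly settings")
else:
    print(f"Running with {accelerator.distributed_type}")
```

A script like this would typically be started with `accelerate launch`, so the same training code runs unchanged on GPU or TPU depending on the config.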

Alternatives

No response

Additional context

I'm a SWE on the PyTorch/XLA team at Google, and I work with HF to test and support their libraries on TPU. I would be happy to help test and develop this feature as well.

steventk-g commented 1 year ago

@reciprocated feel free to assign this to me, just wanted to put it on your radar.

joytianya commented 1 year ago

Does it support TPU?

leejason commented 1 year ago

Any progress on TPU? Thanks for the information.