facebookresearch / d2go

D2Go is a toolkit for efficient deep learning
Apache License 2.0

fix distributed initialization for FSDP #657

Closed fanyix closed 6 months ago

fanyix commented 6 months ago

Summary: If requires_grad is not set properly for params and buffers, FSDP training hangs. This becomes an issue, e.g., when training with LoRA.
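The summary does not say which flags were wrong or how the fix sets them. As a rough diagnostic sketch only, and not the change made in this PR, the snippet below assumes one possible failure mode: ranks disagreeing on which tensors have requires_grad=True, as could happen when LoRA freezes the base weights. The helper names `trainable_names` and `mismatched_ranks` are hypothetical, and the stand-in objects only mimic the `requires_grad` attribute so the example runs without PyTorch.

```python
from types import SimpleNamespace


def trainable_names(named_tensors):
    """Return the sorted names of tensors whose requires_grad is True.

    Accepts any iterable of (name, tensor-like) pairs, for example the
    output of model.named_parameters() or model.named_buffers().
    """
    return sorted(
        name for name, t in named_tensors if getattr(t, "requires_grad", False)
    )


def mismatched_ranks(per_rank_names):
    """Return the ranks whose trainable-name list differs from rank 0's.

    per_rank_names holds one list per rank, indexed by rank.
    """
    if not per_rank_names:
        return []
    reference = per_rank_names[0]
    return [rank for rank, names in enumerate(per_rank_names) if names != reference]


# Stand-ins for a LoRA setup: frozen base weight, trainable adapter weights.
frozen = SimpleNamespace(requires_grad=False)
adapter = SimpleNamespace(requires_grad=True)

rank0 = trainable_names([("linear.weight", frozen), ("linear.lora_A", adapter)])
rank1 = trainable_names([("linear.weight", adapter), ("linear.lora_A", adapter)])

# Rank 1 (hypothetically) forgot to freeze the base weight.
print(mismatched_ranks([rank0, rank1]))
```

In a real multi-process job, each rank could compute its own list from `model.named_parameters()` and the lists could be collected with `torch.distributed.all_gather_object` before wrapping the model in FSDP. That wiring is left out here to keep the sketch self-contained.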

Reviewed By: wat3rBro

Differential Revision: D55220828

facebook-github-bot commented 6 months ago

This pull request was exported from Phabricator. Differential Revision: D55220828

facebook-github-bot commented 6 months ago

This pull request has been merged in facebookresearch/d2go@b14282fb8380798ebd756781cd3e20a93538681c.