fanyix closed this pull request 6 months ago
This pull request was exported from Phabricator. Differential Revision: D55220828
This pull request has been merged in facebookresearch/d2go@b14282fb8380798ebd756781cd3e20a93538681c.
Summary: If requires_grad is not set properly for params and buffers, FSDP training hangs. This becomes an issue, e.g., when training with LoRA.
Reviewed By: wat3rBro
Differential Revision: D55220828
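
For illustration only, here is a minimal sketch of setting requires_grad explicitly in a LoRA-style setup. It is not the change made in this pull request. The helper mark_only_lora_trainable, the TinyLoRALinear module, and the "lora_" naming convention are all hypothetical; the sketch assumes that adapter parameters can be identified by name and that everything else should be frozen before the model is wrapped for distributed training.

```python
from typing import List

import torch.nn as nn


def mark_only_lora_trainable(model: nn.Module, lora_key: str = "lora_") -> List[str]:
    """Set requires_grad explicitly on every parameter (hypothetical helper).

    Parameters whose name contains `lora_key` stay trainable; all others are
    frozen. Returns the names of the trainable parameters.
    """
    trainable = []
    for name, param in model.named_parameters():
        # Set the flag on every parameter so none keeps a stale value.
        param.requires_grad_(lora_key in name)
        if param.requires_grad:
            trainable.append(name)
    return trainable


class TinyLoRALinear(nn.Module):
    """Toy layer (assumption): a base linear layer plus a low-rank adapter."""

    def __init__(self, dim: int = 4, rank: int = 2) -> None:
        super().__init__()
        self.base = nn.Linear(dim, dim)
        self.lora_a = nn.Linear(dim, rank, bias=False)
        self.lora_b = nn.Linear(rank, dim, bias=False)

    def forward(self, x):
        return self.base(x) + self.lora_b(self.lora_a(x))


model = TinyLoRALinear()
trainable_names = mark_only_lora_trainable(model)
```

Running the same deterministic rule on every rank before wrapping the model is one way to keep the set of trainable parameters identical across processes.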