Open kmehant opened 2 weeks ago
@ByronHsu FYI - thoughts?
What does this PR do?
Prototype implementation for porting from FSDP V1 to FSDP V2. There are a couple of open questions in this PR that need comments and discussion.
Preliminary run of this PR and results
The current version of the PR has been tested for basic functionality (full shard) and compared against the previous FSDP V1 implementation.
Memory
Loss Parity
Throughput
TODO
Fixes #2873
Before submitting
Who can review?
@muellerzr