Closed by gnadathur 5 months ago
Validate QPS and numerics parity on an 8-GPU devGPU host and on 64 GPUs on AWS.
On 8 H100 GPUs:
Full numerics validation on 8 GPUs is included in https://github.com/pytorch/torchtrain/pull/165.
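For reference, a parity check of this kind could be sketched roughly as below. This is only an illustration, not the validation used in the linked PR: the function names, the relative tolerance, and the allowed QPS regression are all hypothetical, and the inputs are assumed to be per-step loss values and a measured QPS figure from a baseline run and a candidate run.

```python
import math


def numerics_match(baseline_losses, candidate_losses, rel_tol=1e-3):
    """Return True if both runs logged the same number of steps and every
    per-step loss agrees within rel_tol (hypothetical tolerance)."""
    if len(baseline_losses) != len(candidate_losses):
        return False
    return all(
        math.isclose(b, c, rel_tol=rel_tol)
        for b, c in zip(baseline_losses, candidate_losses)
    )


def qps_within(baseline_qps, candidate_qps, max_regression=0.05):
    """Return True if candidate throughput is no more than max_regression
    (as a fraction, hypothetical threshold) below the baseline."""
    return candidate_qps >= baseline_qps * (1.0 - max_regression)
```

The same two checks could be run once for the 8-GPU configuration and once for the 64-GPU configuration, with thresholds chosen to match the noise observed between repeated baseline runs.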