Open zou3519 opened 1 year ago
On A100s, I'm seeing 107ms to 111ms, a ~4% regression.
On AWS V100s, I'm seeing 169ms to 174ms, which is a ~3% regression.
With V100s on the FAIR cluster I see 190ms to 211ms, which is roughly 11%, so I can repro your V100 numbers @samdow. These numbers aren't crazy enough to investigate for 1.13, but it might be worth looking into whether this is overhead or something else.
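For reference, here is a minimal sketch of how the percentages in this thread can be derived from the before/after timings. The helper name `regression_pct` is hypothetical (it is not part of PyTorch or functorch), and the inputs are simply the millisecond values quoted in the comments here.

```python
def regression_pct(before_ms: float, after_ms: float) -> float:
    """Relative slowdown of after_ms versus before_ms, in percent."""
    if before_ms <= 0:
        raise ValueError("before_ms must be positive")
    return (after_ms - before_ms) / before_ms * 100.0


# Timings quoted in this thread: (hardware label, before ms, after ms).
timings = [
    ("A100", 107, 111),
    ("AWS V100", 169, 174),
    ("FAIR cluster V100", 190, 211),
    ("P100", 286, 316),
]

for label, before, after in timings:
    print(f"{label}: {before}ms -> {after}ms = {regression_pct(before, after):.1f}%")
# A100: 107ms -> 111ms = 3.7%
# AWS V100: 169ms -> 174ms = 3.0%
# FAIR cluster V100: 190ms -> 211ms = 11.1%
# P100: 286ms -> 316ms = 10.5%
```

Computing the slowdown relative to the "before" time keeps the numbers comparable across GPUs with different absolute runtimes.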
On my machine with ~v100~ P100 GPUs, the runtime goes from 286ms to 316ms
To repro: