NVIDIA / apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
BSD 3-Clause "New" or "Revised" License
8.17k stars, 1.35k forks

add async copy for openfold swa triton kernel #1758

Closed (azazhu closed this 6 months ago)

azazhu commented 7 months ago

This MR reduces CPU-GPU interaction for OpenFold.