NVIDIA / apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
BSD 3-Clause "New" or "Revised" License
8.42k stars 1.4k forks

add async copy for openfold swa triton kernel #1758

Closed azazhu closed 11 months ago

azazhu commented 11 months ago

This MR adds an async copy to the OpenFold SWA Triton kernel to reduce CPU-GPU interaction.
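
For context, here is a minimal sketch of the general async-copy pattern in PyTorch. It is not the code from this MR. `to_device_async` is a hypothetical helper name, and the sketch assumes the copy in question is a host-to-device transfer. Pinning the host tensor and passing `non_blocking=True` lets the copy be queued on the current CUDA stream, so the CPU does not have to wait for it to finish.

```python
import torch


def to_device_async(host_tensor: torch.Tensor, device: torch.device) -> torch.Tensor:
    """Hypothetical helper: copy a CPU tensor to `device`, asynchronously when possible."""
    if device.type == "cuda":
        # Pinned (page-locked) host memory is required for the copy to be
        # truly asynchronous with respect to the host.
        pinned = host_tensor.pin_memory()
        # non_blocking=True queues the copy on the current CUDA stream and
        # returns without waiting for it to complete.
        return pinned.to(device, non_blocking=True)
    # CPU fallback so the sketch still runs on a machine without a GPU.
    return host_tensor.to(device)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
src = torch.arange(8, dtype=torch.float32)
dst = to_device_async(src, device)

if device.type == "cuda":
    # Explicit sync only for this demo, before the host inspects the result.
    torch.cuda.synchronize()
```

Kernels launched later on the same stream are ordered after the copy, so an explicit synchronization is usually only needed when the host has to read the result.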