Closed ngc92 closed 5 days ago
Moves the loss calculation to backward, so that more reductions can run on-device and fewer host<->device transfers are needed. Also enables a micro-optimization: validate no longer calculates dlogits.
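A minimal sketch of the idea, not the code from this PR: plain Python stands in for the device kernel, and the function name `fused_loss_backward` and the `compute_dlogits` flag are hypothetical. The loss is summed into a running accumulator inside the backward step, so the caller only needs to read a single value back at the end, and a validation caller can skip the gradient entirely.

```python
import math

def fused_loss_backward(logits, targets, accumulated_loss=0.0, compute_dlogits=True):
    """Softmax cross-entropy evaluated inside the backward step (illustrative only).

    logits: list of rows, one per token; targets: list of class indices.
    Returns (accumulated_loss, dlogits); dlogits is None when it is skipped.
    The accumulator holds a running *sum* of per-token losses; the caller
    divides by the token count once, after all batches are processed.
    """
    n = len(logits)
    dlogits = [] if compute_dlogits else None
    for row, target in zip(logits, targets):
        # Numerically stable softmax for one token.
        row_max = max(row)
        exps = [math.exp(x - row_max) for x in row]
        total = sum(exps)
        probs = [e / total for e in exps]
        # Accumulate the loss in place instead of returning per-token losses.
        accumulated_loss += -math.log(probs[target])
        if compute_dlogits:
            # Gradient of the mean loss over this batch: (softmax - one_hot) / n.
            grad = [p / n for p in probs]
            grad[target] -= 1.0 / n
            dlogits.append(grad)
    return accumulated_loss, dlogits
```

In this sketch a training step would call the function with the default `compute_dlogits=True`, while a validation loop would pass `compute_dlogits=False` and thread `accumulated_loss` through successive batches.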