If params_with_grad remains empty, the fused CUDA kernel crashes without reporting an error because it tries to index into an empty list. This PR first fixes the CUDA kernel so that it raises a more meaningful error.
In addition, on the Python side, it skips dispatching to the update sub-functions entirely whenever the params_with_grad list is empty. This is also necessary because the torch._foreach_* functions do not handle empty lists either.
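The two-part fix described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the actual patch: `Param`, `fused_update`, and `step` are hypothetical stand-ins for the tensor type, the fused kernel, and the optimizer step, and only the name params_with_grad comes from the PR itself.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Param:
    """Hypothetical stand-in for a parameter tensor (not a torch type)."""
    value: float
    grad: Optional[float] = None


def fused_update(params: List[Param], lr: float) -> None:
    """Hypothetical stand-in for a fused kernel that needs a non-empty list."""
    if not params:
        # Mirrors the kernel-side fix: fail with a clear message
        # instead of indexing into an empty list.
        raise ValueError("fused_update received an empty parameter list")
    for p in params:
        p.value -= lr * p.grad


def step(params: List[Param], lr: float = 0.1) -> bool:
    """Return True if an update was dispatched, False if it was skipped."""
    params_with_grad = [p for p in params if p.grad is not None]
    if not params_with_grad:
        # Mirrors the Python-side fix: nothing to update, so skip dispatch.
        return False
    fused_update(params_with_grad, lr)
    return True
```

With this shape, a step over parameters that have no gradients becomes a no-op, while a direct call into the update routine with an empty list still fails loudly rather than silently.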