Closed by AnimeshSinha1309 1 year ago
Replaced the for-loop with batched sparse matrix operations. This gives a considerable speedup over the previous timings (approx. 30x). Closing for now without implementing the module in CUDA/C++.
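For anyone landing here later, a minimal sketch of the general idea, not the actual code from this change: instead of looping over per-item sparse matrices, stack them into one block-diagonal sparse matrix and do a single product. The use of SciPy and the names `apply_looped` and `apply_batched` are assumptions for illustration only; the real module may use a different library and layout.

```python
import numpy as np
from scipy import sparse


def apply_looped(mats, vecs):
    """Baseline: one sparse matrix-vector product per batch item."""
    return [m @ v for m, v in zip(mats, vecs)]


def apply_batched(mats, vecs):
    """Stack the batch into one block-diagonal sparse matrix and multiply once."""
    big = sparse.block_diag(mats, format="csr")
    out = big @ np.concatenate(vecs)
    # Split the flat result back into per-item outputs using the row counts.
    splits = np.cumsum([m.shape[0] for m in mats])[:-1]
    return np.split(out, splits)


# Hypothetical toy batch: three 4x5 sparse matrices and matching dense vectors.
mats = [
    sparse.random(4, 5, density=0.3, format="csr", random_state=i)
    for i in range(3)
]
rng = np.random.default_rng(0)
vecs = [rng.standard_normal(5) for _ in range(3)]

looped = apply_looped(mats, vecs)
batched = apply_batched(mats, vecs)
```

Both functions should return the same per-item results; the batched version just moves the loop into a single sparse product. How much faster that is depends on batch size, sparsity and backend, so the 30x figure above should not be expected from this toy example.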