The method works great on most layers, but on the final projection in my transformer (1024 x 50k) I get
RuntimeError: cusolver error: CUSOLVER_STATUS_INVALID_VALUE, when calling `cusolverDnSgesvdj_bufferSize(handle, jobz, econ, m, n, A, lda, S, U, ldu, V, ldv, lwork, params)`
when executing `U, s, Vh = torch.linalg.svd(matrix)`.
The issue is fixed by using `U, s, Vh = torch.linalg.svd(matrix, full_matrices=False)`.
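
For anyone wondering why the flag matters here, this is a rough back-of-the-envelope sketch, not a confirmed diagnosis of the cuSOLVER error. It only computes the factor shapes that `torch.linalg.svd` documents for an (m, n) input and the float32 memory they would need. The helper names `svd_factor_shapes` and `float32_bytes` are made up for illustration, and n = 50,000 is an assumption since the post only says "50k".

```python
def svd_factor_shapes(m, n, full_matrices=True):
    """Shapes of (U, S, Vh) as documented for torch.linalg.svd on an (m, n) matrix."""
    k = min(m, n)
    if full_matrices:
        # Full SVD: square U and Vh
        return (m, m), (k,), (n, n)
    # Reduced SVD: only the first k = min(m, n) singular vectors are kept
    return (m, k), (k,), (k, n)


def float32_bytes(shape):
    """Bytes needed to store a float32 tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return 4 * count


m, n = 1024, 50_000  # assumed size; the post only says "1024 x 50k"
for full in (True, False):
    u, s, vh = svd_factor_shapes(m, n, full_matrices=full)
    print(f"full_matrices={full}: Vh shape {vh}, {float32_bytes(vh) / 1e9:.2f} GB")
```

With the default `full_matrices=True`, `Vh` would be 50,000 x 50,000, roughly 10 GB in float32, while the reduced form keeps it at 1024 x 50,000, roughly 0.2 GB. Whether that size is what actually triggers `CUSOLVER_STATUS_INVALID_VALUE` is not something I have verified, but the reduced form is usually all you need when only the leading singular vectors are used.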