I tried to use GaLore on nn.Linear(256, 267736).
Then I got the following error:
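torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 267.04 GiB.
The error is raised at U, s, Vh = torch.linalg.svd(matrix).
I think full_matrices=False may be required in that torch.linalg.svd call, since the default full_matrices=True builds a square 267736 x 267736 U for this weight.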
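
To back up the suggestion, here is a rough sketch of the arithmetic plus a small-scale check. It assumes the weight is float32 and has shape (267736, 256), and that only the leading singular vectors are needed afterwards. The helper svd_u_bytes and the 2000 x 16 stand-in matrix are made up for illustration and are not taken from the GaLore code.

```python
import torch


def svd_u_bytes(rows, cols, full_matrices, bytes_per_elem=4):
    """Bytes needed to store U for the SVD of a (rows x cols) matrix.

    Hypothetical helper for illustration; assumes float32 by default.
    """
    k = rows if full_matrices else min(rows, cols)
    return rows * k * bytes_per_elem


# Weight shape of nn.Linear(256, 267736) is (out_features, in_features).
rows, cols = 267736, 256
full_gib = svd_u_bytes(rows, cols, full_matrices=True) / 2**30
reduced_mib = svd_u_bytes(rows, cols, full_matrices=False) / 2**20
print(f"full U: {full_gib:.2f} GiB, reduced U: {reduced_mib:.2f} MiB")

# Small stand-in matrix so the check runs on any machine.
torch.manual_seed(0)
matrix = torch.randn(2000, 16)

# Reduced SVD: U is (2000, 16) instead of (2000, 2000).
U, s, Vh = torch.linalg.svd(matrix, full_matrices=False)

# The reduced factors still reconstruct the original matrix.
recon = U @ torch.diag(s) @ Vh
print(U.shape, s.shape, Vh.shape)
```

Under these assumptions the full U comes out at 267.04 GiB, which matches the allocation in the error message, while the reduced U is roughly 261 MiB.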