jiaweizzhao / GaLore

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Apache License 2.0

RuntimeError: diag(): Supports 1D or 2D tensors. Got 3D #17

Closed drimeF0 closed 7 months ago

drimeF0 commented 8 months ago
/usr/local/lib/python3.10/dist-packages/galore_torch/galore_projector.py in get_orthogonal_matrix(self, weights, rank, type)
     85         #make the smaller matrix always to be orthogonal matrix
     86         if type=='right':
---> 87             A = U[:, :rank] @ torch.diag(s[:rank])
     88             B = Vh[:rank, :]
     89 

RuntimeError: diag(): Supports 1D or 2D tensors. Got 3D

As I understand it, GaLore is not able to handle models that use 3D inputs/outputs?
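
For context, a minimal sketch of how this error can arise and one possible way around it. `torch.diag` accepts only 1D or 2D tensors, and `torch.linalg.svd` treats all leading dimensions as batch dimensions, so a 3D `s` means the tensor passed to SVD had more than two dimensions. The 4D convolution-style gradient and the helper `svd_factors_2d` below are assumptions made for illustration; they are not part of `galore_torch`, and flattening to 2D is only one workaround, not necessarily the fix adopted by the project.

```python
import torch


def svd_factors_2d(grad: torch.Tensor, rank: int):
    """Hypothetical helper (not a galore_torch API).

    Flattens every dimension after the first so that SVD sees a single
    2D matrix, then builds the same A and B factors as in the traceback.
    """
    mat = grad.reshape(grad.shape[0], -1).float()
    U, s, Vh = torch.linalg.svd(mat, full_matrices=False)
    A = U[:, :rank] @ torch.diag(s[:rank])  # s is 1D here, so diag() works
    B = Vh[:rank, :]
    return A, B


# Assumed example: a gradient shaped like a Conv2d weight (out, in, kh, kw).
conv_grad = torch.randn(8, 4, 3, 3)

# Batched SVD: the leading (8, 4) dims are batch dims, so s is 3D.
_, s_batched, _ = torch.linalg.svd(conv_grad, full_matrices=False)

try:
    torch.diag(s_batched[:2])  # same call pattern as line 87 in the traceback
    diag_failed = False
except RuntimeError:
    diag_failed = True

# Flattening to 2D first avoids the batched SVD entirely.
A, B = svd_factors_2d(conv_grad, rank=2)
```

Another option, again only a suggestion, is to put just the 2D parameters (for example `nn.Linear` weights) in the GaLore parameter group and leave the higher-dimensional ones to the regular optimizer path.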