jiaweizzhao / GaLore

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Apache License 2.0

Update galore_projector.py #50

Open jetaudio opened 1 month ago

jetaudio commented 1 month ago

Handle tensors that are distributed across different devices.
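
For readers unfamiliar with this kind of failure: a matrix multiply between tensors that live on different devices raises a runtime error in PyTorch. The snippet below is only a minimal sketch of one common way to guard against that, not the actual diff in this PR. The helper name `project_on_grad_device` and the argument names `grad` and `ortho` are hypothetical and chosen for illustration; it assumes the projection is a plain matrix product of the gradient with the transpose of a projection matrix.

```python
import torch


def project_on_grad_device(grad: torch.Tensor, ortho: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: project `grad` (m x n) with `ortho` (r x n).

    Assumption: the projection matrix may have been created on another
    device than the gradient, so it is moved to the gradient's device first.
    """
    # Tensor.to returns the same tensor when it is already on that device,
    # so the move costs nothing in the single-device case.
    ortho = ortho.to(grad.device)
    # Result has shape (m x r) and lives on the gradient's device.
    return torch.matmul(grad, ortho.t())


# Usage on CPU (on a multi-GPU setup the two tensors could start on
# different devices, e.g. "cuda:0" and "cuda:1").
grad = torch.randn(8, 4)
ortho = torch.randn(2, 4)
low_rank = project_on_grad_device(grad, ortho)
```

Moving the projection matrix rather than the gradient keeps the result on the device where the parameter's optimizer state is expected, which is a design assumption of this sketch.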