jiaweizzhao / GaLore

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Apache License 2.0

Extend GaLore Algorithm for General Tensor Decomposition #48

Closed Robertboy18 closed 1 month ago

Robertboy18 commented 1 month ago

The GaLore algorithm was originally designed to perform low-rank gradient approximation for matrices using Singular Value Decomposition (SVD). This pull request extends the algorithm to support general tensor decomposition, so it can handle tensors of order greater than 2. One motivating example is Neural Operators, which are used to solve partial differential equations and where the tensors are 5-dimensional.
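To illustrate the idea, here is a minimal NumPy sketch of a low-rank projection for a higher-order gradient tensor. It assumes a Tucker-style decomposition (truncated higher-order SVD), which may differ from the decomposition used in this PR; the function names `unfold`, `mode_product`, `compute_factors`, `project`, and `project_back` are hypothetical and not part of the GaLore API.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-`mode` unfolding: move that axis to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Multiply `matrix` (shape new_dim x old_dim) along axis `mode` of `tensor`."""
    moved = np.moveaxis(tensor, mode, -1)   # (..., old_dim)
    result = moved @ matrix.T               # (..., new_dim)
    return np.moveaxis(result, -1, mode)


def compute_factors(grad, ranks):
    """Assumed scheme: one orthonormal factor per mode, taken from the
    left singular vectors of each unfolding (truncated HOSVD)."""
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(grad, mode), full_matrices=False)
        factors.append(u[:, :rank])         # shape (dim, rank)
    return factors


def project(grad, factors):
    """Project the full gradient onto the small core tensor."""
    core = grad
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core


def project_back(core, factors):
    """Map a core-sized tensor back to the full gradient shape."""
    full = core
    for mode, u in enumerate(factors):
        full = mode_product(full, u, mode)
    return full


# Usage: a random 5-dimensional "gradient" with illustrative shape and ranks.
rng = np.random.default_rng(0)
grad = rng.standard_normal((4, 6, 5, 3, 3))
ranks = (2, 3, 2, 3, 3)

factors = compute_factors(grad, ranks)
core = project(grad, factors)
approx = project_back(core, factors)

print(core.shape)    # (2, 3, 2, 3, 3)
print(approx.shape)  # (4, 6, 5, 3, 3)
```

In a GaLore-style optimizer, the optimizer state would be kept at the size of `core` and the resulting update mapped back with `project_back`. For a 2-dimensional gradient, the factors here are the leading left and right singular vectors, so the sketch reduces to an SVD-based projection applied on both sides of the matrix.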

Changes:

Benefits:

Please review the changes and provide any feedback or suggestions for improvement. Let me know if you have any questions or concerns.