jiaweizzhao / GaLore

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Apache License 2.0
1.24k stars 131 forks

be a bit more lenient on transformers version #5

Closed winglian closed 3 months ago

winglian commented 3 months ago

Hi there! Amazing research on this. We're looking to integrate GaLore into the axolotl project here: https://github.com/OpenAccess-AI-Collective/axolotl/pull/1370

One issue I ran into is that the transformers dependency pin is a bit strict, so it would be great if we could loosen it. Thanks!
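Loosening a pin usually means widening the dependency's version specifier so newer releases are accepted. A minimal sketch of what such a change could look like in `setup.py` (the exact pinned version is not shown in this thread; `4.31.0` is purely illustrative):

```python
# Hypothetical install_requires sketch -- the version number is illustrative,
# not taken from this thread.
install_requires = [
    # before: an exact pin rejects every other transformers release
    # "transformers==4.31.0",
    # after: a lower bound accepts any newer release; an upper cap
    # (e.g. "transformers>=4.31.0,<5") can guard against breaking majors
    "transformers>=4.31.0",
]
```

A lower bound with an optional upper cap is the common compromise: downstream projects like axolotl can resolve their own transformers version without conflicting with the pin.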

jiaweizzhao commented 3 months ago

Hi @winglian, the optimizer module itself supports the latest transformers versions, but torchrun_main.py is a bit outdated. I will merge your request once we upgrade torchrun_main.py. For now, it is safe to use the latest dependencies as long as you only use the optimizers in /galore_torch.
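For reference, calling only the optimizer module means importing from `galore_torch` directly and passing the GaLore-specific settings as a parameter group, as described in the repo's README. A sketch, assuming `galore-torch` and `torch` are installed; the hyperparameter values (`rank`, `update_proj_gap`, `scale`) are illustrative defaults, and `model` stands for any `torch.nn` module whose 2-D weights you want to project:

```python
import torch
from galore_torch import GaLoreAdamW  # optimizer-only entry point; no transformers pin

model = torch.nn.Linear(256, 256)  # placeholder model for illustration

# GaLore settings apply per parameter group; 2-D weight matrices get the
# low-rank gradient projection, everything else is optimized normally.
param_groups = [
    {"params": [model.bias]},  # regular AdamW behavior
    {"params": [model.weight], "rank": 128, "update_proj_gap": 200,
     "scale": 0.25, "proj_type": "std"},
]
optimizer = GaLoreAdamW(param_groups, lr=0.01)
```

Because this path never imports the repo's training script, it sidesteps the outdated pin in torchrun_main.py entirely.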

jiaweizzhao commented 3 months ago

This PR can be closed, as galore-torch no longer requires a specific transformers version.