jiaweizzhao / GaLore

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Apache License 2.0
1.24k stars, 131 forks

Update torchrun_main.py #11

Closed: darthjaja6 closed this 3 months ago

darthjaja6 commented 3 months ago

`c4` will soon be deprecated; this change uses `allenai/c4` instead.
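For reference, a minimal sketch of what the rename could look like on the loading side. The exact call in `torchrun_main.py` is not shown in this thread, so the `"en"` config, the streaming flag, and the helper names `resolve_dataset_id` and `load_c4_stream` are assumptions made for illustration, not the repository's actual code.

```python
# Hypothetical helpers: map the soon-to-be-deprecated dataset ID to its replacement.
DEPRECATED_DATASET_IDS = {"c4": "allenai/c4"}


def resolve_dataset_id(name: str) -> str:
    """Return the replacement ID for a deprecated name, or the name unchanged."""
    return DEPRECATED_DATASET_IDS.get(name, name)


def load_c4_stream(split: str = "train"):
    """Stream C4 from the Hugging Face Hub under the namespaced ID.

    Assumes the English ("en") config and streaming mode; adjust to match
    whatever the training script actually uses.
    """
    # Imported lazily so the ID mapping above can be used without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(resolve_dataset_id("c4"), "en", split=split, streaming=True)
```

Calling `load_c4_stream("train")` requires the `datasets` package and network access. Only the dataset ID changes here; the other arguments are passed through as before.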