torch compile and cuda alloc improvements

axolotl-ai-cloud / axolotl

Go ahead and axolotl questions

https://axolotl-ai-cloud.github.io/axolotl/

Apache License 2.0

7.48k stars 808 forks

Closed: winglian closed 1 month ago

winglian commented 1 month ago

Add back the missing torch.compile support to the HF trainer, and set the CUDA allocation env var to improve Llama VRAM use by up to 17%.
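
For readers who want to try the two changes by hand, here is a rough sketch, not the code from this PR. It assumes the env var in question is PyTorch's `PYTORCH_CUDA_ALLOC_CONF` and that the value is `expandable_segments:True`; the description above does not name either, so treat both as placeholders. The helper names `set_cuda_alloc_conf` and `trainer_compile_kwargs` are hypothetical. `torch_compile` is an existing `transformers.TrainingArguments` parameter.

```python
import os

# ASSUMPTION: the PR text does not state which allocator setting it uses.
# "expandable_segments:True" is a real PyTorch allocator option, chosen
# here only for illustration.
ASSUMED_ALLOC_CONF = "expandable_segments:True"


def set_cuda_alloc_conf(value: str = ASSUMED_ALLOC_CONF) -> str:
    """Set PYTORCH_CUDA_ALLOC_CONF unless the user already chose a value.

    Call this before the first CUDA allocation; the allocator reads the
    variable when it is initialized. Returns the value now in effect.
    """
    return os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", value)


def trainer_compile_kwargs(enabled: bool) -> dict:
    """Hypothetical helper: kwargs to forward to transformers.TrainingArguments."""
    return {"torch_compile": enabled}


if __name__ == "__main__":
    print(set_cuda_alloc_conf())
    print(trainer_compile_kwargs(True))
```

Using `setdefault` means a value exported in the shell still wins, so the default can be overridden without touching code. The resulting dict would be passed as `TrainingArguments(output_dir=..., **trainer_compile_kwargs(True))`.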