Hi authors, thank you so much for your huge contribution!! I'm fairly new to the memory-optimization techniques for training large models, and I'm struggling to get Llama-7B training started on my setup (8× NVIDIA RTX A6000, 48 GB of GPU memory each). What changes to the optimization config would you recommend to get training working in this case? Thank you so much!
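For context, here's the direction I was guessing at. I don't know whether your codebase goes through the HF `Trainer` path, so the field names below are just standard `TrainingArguments` options, not your actual config, and the values are my rough assumptions:

```python
from transformers import TrainingArguments

# A sketch of the memory-saving knobs I was considering for
# 8x A6000 (48 GB each). These are real HF TrainingArguments
# fields, but I'm not sure they map onto this repo's config.
args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,   # smallest possible micro-batch
    gradient_accumulation_steps=16,  # keep the effective batch size up
    gradient_checkpointing=True,     # recompute activations to save memory
    bf16=True,                       # A6000 is Ampere, so bf16 is supported
    fsdp="full_shard auto_wrap",     # shard params/grads/optimizer states across the 8 GPUs
)
```

My back-of-the-envelope math is that full fine-tuning of 7B parameters with Adam needs on the order of 100+ GB of training state, so I assumed some combination of sharding, offloading, and gradient checkpointing is required to fit in 48 GB per GPU, but please correct me if I'm off base.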