huggingface / optimum-tpu

Google TPU optimizations for transformers models
Apache License 2.0

Support for Llama 3.1 and 3.2 fine tuning #114

Open DimensionSTP opened 3 days ago

DimensionSTP commented 3 days ago

Hello,

I am deeply interested in your Optimum-TPU project. I am planning to fine-tune the Llama 3.1 and 3.2 models in my native language and a specific domain, with a fairly large dataset (approximately 60B tokens). I am using Google TPU Pods, but I have been facing significant challenges implementing model-parallel training from scratch, saving unified checkpoints in the safetensors format, setting up appropriate logging, and configuring hyperparameters.

While exploring solutions, I came across the Optimum-TPU project, which seems incredibly useful. However, I noticed that it currently only supports up to Llama 3. Are there any plans to extend support to Llama 3.1 and 3.2 for fine-tuning? I strongly hope that future updates will include support for these versions as well.

Thank you for considering this request!

tengomucho commented 3 days ago

Hi @DimensionSTP! We do not support Llama 3.1 or 3.2 yet, but we should add that support before the end of the year. That said, if all you want is to fine-tune these models, you can probably just follow the steps in our Llama fine-tuning example and it should work (though this is still untested). For serving/inference you would still need better sharding support, but for fine-tuning it should be fine.
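
For reference, a minimal sketch of what "follow the Llama fine-tuning example" could look like, adapted to a Llama 3.1 checkpoint. The `fsdp_v2` helper names follow optimum-tpu's published fine-tuning examples; the model id, dataset, and hyperparameters below are illustrative assumptions and, as noted above, untested with 3.1/3.2 (exact `SFTTrainer` arguments also depend on your trl version):

```python
# Hedged sketch: reuse the existing optimum-tpu Llama fine-tuning flow,
# only swapping in a Llama 3.1 checkpoint. Requires a TPU host with
# torch_xla, transformers, trl, datasets, and optimum-tpu installed.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer
from optimum.tpu import fsdp_v2

# Enable PyTorch/XLA FSDPv2 (SPMD sharding) before loading the model.
fsdp_v2.use_fsdp_v2()

model_id = "meta-llama/Llama-3.1-8B"  # assumption: untested with optimum-tpu
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Placeholder dataset id: substitute your 60B-token domain corpus.
dataset = load_dataset("your-org/your-dataset", split="train")

# Derive the FSDPv2 sharding arguments for this model's decoder layers.
fsdp_args = fsdp_v2.get_fsdp_training_args(model)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    dataset_text_field="text",       # column name in the dataset
    max_seq_length=1024,
    args=TrainingArguments(
        output_dir="./llama-3.1-finetuned",
        per_device_train_batch_size=8,
        num_train_epochs=1,
        optim="adafactor",
        dataloader_drop_last=True,   # needed for even SPMD sharding
        **fsdp_args,
    ),
)
trainer.train()
```

If the 3.1/3.2 architectures load cleanly through `AutoModelForCausalLM` (they share the Llama architecture, with changes such as RoPE scaling handled via the model config), this path may just work; any failure would most likely surface at model loading or sharding time.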