unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

Training Setting #1300

Open nichellehouston opened 4 days ago

nichellehouston commented 4 days ago

I want to train a model on a 48 GB GPU for language translation. How should I change these settings to get the highest quality?

r = 16
lora_alpha = 16
lora_dropout = 0
random_state = 3407
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
warmup_steps = 5
num_train_epochs = 1
learning_rate = 2e-4
seed = 3407
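For reference, this is roughly how those settings sit in the usual Unsloth + TRL setup; the model name and dataset below are placeholders from the example notebooks, not my exact script:

from unsloth import FastLanguageModel, is_bfloat16_supported
from trl import SFTTrainer
from transformers import TrainingArguments

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Llama-3.2-3B-Instruct",  # placeholder model
    max_seq_length = 2048,
    load_in_4bit = True,
)

model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    lora_alpha = 16,
    lora_dropout = 0,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing = "unsloth",
    random_state = 3407,
)

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,            # placeholder: my translation dataset
    dataset_text_field = "text",
    max_seq_length = 2048,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        num_train_epochs = 1,
        learning_rate = 2e-4,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        optim = "adamw_8bit",
        logging_steps = 1,
        seed = 3407,
        output_dir = "outputs",
    ),
)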

dame-cell commented 4 days ago

@nichellehouston The highest quality you can reach mostly depends on the model and the quality of your dataset, but the main hyperparameters to tune are:

# Model hyperparameters
learning_rate = 2e-4   # Start with a lower learning rate for stability
lora_alpha = 32        # Higher value to capture more complexity
num_train_epochs = 3   # Train for more epochs to capture better patterns
r = 16                 # Small r: faster training, less adaptation capacity
                       # Large r: slower training, more powerful adaptation (more parameters)

Train for a few steps, then stop and save the loss for comparison. Next, change the hyperparameter values, train again, and compare the losses.
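A rough sketch of what I mean, assuming the same Unsloth + SFTTrainer setup as in the example notebooks; run_short_trial is just a hypothetical helper that rebuilds the model and trainer with the overrides and uses max_steps so each trial only runs a few steps:

import gc, torch
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments

def run_short_trial(learning_rate, lora_alpha, max_steps = 30):
    # Reload the base model and attach fresh LoRA adapters for every trial
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name = "unsloth/Llama-3.2-3B-Instruct",  # placeholder model
        max_seq_length = 2048, load_in_4bit = True)
    model = FastLanguageModel.get_peft_model(
        model, r = 16, lora_alpha = lora_alpha, lora_dropout = 0, random_state = 3407,
        target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                          "gate_proj", "up_proj", "down_proj"])
    trainer = SFTTrainer(
        model = model, tokenizer = tokenizer,
        train_dataset = dataset,            # placeholder: your dataset
        dataset_text_field = "text", max_seq_length = 2048,
        args = TrainingArguments(
            per_device_train_batch_size = 2, gradient_accumulation_steps = 4,
            warmup_steps = 5, max_steps = max_steps, learning_rate = learning_rate,
            optim = "adamw_8bit", seed = 3407, logging_steps = 1, output_dir = "outputs"))
    loss = trainer.train().training_loss    # average loss over the short run
    del model, trainer; gc.collect(); torch.cuda.empty_cache()  # free VRAM before the next trial
    return loss

# Compare a few candidate values and keep the combination with the lowest loss
results = {}
for lr in (2e-4, 1e-4):
    for alpha in (16, 32):
        results[(lr, alpha)] = run_short_trial(learning_rate = lr, lora_alpha = alpha)

for (lr, alpha), loss in sorted(results.items(), key = lambda kv: kv[1]):
    print(f"learning_rate={lr}, lora_alpha={alpha} -> loss={loss:.4f}")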

Erland366 commented 3 days ago

I feel like you can increase per_device_train_batch_size to maximize your VRAM usage, since your GPU has 48 GB .-.
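For example (just a rough guess for 48 GB, watch nvidia-smi and scale back if you hit OOM), keeping the effective batch size the same:

# Bigger micro-batch, fewer accumulation steps, same effective batch of 8
per_device_train_batch_size = 8   # was 2 -- uses more VRAM per step
gradient_accumulation_steps = 1   # was 4 -- effective batch = 8 * 1 = 8, unchanged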