Vital1162 opened 1 month ago
Oh, I had not seen this new trainer code yet ... Okay, let's try that ... Thanks, buddy.
Observation: it is just as slow as 'trainer' -- a 5-6 hour ETA for a 280-step training run.
@Vital1162 We worked with the Hugging Face team to add the fix into transformers!
You'll have to use the latest transformers temporarily (you can continue using unsloth_train or just use trainer.train()):
!pip install unsloth
# Also get the latest nightly Unsloth!
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip uninstall transformers -y && pip install --upgrade --no-cache-dir "git+https://github.com/huggingface/transformers.git"
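If it helps, after upgrading, the training call itself can stay roughly as it is in the notebooks. A minimal sketch, assuming model, tokenizer, and trainer are already set up as in the Unsloth notebooks and that unsloth_train is importable from the top-level unsloth package:

# Option 1: keep using Unsloth's wrapper (works on older transformers too)
from unsloth import unsloth_train
trainer_stats = unsloth_train(trainer)

# Option 2: with the patched transformers installed above, the standard call is fine as well
# trainer_stats = trainer.train()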
Thanks, guys ... you do good work.
@danielhanchen Thank you for your reply. I haven't rechecked yet, but when I run the trainer it asks me for a wandb API key.
Is it still okay if I disable it like this?
wandb.init(mode="disabled")
wandb: WARNING The `run_name` is currently set to the same value as `TrainingArguments.output_dir`. If this was not intended, please specify a different run name by setting the `TrainingArguments.run_name` parameter.
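(As an aside, the API-key prompt can also be avoided before wandb.init is ever reached by disabling wandb through its own environment variable. A small sketch, independent of Unsloth; WANDB_MODE is wandb's setting, not something from the notebooks:)

import os
os.environ["WANDB_MODE"] = "disabled"  # wandb no-ops and never asks for an API key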
@Vital1162 Oh sorry, just fixed it - see https://github.com/unslothai/unsloth/issues/1153. I updated all training notebooks - please edit the TrainingArguments part by adding report_to = "none". For example:
args = TrainingArguments(
    per_device_train_batch_size = 2,
    gradient_accumulation_steps = 4,
    ...
),
should be edited to:
args = TrainingArguments(
    per_device_train_batch_size = 2,
    gradient_accumulation_steps = 4,
    ...
    report_to = "none", # Use this for WandB etc
),
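For context, in the notebooks this block sits inside the SFTTrainer call, so the edited version would look roughly like this (a sketch only - model, tokenizer, and dataset come from the notebook, and the other hyperparameters are just placeholders):

from trl import SFTTrainer
from transformers import TrainingArguments

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        max_steps = 60,
        learning_rate = 2e-4,
        output_dir = "outputs",
        report_to = "none", # disables wandb and other loggers
    ),
)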
After the gradient accumulation fix, I tried to continue training the pre-trained Llama 3.2 3B model on my datasets but got these issues during training. Does anyone have a solution?