Open x6p2n9q8a4 opened 3 months ago
Oh did you set random_state = 3407 in FastLanguageModel.get_peft_model?
Yes! I set the seed many times:
1st:

```python
from transformers import set_seed as transformers_set_seed

transformers_set_seed(3407)
```

2nd:

```python
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,  # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
    lora_dropout = 0,  # Supports any, but = 0 is optimized
    bias = "none",     # Supports any, but = "none" is optimized
    use_gradient_checkpointing = "unsloth",  # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = False,   # We support rank stabilized LoRA
    loftq_config = None,  # And LoftQ
)
```

3rd:

```python
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    eval_dataset = val_dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False,  # Can make training 5x faster for short sequences.
    args = SFTConfig(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = 1500,
        learning_rate = 2e-4,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        ...
    ),
)
```
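Note that seeding the trainer and the PEFT setup may still not be enough on a GPU, since some CUDA/cuDNN kernels are nondeterministic regardless of the seed. A minimal sketch of pinning every RNG the stack may touch plus requesting deterministic cuDNN kernels (the helper name `set_full_determinism` is my own, not an Unsloth API):

```python
import random

import numpy as np
import torch


def set_full_determinism(seed: int = 3407) -> None:
    """Seed every RNG the training stack may touch."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op on CPU-only machines
    # Prefer deterministic cuDNN kernels; this can slow training down.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


# Re-seeding should reproduce the same random tensors.
set_full_determinism(3407)
a = torch.randn(4)
set_full_determinism(3407)
b = torch.randn(4)
assert torch.equal(a, b)
```

Even with all of this, identical results are only expected on the same GPU, driver, and library versions.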
Oh apologies - do you know if it works fine now? If it's purely random training, that's not good.
Hi, is this fixed? I have the same issue. I set the random seed in FastLanguageModel.get_peft_model and in the SFTTrainer training arguments.
@EirikVinje Apologies on the delay - it should hopefully be fine. How different are the training runs? If they're on different GPUs / setups, then yes, you will get different results.
@danielhanchen the training runs on the same GPU (RTX 2080 Ti). These results are generated by the shell script.
@danielhanchen Having the same issue as Eirik, using same hyperparameters but getting different results.
@danielhanchen I am using unsloth for a research project and have the same issue of differing results across identical runs, with the same seeding and on the same GPU. The inability to reproduce results is a large downside to all the great features of unsloth. I hope you manage to fix this as soon as possible, as I am otherwise very happy with the library!
@EirikVinje @TobiasBrambo @Skageb Apologies on the delay and the issue - hmm I just can't seem to repro it - are you saying the finetuning results are different or the generations are different? Generations need temperature = 0 to retain the same outputs.
@EirikVinje Ok that is a bit interesting - I'm just confused since every test I've done shows it's reproducible (I run Colab like every day to check the losses, and they match), so I'm stumped :(
I'll reopen this so I can investigate this more
@danielhanchen for some models you cannot set temperature = 0.0, e.g. "Qwen/Qwen2-0.5B-Instruct". Initially this was how the model was evaluated:

```python
outputs = self.model.generate(**inputs, max_new_tokens=100, use_cache=True)
```

But I also tried with this:

```python
outputs = self.model.generate(**inputs, max_new_tokens=100, use_cache=True, temperature=0.0)
```
Could it be a problem that I'm running a .py script instead of a .ipynb notebook?
Oh set do_sample = False
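With do_sample = False, generate uses greedy decoding: each step takes the argmax token instead of drawing from the softmax distribution, so repeated runs produce identical outputs. A toy pure-Python sketch of the distinction (the logits and the pick_token helper are made up for illustration, not the transformers implementation):

```python
import math
import random


def pick_token(logits, do_sample, rng):
    """Pick the next token id from raw logits."""
    if not do_sample:
        # Greedy decoding: argmax is identical on every run.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Sampling: draw from the softmax distribution -> depends on RNG state.
    exps = [math.exp(l) for l in logits]
    total = sum(exps)
    return rng.choices(range(len(logits)), weights=[e / total for e in exps])[0]


logits = [0.1, 2.5, -1.0, 0.7]

# Greedy picks token 1 no matter which RNG state we pass in.
greedy_picks = {pick_token(logits, False, random.Random(s)) for s in range(10)}
print(greedy_picks)  # {1}

# Sampling can pick different tokens under different RNG states.
sampled_picks = {pick_token(logits, True, random.Random(s)) for s in range(10)}
```

This is why a seed alone doesn't make generation reproducible unless sampling is disabled (or the RNG state is reset before every call).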
Hi authors,
In the SFTTrainer we set seed = 3407, but the training procedure is still random: the test-set performance and the loss curve differ across runs with the same config.
Thanks,