Feature request
Hyperparameter search is an essential step for finding the hyperparameters that give the best machine learning or deep learning model performance. I was using the hyperparameter_search() method to find optimal hyperparameter values, and I also wanted to tune LoRA configuration parameters such as the rank r and alpha. However, I found that it is not able to tune LoRA configuration parameters.
Motivation
The following code snippet illustrates the issue:
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from trl import SFTTrainer

# peft_config and training_args are defined elsewhere in my setup;
# minimal placeholder values are shown here so the snippet is self-contained
peft_config = LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM")
training_args = TrainingArguments(output_dir="./hp_search_output")

def model_init():
    device_map = {"": torch.cuda.current_device()} if torch.cuda.is_available() else None
    model_kwargs_dict = dict(
        # set this to True if your GPU supports it
        # (Flash Attention drastically speeds up model computations)
        # attn_implementation="flash_attention_2",
        torch_dtype="auto",
        # set to False as we're going to use gradient checkpointing
        use_cache=False,
        device_map=device_map,
    )
    bnb_config_args = dict(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
        bnb_4bit_use_double_quant=False,
    )
    bnb_config = BitsAndBytesConfig(**bnb_config_args)
    model_kwargs_dict["quantization_config"] = bnb_config
    model = AutoModelForCausalLM.from_pretrained(
        "facebook/opt-125m",
        return_dict=True,
        # **model_kwargs_dict
    )
    print(peft_config)
    model = get_peft_model(model, peft_config=peft_config)
    return model

dataset = load_dataset("imdb", split="train")
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
dataset1 = dataset.select([0, 10, 20, 30, 40, 50])
dataset2 = dataset.select([0, 10, 20, 30, 40, 50])

trainer = SFTTrainer(
    model=None,
    args=training_args,
    model_init=model_init,
    tokenizer=tokenizer,
    train_dataset=dataset1,
    eval_dataset=dataset2,
    dataset_text_field="text",
    max_seq_length=512,
)

def optuna_hp_space(trial):
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-4, log=True),
        "per_device_train_batch_size": trial.suggest_categorical(
            "per_device_train_batch_size", [16, 32, 64, 128]
        ),
        # the LoRA rank I would like to tune as well
        "r": trial.suggest_float("r", 2, 4, log=True),
    }

trainer.hyperparameter_search(
    direction=["minimize"],
    backend="optuna",
    hp_space=optuna_hp_space,
    n_trials=2,
)
Running the code above produces output like:
[I 2024-02-29 09:42:36,869] A new study created in memory with name: no-name-331fbdff-6465-42f8-9c97-ad5c6c8c4703
Trying to set r in the hyperparameter search but there is no corresponding field in TrainingArguments.
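Until LoRA parameters are supported natively in the search space, one possible workaround is to read the trial inside model_init itself: recent transformers versions pass the backend's trial object to model_init when it accepts a single argument. The sketch below assumes that behaviour; the search ranges are illustrative, and only fields that actually exist on TrainingArguments are kept in hp_space to avoid the warning above.

# Sketch of a possible workaround (not an official API for tuning LoRA params):
# suggest LoRA hyperparameters directly on the Optuna trial inside model_init.
def model_init(trial):
    model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m", return_dict=True)
    if trial is not None:
        # illustrative ranges, chosen only for this example
        r = trial.suggest_int("r", 2, 16, log=True)
        lora_alpha = trial.suggest_int("lora_alpha", 8, 64, log=True)
    else:
        # defaults used when the trainer builds the initial model without a trial
        r, lora_alpha = 8, 16
    peft_config = LoraConfig(r=r, lora_alpha=lora_alpha, task_type="CAUSAL_LM")
    return get_peft_model(model, peft_config)

def optuna_hp_space(trial):
    # keep only TrainingArguments fields here; LoRA params are handled in model_init
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-4, log=True),
        "per_device_train_batch_size": trial.suggest_categorical(
            "per_device_train_batch_size", [16, 32, 64, 128]
        ),
    }

This keeps the native hp_space mechanism for standard training arguments, while the LoRA rank and alpha are re-sampled each time the model is rebuilt for a trial. It would still be much cleaner if hyperparameter_search() supported LoRA/PEFT configuration fields directly.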
Your contribution
NA