I am currently training a model using DPO, and I'm adapting the dataset dynamically during training. My current approach looks like this:
trainer = DPOTrainer(
    model,
    None,
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
    beta=args.beta,
    max_prompt_length=1024,
    max_length=1536,
)
for i in range(repetitions):
    train_result = trainer.train()
    # Adapt the dataset based on the training result
    dataset = get_adapted_dataset(train_result)
    with PartialState().local_main_process_first():
        # Tokenize the updated dataset
        print("Updating the training dataset")
        trainer.train_dataset = dataset.map(trainer.tokenize_row, num_proc=None)
Is this the correct way to adapt the dataset during training, or is there a more appropriate approach for this scenario?
An iterable dataset might be better suited here. If the way you update the dataset depends on training results, you'll probably also need to set up a callback.
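A minimal, standard-library-only sketch of that pattern (the `AdaptiveSource` class and `on_epoch_end` method here are illustrative stand-ins for a `datasets.IterableDataset` generator and a `transformers` `TrainerCallback` hook, not real library APIs):

```python
class AdaptiveSource:
    """Holds the current pool of examples; a generator reads it lazily,
    so swapping `self.examples` changes what the next pass yields."""

    def __init__(self, examples):
        self.examples = list(examples)

    def generator(self):
        # Re-reads self.examples on each pass, so updates made between
        # epochs are picked up automatically at the next iteration.
        yield from self.examples


class AdaptDatasetCallback:
    """Stand-in for a TrainerCallback: after each epoch, rebuild the
    example pool from the latest training result."""

    def __init__(self, source, adapt_fn):
        self.source = source
        self.adapt_fn = adapt_fn

    def on_epoch_end(self, train_result):
        self.source.examples = self.adapt_fn(train_result)


# Usage: mimic two "epochs" with an adaptation step in between.
source = AdaptiveSource([{"prompt": "a"}, {"prompt": "b"}])
callback = AdaptDatasetCallback(
    source, lambda result: [{"prompt": f"adapted-{result}"}]
)

epoch_0 = list(source.generator())     # original examples
callback.on_epoch_end(train_result=0)  # adapt after the first epoch
epoch_1 = list(source.generator())     # generator now sees the new pool
```

The key point is that an iterable dataset defers reading its examples until iteration time, so a callback that mutates the underlying pool takes effect without rebuilding the trainer.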