Hey @catcathh, thank you for your great contribution!

When trying to run `train_personalized.py`, `setup_optimizers()` raises an error: `params += list(models.train_lora.module.parameters())` needs to be changed to `params += list(models.train_lora.parameters())`.
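If you want to keep a single code path for both single-GPU and DDP runs, here is a minimal sketch that unwraps the DDP container only when it is actually present (the `collect_lora_params` helper name is mine, not part of the repo):

```python
import torch.nn as nn

def collect_lora_params(train_lora):
    # Hypothetical helper: unwrap DistributedDataParallel only if the model
    # is actually wrapped, so setup_optimizers() works with and without DDP.
    if isinstance(train_lora, nn.parallel.DistributedDataParallel):
        return list(train_lora.module.parameters())
    return list(train_lora.parameters())

# In setup_optimizers():
#   params += collect_lora_params(models.train_lora)
```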
Also, if not running under DDP, `backward_pass()` needs to be fixed as well by guarding the barrier call:
```python
def backward_pass(self, update, loss_adjusted, models: Models, optimizers: TrainingCore.Optimizers, schedulers: Schedulers):
    if update:
        # Only synchronize ranks when actually running under DDP
        if self.config.use_ddp:
            torch.distributed.barrier()
        loss_adjusted.backward()
        grad_norm = nn.utils.clip_grad_norm_(models.train_lora.parameters(), 1.0)
```
Now it's failing on loading the LoRA, which I still haven't gotten around to fixing.
Currently, the LoRA training code only supports multi-GPU setups. If you need to train on a single GPU, you might have to adjust parts of the code like model loading.
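For anyone attempting the single-GPU route, here is a rough sketch of what that loading adjustment might look like, assuming the failure comes from the `module.` prefix that DDP adds to state-dict keys and that the checkpoint is a plain state dict (the function name and signature are illustrative, not the repo's actual loading code):

```python
import torch

def load_lora_single_gpu(train_lora, checkpoint_path, device="cuda"):
    # Hypothetical loader: a checkpoint saved from a DDP-wrapped model prefixes
    # every key with "module.", which an unwrapped single-GPU model won't match.
    state = torch.load(checkpoint_path, map_location=device)
    state = {k[len("module."):] if k.startswith("module.") else k: v
             for k, v in state.items()}
    missing, unexpected = train_lora.load_state_dict(state, strict=False)
    if missing or unexpected:
        print(f"Missing keys: {missing}\nUnexpected keys: {unexpected}")
    return train_lora
```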