NeuralNotW0rk / LoRAW

Flexible LoRA Implementation to use with stable-audio-tools
MIT License

How to continue training a LoRA #13

Open GoombaProgrammer opened 3 days ago

GoombaProgrammer commented 3 days ago

I trained a LoRA, stopped training, and now I want to continue training from the same checkpoint. But using --lora-ckpt-path fails with this traceback:

File "D:\tts\ok\stable-audio-tools-main\train.py", line 78, in main
    torch.load(args.lora_ckpt_path, map_location="cpu")["state_dict"]
KeyError: 'state_dict'
GoombaProgrammer commented 3 days ago

OK, removing ["state_dict"] works (currently testing whether training works).
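The workaround above suggests the LoRA checkpoint is saved as a raw state dict, while train.py expects it wrapped in a {"state_dict": ...} dict. A minimal sketch of a loader that handles both layouts (load_lora_state is a hypothetical helper, not part of LoRAW):

```python
import torch
import torch.nn as nn

def load_lora_state(path):
    """Load a checkpoint whether it is a raw state dict or wrapped
    in a {"state_dict": ...} dict, as train.py assumes."""
    ckpt = torch.load(path, map_location="cpu")
    # Fall back to the raw dict when there is no "state_dict" key.
    if isinstance(ckpt, dict) and "state_dict" in ckpt:
        return ckpt["state_dict"]
    return ckpt

# Usage sketch: both save layouts load to the same state dict.
layer = nn.Linear(4, 4)
torch.save(layer.state_dict(), "lora_raw.ckpt")
torch.save({"state_dict": layer.state_dict()}, "lora_wrapped.ckpt")

raw = load_lora_state("lora_raw.ckpt")
wrapped = load_lora_state("lora_wrapped.ckpt")
print(sorted(raw.keys()))  # same keys either way: ['bias', 'weight']
```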

GoombaProgrammer commented 3 days ago

I now get a different error. Might be because I changed the dataset a little.

NeuralNotW0rk commented 2 days ago

I can investigate the issue in more depth much later, but for now what is the new error you are getting?

GoombaProgrammer commented 2 days ago

> I can investigate the issue in more depth much later, but for now what is the new error you are getting?

     norm_scale = self.dora_mag.weight.view(-1) / (torch.linalg.norm(new_weight_v, dim=1)).detach()
 RuntimeError: The size of tensor a (3072) must match the size of tensor b (1536) at non-singleton dimension 0

That is the new error
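The mismatch can be reproduced in isolation: the DoRA magnitude vector saved in the checkpoint has one entry per output row of the adapted weight, so dividing it by the row norms of a differently sized direction matrix fails. A minimal sketch (the shapes are taken from the error message; the variable names mirror the traceback but the code is illustrative, not LoRAW's):

```python
import torch

# Magnitude vector as saved by the first run: one entry per output row.
dora_mag = torch.ones(3072)
# Direction matrix in the resumed run, with half as many rows.
new_weight_v = torch.randn(1536, 8)

try:
    # Same expression as in the traceback: element-wise division of a
    # (3072,) tensor by a (1536,) tensor cannot broadcast.
    norm_scale = dora_mag.view(-1) / torch.linalg.norm(new_weight_v, dim=1).detach()
except RuntimeError as e:
    print(e)  # the size-mismatch message from the issue
```

This points at the loaded DoRA parameters no longer matching the dimensions of the layers they are applied to, rather than at the dataset.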

GoombaProgrammer commented 2 days ago

Nope, even without changing the dataset between the two runs, it still gives that error.