playerzer0x opened 3 days ago
well, that is normal. you are no longer resuming the old training run, as you have changed everything.
it's not really recommended to change anything within a single training run, let alone the entire dataset or the step schedule
This change would be across two separate training runs. I'm following Caith's recommendation to train new subjects into a "base LoKR" that was previously trained on styles.
you want to use --init_lora to begin a new training run from the old lokr, then. it takes a path to the safetensors file
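For context, a minimal sketch of what initializing from old weights means, as opposed to resuming a checkpoint. This is an assumed illustration built on the safetensors API, not the trainer's actual implementation; the function name and state dict are placeholders:

```python
import torch
from safetensors.torch import load_file

def init_adapter_from_file(adapter_state: dict[str, torch.Tensor], path: str) -> None:
    """Seed a freshly built adapter from an old LoKR safetensors file.

    Unlike resuming a checkpoint, only the weights carry over: optimizer
    state, LR scheduler, and the global step all start from zero.
    """
    old_weights = load_file(path)
    for name, tensor in old_weights.items():
        if name not in adapter_state:
            continue  # key layout changed between runs; skip unknown tensors
        if adapter_state[name].shape != tensor.shape:
            # a mismatch like this is what surfaces as a tensor size error
            # on the first training step
            raise ValueError(
                f"shape mismatch for {name}: {tensor.shape} vs {adapter_state[name].shape}"
            )
        adapter_state[name].copy_(tensor)
```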
Tried this, but the trainer threw a tensor size error on the first step. Went back to using epochs, and training starts fine. With steps, the resume exits immediately:
2024-11-21 01:52:34,920 [INFO] Reached the end (58 epochs) of our training run (42 epochs). This run will do zero steps.
If I set max_train_steps to 0 and change num_train_epochs to 100, training starts fine. Haven't counted, but the updated dataset for the resume may be smaller than the original.
My brain thinks in steps, so I'd prefer to use steps over epochs.
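The zero-steps log is consistent with max_train_steps being a total across runs rather than an increment: with a smaller dataset, steps per epoch shrink, so the checkpoint's global step maps to a later epoch than the step budget allows. A hedged reconstruction of that arithmetic follows; this is assumed behavior, not the trainer's actual code, and the numbers are illustrative placeholders chosen to match the 58/42 figures in the log:

```python
def plan_resume(global_step: int, max_train_steps: int, steps_per_epoch: int) -> int:
    """Return how many steps a resumed run would perform, assuming
    max_train_steps is a total across all runs, not an increment."""
    current_epoch = global_step // steps_per_epoch       # 58 in the log above
    target_epochs = max_train_steps // steps_per_epoch   # 42 in the log above
    if current_epoch >= target_epochs:
        return 0  # "Reached the end ... This run will do zero steps."
    return max_train_steps - global_step

# Illustrative placeholder values: the resumed global step already exceeds
# the step budget once the smaller dataset shrinks steps_per_epoch.
print(plan_resume(global_step=5800, max_train_steps=4200, steps_per_epoch=100))  # 0
```

Under that assumption, staying on steps would just mean raising max_train_steps above the checkpoint's global step, which is effectively what switching to num_train_epochs=100 accomplishes via the epoch path.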