kijai / ComfyUI-FluxTrainer

Apache License 2.0
347 stars, 12 forks

[Feature Request] Avoid Duplication of TrainLoop and TrainValidate Nodes #45

Open m0rphism opened 1 week ago

m0rphism commented 1 week ago

Currently, the user needs one copy of the "Flux Train Loop", "Flux Train Validate", and "Flux Train Save LoRA" nodes for every point at which they want to save the current model and create validation images.

This doesn't scale well for large training sessions where one wants to save and validate often.

On larger datasets I often train for over 50000 steps and save and validate every 500 steps, which would require me to create 100 copies of those nodes.

Even if a large number of steps is not always necessary, this workflow can be nice for exploratory training runs, where one simply continues training indefinitely (using a constant learning rate). This way one can let the training run in the background and check the validation images every now and then to see whether the LoRA has captured enough concepts.

For those use cases, it would be nice to be able to create those nodes only once, and have the steps parameter of the "Flux Train Loop" node mean "every n steps" instead of "after n steps".
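To illustrate the difference between the two readings of the steps parameter, here is a minimal sketch in plain Python. `checkpoint_steps` is a hypothetical helper made up for this example; it is not part of ComfyUI-FluxTrainer.

```python
def checkpoint_steps(total_steps: int, every_n: int) -> list[int]:
    """Return the 1-based step numbers at which to save and validate."""
    if every_n <= 0:
        raise ValueError("every_n must be positive")
    return list(range(every_n, total_steps + 1, every_n))


# With "after n steps", each entry in this list needs its own group of
# Loop / Validate / Save nodes. With "every n steps", a single group
# configured with every_n=500 would cover all of them.
steps = checkpoint_steps(total_steps=50_000, every_n=500)
print(len(steps))            # 100
print(steps[0], steps[-1])   # 500 50000
```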

Unfortunately, I'm not sure if it would then be possible to still use ComfyUI's computation graph, because the "Flux Train Validate" node would need to produce multiple outputs over time within one session created by clicking the "Queue prompt" button.

It might be possible by not using the "Preview Image" node and instead giving the "Flux Train Validate" node custom frontend display logic: the image would be displayed as part of the "Flux Train Validate" node, and the node itself would control when to update it, similar to how the original KSampler node updates its preview images within one session.

Alternatively, one could provide a specialized "Flux Train Loop" node which does not use ComfyUI for displaying the results, but simply writes the LoRA and validation images into a directory after every N training steps, bypassing the ComfyUI computation graph logic. The user could then use a regular file browser and image viewer to inspect the current training progress.
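A rough sketch of what such a directory-writing loop could look like, in plain Python. Everything here is an assumption for illustration: `train_with_periodic_output`, the callback parameters, and the file names are hypothetical, and the demo uses stub callbacks instead of real training or image generation.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def train_with_periodic_output(
    total_steps: int,
    every_n: int,
    out_dir: Path,
    train_step: Callable[[int], float],       # runs one step, returns the loss
    save_lora: Callable[[Path], None],        # writes the current LoRA weights
    save_validation: Callable[[Path], None],  # writes a validation image
) -> list[Path]:
    """Train for total_steps, writing outputs to out_dir every every_n steps."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for step in range(1, total_steps + 1):
        loss = train_step(step)
        if step % every_n == 0:
            # One subdirectory per checkpoint, zero-padded so it sorts by step.
            step_dir = out_dir / f"step_{step:06d}"
            step_dir.mkdir(exist_ok=True)
            save_lora(step_dir / "lora.safetensors")
            save_validation(step_dir / "validation.png")
            info = {"step": step, "loss": loss}
            (step_dir / "info.json").write_text(json.dumps(info))
            written.append(step_dir)
    return written


# Demo with stub callbacks that only write placeholder bytes.
with tempfile.TemporaryDirectory() as tmp:
    dirs = train_with_periodic_output(
        total_steps=5,
        every_n=2,
        out_dir=Path(tmp) / "run",
        train_step=lambda step: 1.0 / step,
        save_lora=lambda path: path.write_bytes(b"stub weights"),
        save_validation=lambda path: path.write_bytes(b"stub image"),
    )
    print([d.name for d in dirs])  # ['step_000002', 'step_000004']
```

Because each checkpoint lands in its own zero-padded directory, an ordinary file browser lists the results in training order without any ComfyUI involvement.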

kijai commented 1 week ago

Would love to do all this, but like you said, it's mostly a UX issue... no problem at all to do if we forget about showing the validation results, of course. The new Comfy features that came with the execution order reversion would probably allow accumulating validation results into a batch or something, though.