devilismyfriend / StableTuner

Finetuning SD in style.
GNU Affero General Public License v3.0
659 stars, 53 forks

Feature Request: Randomize tags after the first comma #86

Open grimulkan opened 1 year ago

grimulkan commented 1 year ago

This could help when training with captions longer than 75 tokens. The caption text can be split on commas (for example) into tags, and the tag order can be shuffled every epoch, so that a different part of the caption gets truncated each epoch.

EveryDream has the option to exclude the first ("title") tag from this, and then to either shuffle the remaining tags uniformly or specify a probability for each tag to be shuffled in a separate JSON file.
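
For illustration, a minimal sketch of the uniform variant (keep the first tag, shuffle the rest). This is not EveryDream's or StableTuner's actual code; `shuffle_tags` and its parameters are hypothetical names, and the per-tag probability file is left out because its format isn't described here.

```python
import random

def shuffle_tags(caption, keep_first=True, rng=None, sep=","):
    """Split a caption into tags and shuffle them, optionally pinning the first tag.

    Hypothetical helper for illustration only.
    """
    rng = rng or random  # pass a random.Random(seed) for reproducible shuffles
    tags = [t.strip() for t in caption.split(sep) if t.strip()]
    if keep_first and tags:
        head, rest = tags[:1], tags[1:]  # the "title" tag stays in front
    else:
        head, rest = [], tags
    rest = list(rest)
    rng.shuffle(rest)  # uniform in-place shuffle of the remaining tags
    return ", ".join(head + rest)
```

Calling this once per sample per epoch (e.g. inside the dataset's item getter) would give a different tag order, and so a different truncated tail, each epoch.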

This could be extended to only deal with the problem of truncation, and not shuffle the tags otherwise. For captions with >75 tokens, the first N tokens (or comma separated tags) could be held fixed, while the rest of the caption text could be populated randomly from the truncated section every epoch.
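
A rough sketch of that truncation-only idea, under some assumptions: it works on comma-separated tags rather than raw tokens, it fills the remaining budget by sampling from all non-fixed tags, and `fit_caption` and `count_tokens` are hypothetical names. The default token counter is just a whitespace word count as a stand-in; a real setup would pass a function based on the training tokenizer.

```python
import random

def fit_caption(caption, max_tokens=75, keep_first=1, count_tokens=None, rng=None):
    """Leave short captions alone; for long ones, pin the first tags and
    fill the remaining token budget with a random selection of the others.

    Hypothetical helper for illustration only.
    """
    rng = rng or random
    # Stand-in counter (assumption): whitespace-separated words, not real tokens.
    count_tokens = count_tokens or (lambda text: len(text.split()))

    tags = [t.strip() for t in caption.split(",") if t.strip()]
    full = ", ".join(tags)
    if count_tokens(full) <= max_tokens:
        return full  # fits within the limit: order is not touched

    fixed, pool = tags[:keep_first], list(tags[keep_first:])
    rng.shuffle(pool)  # different random order every call (i.e. every epoch)

    chosen = list(fixed)  # fixed tags are kept even if they alone exceed the limit
    for tag in pool:
        candidate = ", ".join(chosen + [tag])
        if count_tokens(candidate) <= max_tokens:
            chosen.append(tag)  # add the tag only if the caption still fits
    return ", ".join(chosen)
```

With this, captions under the limit train exactly as written, and only the over-length ones see a different subset of their trailing tags each epoch.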

EDIT: I see talk of extended token limits in the PR. Maybe relevant?

Campfirecrucifix commented 1 year ago

Would love to see this implemented.

grimulkan commented 1 year ago

PR #91 basically adds what, e.g., the lpw custom pipeline adds for base diffusers. Token lengths can be multiples of 75, and per-token weightings can be added.

If approved, that addresses both use cases of randomizing tags, I think?

grimulkan commented 1 year ago

Looks like it was closed and doesn't actually do what we'd like it to :( So feature request still open!

Edit: Also I am dumb and the PR was for inference and not training. I don't think there is an equivalent of lpw_stable_diffusion for training. EveryDream's randomizing approach is probably how many people are dealing with it.