vicgalle / stable-diffusion-aesthetic-gradients

Personalization for Stable Diffusion via Aesthetic Gradients 🎨
https://arxiv.org/abs/2209.12330

Any guide for training our own .pt? #10

Open tcflying opened 1 year ago

tcflying commented 1 year ago

Thank you for the great work! Is there any guide for training our own .pt? E.g., how many sample pictures should be used for training, and what batch size? Thanks.

eeyrw commented 1 year ago

This method does not train a model in advance. At generation time it fine-tunes the CLIP text encoder so that your prompt's text embedding moves toward the style embedding you created. So you just use https://github.com/vicgalle/stable-diffusion-aesthetic-gradients/blob/main/scripts/gen_aesthetic_embeddings.py to generate an image embedding with the pretrained CLIP image encoder.
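For reference, here is a minimal sketch of what that script does conceptually: encode your style images with the pretrained CLIP image encoder and average them into a single embedding. The file names are placeholders, and details such as per-image normalization and the CLIP variant (ViT-L/14 is assumed here, since that is what SD v1 pairs with) may differ from the actual script, which is the source of truth:

```python
# Sketch: build an aesthetic embedding from a few style images.
# Assumes the OpenAI `clip` package; paths and output name are placeholders.
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-L/14", device=device)

image_paths = ["style1.jpg", "style2.jpg", "style3.jpg"]  # your style images

with torch.no_grad():
    feats = []
    for p in image_paths:
        image = preprocess(Image.open(p)).unsqueeze(0).to(device)
        f = model.encode_image(image)
        feats.append(f / f.norm(dim=-1, keepdim=True))  # normalize each embedding
    embedding = torch.cat(feats).mean(dim=0, keepdim=True)  # average over images

torch.save(embedding, "my_style.pt")  # load this .pt at generation time
```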

tcflying commented 1 year ago

Thanks a lot for the reply, I got it. To be exact, I mean: how many pictures (and what batch size) should I prepare to create the embedding?

eeyrw commented 1 year ago

According to the repo owner's example, 3 images should be sufficient.

tcflying commented 1 year ago

> According to the repo owner's example, 3 images should be sufficient.

Thank you guys. As vicgalle mentions:

- fantasy.pt: created from https://huggingface.co/datasets/ChristophSchuhmann/improved_aesthetics_6.5plus by filtering only the images with the word "fantasy" in the caption. The top 2000 images by score are selected for the embedding.
- flower_plant.pt: created from https://huggingface.co/datasets/ChristophSchuhmann/improved_aesthetics_6.5plus by filtering only the images with the word "plant", "flower", "floral", "vegetation" or "garden" in the caption. The top 2000 images by score are selected for the embedding.

So he used 2000 pics. Also, what batch size?
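For anyone trying to reproduce that selection, here is a hypothetical sketch using the Hugging Face datasets library. The column names TEXT and AESTHETIC_SCORE are assumptions, so check the dataset card for the actual schema; also note the rows hold image URLs, which you would still need to download before computing embeddings:

```python
# Hypothetical sketch of the filtering described above.
from datasets import load_dataset

ds = load_dataset("ChristophSchuhmann/improved_aesthetics_6.5plus", split="train")

# Keep rows whose caption mentions any of the target words.
keywords = ("plant", "flower", "floral", "vegetation", "garden")
ds = ds.filter(lambda row: any(k in (row["TEXT"] or "").lower() for k in keywords))

# Keep the top 2000 rows by aesthetic score.
ds = ds.sort("AESTHETIC_SCORE", reverse=True).select(range(2000))
```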

eeyrw commented 1 year ago

According to the chapter "Using your own embeddings", he uses 3 images as input to create an embedding and achieves acceptable performance. But as you said, the author used thousands of images to create the embeddings he ships, so more may well be better. What do you mean by "batch"?
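If "batch" refers to the batch size used when encoding thousands of images, note that the embedding is just an average of per-image CLIP features, so the chunk size only affects speed and memory, not the result (up to float rounding). A sketch, reusing the `model`/`preprocess` pair from the earlier snippet:

```python
# Sketch: encode a large image set in chunks. chunk_size is purely a
# memory/throughput knob; the averaged embedding is the same for any chunking.
import torch
from PIL import Image

def encode_in_chunks(model, preprocess, paths, device, chunk_size=32):
    feats = []
    with torch.no_grad():
        for i in range(0, len(paths), chunk_size):
            batch = torch.stack(
                [preprocess(Image.open(p)) for p in paths[i:i + chunk_size]]
            ).to(device)
            f = model.encode_image(batch)
            feats.append(f / f.norm(dim=-1, keepdim=True))
    return torch.cat(feats).mean(dim=0, keepdim=True)
```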