affanmehmood opened this issue 2 years ago (status: Open)
That’s a different method of achieving a similar result… I believe the OP was talking about resuming training of the SD model itself. I am also very interested in this especially in light of the Dreambooth paper: https://dreambooth.github.io/ (I think it would be very interesting to try this approach with SD). There’s training code and settings for latent diffusion but I’m not sure if it would be fruitful to try it with stable especially without knowing the training parameters that were used.
There's this, but it's a port and requires a beefy 48 GB GPU: https://github.com/Jack000/glid-3-xl-stable
Oh but that is exciting! I'll have to give it a try. (my naive theory is to try something similar to the dreambooth paper, by trying to find a prompt word that is basically unknown to SD, and then using that as the training captions for some new images)
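That idea can be sketched at the dataset level: pair each new image with a caption built around a token the model has essentially no prior for. This is a minimal sketch of caption construction only, not a training script; the token "sks" and the subject word are placeholders you'd choose yourself (ideally something the tokenizer maps to a single, rarely-used piece).

```python
def build_captions(image_paths, token="sks", subject="dog"):
    """Pair each training image with a caption built around a rare token,
    in the spirit of the Dreambooth paper. `token` and `subject` are
    placeholders; pick a token the model has no existing association for."""
    return {path: f"a photo of {token} {subject}" for path in image_paths}

captions = build_captions(["img_001.jpg", "img_002.jpg"])
print(captions["img_001.jpg"])  # a photo of sks dog
```

The Dreambooth paper also adds a class-preservation loss using generic "a photo of a dog" images, which this sketch leaves out.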
There's a user who managed to get a full model train with validation running on a 3090, but it now needs 54 GB of RAM. If they release the code I'll let you know @janekm
Is there any way to get the source code for Dreambooth, or will Google offer it as a web service like Midjourney?
I'm also looking for some training code for this repo (either to train from scratch or to fine-tune). Could anyone point me in the right direction?
Since Textual Inversion was already mentioned, it's worth mentioning here that the "Dreambooth" paper technique has been implemented on top of Stable Diffusion (and has advantages in many scenarios where someone might think of finetuning the model directly): https://github.com/XavierXiao/Dreambooth-Stable-Diffusion
Thank you for pointing me to that @janekm! However, what I'm looking to do is condition the model on another image. I.e., I want to feed it two images (instead of image + text) and use the second image as the condition. I've been thinking of just replacing the CLIP text embedding with an embedding of the second image, but I think this will require me to actually fine-tune a diffusion model rather than use textual inversion.
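At the shape level, that swap looks roughly like this. A sketch only, assuming the SD v1 UNet cross-attends to a text context of shape (batch, 77, 768); the projection matrix here is a random stand-in for a layer you'd actually have to train jointly with (or instead of) fine-tuning the UNet, and the embedding dimensions are assumptions.

```python
import numpy as np

def image_context(image_emb, ctx_dim=768, seq_len=77, rng=None):
    """Map a CLIP image embedding to a cross-attention context tensor,
    standing in for the usual text context of shape (batch, 77, ctx_dim).

    image_emb: (batch, d_img) array, e.g. CLIP image-encoder output.
    The linear projection is simulated with a random matrix; in practice
    it would be a trained layer."""
    rng = np.random.default_rng(0) if rng is None else rng
    batch, d_img = image_emb.shape
    W = rng.standard_normal((d_img, ctx_dim)) / np.sqrt(d_img)  # stand-in for a trained projection
    ctx = image_emb @ W                                  # (batch, ctx_dim)
    return np.repeat(ctx[:, None, :], seq_len, axis=1)   # (batch, seq_len, ctx_dim)

emb = np.zeros((2, 1024))            # dummy CLIP image embeddings
ctx = image_context(emb)
print(ctx.shape)                     # (2, 77, 768)
```

Whether repeating one vector across the sequence axis gives the cross-attention layers enough to work with is an open question; a trained projection to several distinct context tokens may work better.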
This repo has been doing "traditional" fine-tuning on top of Stable Diffusion, so it may have the code you're looking for (the CompVis repo also has training code in main.py, but I've seen reports that it doesn't work out of the box): https://github.com/harubaru/waifu-diffusion (train.sh should be the entry point)
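For reference, an invocation of the CompVis-style main.py would look roughly like the following. This is a sketch, not a verified command: the config path is hypothetical, and the flags are assumed from the latent-diffusion training setup (pytorch-lightning based), so check them against the repo before running.

```shell
# Hedged sketch of a fine-tuning launch; paths and config name are assumptions.
CKPT=path/to/sd-v1-4.ckpt                          # checkpoint to resume from
CONFIG=configs/stable-diffusion/v1-finetune.yaml   # hypothetical fine-tune config
# -t enables training; the trailing comma after the GPU index is required
# by the pytorch-lightning argument parser used in main.py.
echo python main.py --base "$CONFIG" -t --gpus 0, --resume "$CKPT"
```

The waifu-diffusion train.sh wraps a similar invocation with its own config, which is probably the easier starting point.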
I want to further train stable-diffusion-v1-4 on my custom dataset. I couldn't find any training script in the repo. Can anyone tell me how this can be accomplished? Is there a training script available so I can resume training?