TheLastBen / fast-stable-diffusion

fast-stable-diffusion + DreamBooth
MIT License

Run Google Colab Dreambooth training on Vast.ai or Runpod.io #518

Open bach777 opened 1 year ago

bach777 commented 1 year ago

Due to the limits of the Google Colab free tier, training at resolutions of 768x768 or higher, or even at 512x512, is not viable. I understand that it takes many steps to get impeccable results, so I was wondering if there is a way to use your Google Colab template on a GPU rental site, or to run it in a local Colab session connected to a rented GPU. Is it possible? Thank you very much for your kind assistance!

GuruVirus commented 1 year ago

Yes. AItrepreneur has plenty of Runpod tutorials. https://www.youtube.com/c/Aitrepreneur/videos

wktra commented 1 year ago

Runpod costs an arm and a leg, and that's if a decent client is available. Google colab costs $10 a month.

bach777 commented 1 year ago

The price of an RTX A5000 is $0.49/hour on demand.

wktra commented 1 year ago

Each training takes me about 4 hours for actual quality. That's about $2 right there. And then there are the disk charges ($4/mo Disk Charge). I train and tweak a few times a day. In less than a week, I'll be paying out the nose.
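The arithmetic behind this estimate can be sketched as follows. This is only a rough illustration using the figures quoted in this thread ($0.49/hour, about 4 hours per training, $4/month disk charge); the function names and the value of three trainings per day are assumptions standing in for "a few times a day".

```python
# Rough cost sketch based on the figures quoted in this thread.
GPU_RATE_PER_HOUR = 0.49      # RTX A5000 on-demand price quoted above
HOURS_PER_TRAINING = 4        # "about 4 hours" per training
DISK_CHARGE_PER_MONTH = 4.0   # flat disk charge quoted above


def cost_per_training() -> float:
    """GPU cost of a single training run."""
    return GPU_RATE_PER_HOUR * HOURS_PER_TRAINING


def monthly_cost(trainings_per_day: float, days: int = 30) -> float:
    """GPU cost for all runs in the period plus the flat disk charge."""
    return cost_per_training() * trainings_per_day * days + DISK_CHARGE_PER_MONTH


# Assumption: "a few times a day" is taken as 3 trainings per day.
print(f"per training: ${cost_per_training():.2f}")
print(f"per month at 3 trainings/day: ${monthly_cost(3):.2f}")
```

Changing `trainings_per_day` shows how quickly the total moves away from a flat monthly subscription price.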

bach777 commented 1 year ago

Thank you very much for the info. I will try to do it on Google Colab with the monthly tier. I hope they accept payments from my country.

wktra commented 1 year ago

I gotta be honest, I had trouble with my USA card when I tried to subscribe. I had to call my bank, tell them to not reject google colab and then try again.

bach777 commented 1 year ago

I have trained a model (3000 steps) with "Continue training" checked and "Enable_text_encoder_training:" unchecked. Two models named A1_step_1000.ckpt and A1_step_2000.ckpt are on my hard disk. It takes 4000-5000 steps to complete the training, right? When I try to load the session, I get the message "Previous model not found, training a new model...", and the training starts from the beginning. What should I do?

GuruVirus commented 1 year ago

Just a warning: an acquaintance tested Colab premium ($50) and Dreambooth would not run on the provided hardware.

As far as restarting a session, is your session name the same?

100 steps per image worked well with SD1.4, but with 1.5 I haven't heard of people getting results worth reproducing.

bach777 commented 1 year ago

Screenshot (380), Screenshot (381). I used A1 only as an example; this is the model that I am trying to train. Apparently everything is correct, but the training restarts from 0.

TheLastBen commented 1 year ago

The training is restarting because there is no final ckpt, only the intermediate checkpoints. If you want to resume anyway, rename one of the models to "sora.ckpt".
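A minimal sketch of that step, assuming the session folder holds the intermediate checkpoints and that a file named after the session (for example "sora.ckpt") is what allows resuming, as the reply above suggests. The function name, its parameters, and the folder layout are hypothetical and not taken from the notebook. It copies instead of renaming, so the intermediate file is kept.

```python
import shutil
from pathlib import Path


def promote_checkpoint(session_dir, intermediate_name, session_name):
    """Copy an intermediate checkpoint to <session_name>.ckpt in the same folder.

    Hypothetical helper: the folder layout is an assumption, not the
    notebook's documented behavior.
    """
    session_dir = Path(session_dir)
    source = session_dir / intermediate_name
    target = session_dir / f"{session_name}.ckpt"

    if not source.is_file():
        raise FileNotFoundError(f"intermediate checkpoint not found: {source}")
    if target.exists():
        # Never overwrite an existing final checkpoint silently.
        raise FileExistsError(f"final checkpoint already exists: {target}")

    shutil.copy2(source, target)  # copy, so the original file stays in place
    return target
```

For the case described above, a call might look like `promote_checkpoint("path/to/session", "A1_step_2000.ckpt", "A1")`, where the path is a placeholder for wherever the session is stored.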

emidio90 commented 1 year ago

I'm having a problem running it on Paperspace; it's just stuck at cloning the repo.

0xdevalias commented 1 year ago

See also:

SU1199 commented 1 year ago

idk if I'm too late but check this out https://github.com/SU1199/fastBooth It has all the performance modifications from Shivam's and TheLastBen's notebooks, with xformers.

Shadhil24 commented 1 year ago

I tried running ShivamShrirao and TheLastBen on Runpod and Vast.ai. Training is working fine, but the model was not able to generate the user-given images after training. It is working fine in Colab. What might be the reason? It would be a great help if anyone could help me with this. Thank you @SU1199

TheLastBen commented 1 year ago

@Shadhil24 I made a template for Runpod https://www.runpod.io/console/gpu-secure-cloud?template=runpod-stable-unified

Shadhil24 commented 1 year ago

Can I run this on Vast.ai? Sorry if this is a stupid question, I am new to this.

TheLastBen commented 1 year ago

The template is designed for runpod

Shadhil24 commented 1 year ago

Is it necessary to give the images the same name as the instance images?

TheLastBen commented 1 year ago

yes, the instance name is determined by the image filenames

Shadhil24 commented 1 year ago

Sorry for asking more questions, but how can I give the same name to every image? Should I add a count, like shadhil_1.png, shadhil_2.png, shadhil_3.png?

TheLastBen commented 1 year ago

yes, that's a correct format, but don't use known words, use a random token like "bvhrghc"

Shadhil24 commented 1 year ago

@TheLastBen Thank you, it's working fine now. I have changed my image names to the instance and class names.

TheLastBen commented 1 year ago

don't use a class name though, only the token
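The naming scheme discussed above (a random token plus a counter, with no class name) could be applied to a folder of images with a small script like the one below. This is a sketch only: the function name, the list of extensions, and the folder argument are assumptions, and "bvhrghc" is just the example token from this thread.

```python
from pathlib import Path

# Assumption: these are the image types present in the instance folder.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def rename_instance_images(folder, token):
    """Rename every image in `folder` to <token>_<n><ext>, counting from 1."""
    folder = Path(folder)
    images = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    plan = [
        (image, folder / f"{token}_{index}{image.suffix.lower()}")
        for index, image in enumerate(images, start=1)
    ]

    # Check every target before touching anything, so a name collision
    # cannot leave the folder half renamed.
    for source, target in plan:
        if target != source and target.exists():
            raise FileExistsError(f"{target} already exists")

    for source, target in plan:
        if target != source:
            source.rename(target)
    return [target for _, target in plan]
```

A call such as `rename_instance_images("path/to/instance_images", "bvhrghc")` would produce bvhrghc_1.png, bvhrghc_2.png, and so on, in alphabetical order of the original filenames. Files with other extensions are left untouched.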

MohammadKatif commented 2 months ago

How to use runpod with colab?