nicolai256 / Stable-textual-inversion_win

MIT License
Google Colab kicked me out due to GPU usage #8

Closed DrakeFruit closed 1 year ago

DrakeFruit commented 1 year ago

After running for 4 hours, it kicked me out and won't let me run the Colab. This also deleted all my progress in the ckpt file... Is there any way to recover that?

nicolai256 commented 1 year ago

No, if your Colab stops, it stops. That's one of the downsides of Colab.

DrakeFruit commented 1 year ago

How do I save the files to my Drive and run it from there?

nicolai256 commented 1 year ago

Put the repo in your Drive and modify the commands to use the repo in the Drive.
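A minimal sketch of a related option, in case running everything from Drive is too slow: train on the Colab disk and copy the checkpoint files to Drive from time to time. It assumes Drive is already mounted (in Colab this is normally done with `drive.mount('/content/drive')` from `google.colab`). The function name `backup_checkpoints` and both folder names are made up for illustration; they are not part of this repo.

```python
import shutil
from pathlib import Path


def backup_checkpoints(src_dir, dst_dir, pattern="*.ckpt"):
    """Copy every file matching `pattern` under src_dir into dst_dir.

    The sub-folder layout below src_dir is recreated under dst_dir.
    Returns the list of destination paths that were written.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    copied = []
    for ckpt in sorted(src.rglob(pattern)):
        # Keep the same relative path, e.g. <run>/checkpoints/last.ckpt
        target = dst / ckpt.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(ckpt, target)  # copy2 also keeps the modification time
        copied.append(target)
    return copied
```

For example, calling `backup_checkpoints("logs", "/content/drive/MyDrive/ti_backup")` from a notebook cell would copy whatever `.ckpt` files exist under `logs` at that moment (`ti_backup` is a hypothetical folder name). It only copies files that have already been written, so it cannot rescue progress that was never saved.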

DrakeFruit commented 1 year ago

It's been running for an hour on my Drive, but I'm not seeing any progress images in logs/imagesfortraining/images/train.

affanmehmood commented 1 year ago

How often does the code save the checkpoint files?

nicolai256 commented 1 year ago

It saves when you cancel the training.

nicolai256 commented 1 year ago

Intermediate saving still has to be fixed; for now it saves every epoch, though.

affanmehmood commented 1 year ago

This is the issue I'm facing when I try to resume the training from last.ckpt. Why is this happening?

Global seed set to 23
Running on GPUs 1
Loading model from /content/drive/MyDrive/---/checkpoints/last.ckpt
Traceback (most recent call last):
  File "/content/Stable-textual-inversion_win/main.py", line 604, in <module>
    model = load_model_from_config(config, opt.actual_resume)
  File "/content/Stable-textual-inversion_win/main.py", line 37, in load_model_from_config
    sd = pl_sd["state_dict"]
KeyError: 'state_dict'

nicolai256 commented 1 year ago

Resume doesn't work yet
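For anyone reading the traceback above: the `KeyError` only says that the object loaded from `last.ckpt` has no top-level `"state_dict"` key, which is what `load_model_from_config` indexes at line 37. What the file actually contains is not known from this thread. A rough sketch of a more tolerant lookup is below; `extract_state_dict` is a hypothetical helper, not something in `main.py`, and falling back to the whole dict is an assumption that may not match how the file was written.

```python
def extract_state_dict(checkpoint):
    """Return the weights mapping from an already-loaded checkpoint dict.

    Some checkpoints wrap the weights as {"state_dict": {...}}, others
    are the flat mapping itself. Both shapes are accepted here.
    """
    if not isinstance(checkpoint, dict):
        raise TypeError(f"expected a dict, got {type(checkpoint).__name__}")
    if "state_dict" in checkpoint:
        return checkpoint["state_dict"]
    # Assumption: with no wrapper key, the dict itself holds the weights.
    return checkpoint
```

Printing `list(checkpoint.keys())` before calling it shows which case applies. This avoids the `KeyError` only; it does not by itself make resuming work.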