lucidrains / big-sleep

A simple command line tool for text-to-image generation, using OpenAI's CLIP and a BigGAN. The technique was originally created by https://twitter.com/advadnoun

Memory is not freed when process is cancelled #115

Open vectris-dev opened 3 years ago

vectris-dev commented 3 years ago

If I interrupt the process on Windows with Ctrl+C and then attempt to run it again, I get the following error:

Imagining "Galaxy_of_ghosts" ... c:\programdata\anaconda3\lib\site-packages\torch\nn\functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at ..\c10/core/TensorImpl.h:1156.) return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode) loss: -13.45: 0%| | 1/1050 [00:01<17:40, 1.01s/it] epochs: 0%| | 0/20 [00:01<?, ?it/s] Traceback (most recent call last): | 0/420.0 [00:00<?, ?it/s] File "c:\programdata\anaconda3\lib\runpy.py", line 194, in _run_module_as_main return _run_code(code, main_globals, None, File "c:\programdata\anaconda3\lib\runpy.py", line 87, in _run_code exec(code, run_globals) File "C:\ProgramData\Anaconda3\Scripts\dream.exe\__main__.py", line 7, in <module> File "c:\programdata\anaconda3\lib\site-packages\big_sleep\cli.py", line 74, in main fire.Fire(train) File "c:\programdata\anaconda3\lib\site-packages\fire\core.py", line 141, in Fire component_trace = _Fire(component, args, parsed_flag_args, context, name) File "c:\programdata\anaconda3\lib\site-packages\fire\core.py", line 466, in _Fire component, remaining_args = _CallAndUpdateTrace( File "c:\programdata\anaconda3\lib\site-packages\fire\core.py", line 681, in _CallAndUpdateTrace component = fn(*varargs, **kwargs) File "c:\programdata\anaconda3\lib\site-packages\big_sleep\cli.py", line 71, in train imagine() File "c:\programdata\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl return forward_call(*input, **kwargs) File "c:\programdata\anaconda3\lib\site-packages\big_sleep\big_sleep.py", line 499, in forward out, loss = self.train_step(epoch, i, image_pbar) File "c:\programdata\anaconda3\lib\site-packages\big_sleep\big_sleep.py", line 447, in train_step out, losses = self.model(self.encoded_texts["max"], self.encoded_texts["min"]) File "c:\programdata\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl return forward_call(*input, **kwargs) File "c:\programdata\anaconda3\lib\site-packages\big_sleep\big_sleep.py", line 260, in forward image_embed = perceptor.encode_image(into) File "c:\programdata\anaconda3\lib\site-packages\big_sleep\clip.py", line 519, in encode_image return self.visual(image.type(self.dtype)) File "c:\programdata\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl return forward_call(*input, **kwargs) File "c:\programdata\anaconda3\lib\site-packages\big_sleep\clip.py", line 410, in forward x = self.transformer(x) File "c:\programdata\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl return forward_call(*input, **kwargs) File "c:\programdata\anaconda3\lib\site-packages\big_sleep\clip.py", line 381, in forward return self.resblocks(x) File "c:\programdata\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl return forward_call(*input, **kwargs) File "c:\programdata\anaconda3\lib\site-packages\torch\nn\modules\container.py", line 139, in forward input = module(input) File "c:\programdata\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl return forward_call(*input, **kwargs) File "c:\programdata\anaconda3\lib\site-packages\big_sleep\clip.py", line 368, in forward x = x + self.attention(self.ln_1(x)) File "c:\programdata\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl return forward_call(*input, 
**kwargs) File "c:\programdata\anaconda3\lib\site-packages\big_sleep\clip.py", line 340, in forward ret = super().forward(x.type(torch.float32)) File "c:\programdata\anaconda3\lib\site-packages\torch\nn\modules\normalization.py", line 173, in forward return F.layer_norm( File "c:\programdata\anaconda3\lib\site-packages\torch\nn\functional.py", line 2346, in layer_norm return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled) RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 8.00 GiB total capacity; 5.18 GiB already allocated; 3.44 MiB free; 5.36 GiB reserved in total by PyTorch) image update: 0%| | 0/420.0 [00:01<?, ?it/s]

Shouldn't the memory have been freed up when I cancelled the previous process? This is after a fresh restart with nothing else running.
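For reference, a quick way to confirm whether the cancelled run is actually still holding GPU memory (rather than the OOM coming from the new process itself) is to query the device before relaunching. A minimal sketch, not part of big-sleep, assuming a CUDA build of PyTorch and nvidia-smi on the PATH; torch.cuda.mem_get_info only exists in newer PyTorch releases, hence the fallback:

import subprocess
import torch

if hasattr(torch.cuda, "mem_get_info"):
    # Returns (free, total) memory in bytes for the given device
    free, total = torch.cuda.mem_get_info(0)
    print(f"GPU 0: {free / 2**30:.2f} GiB free of {total / 2**30:.2f} GiB")
else:
    # nvidia-smi ships with the NVIDIA driver on Windows as well and also
    # lists the PIDs that still hold GPU memory
    subprocess.run(["nvidia-smi"], check=False)

If an orphaned python.exe from the interrupted run shows up in the nvidia-smi process list, killing it should release the memory.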