eyriewow / merge-models

Merges two latent diffusion models at a user-defined ratio
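For context, a ratio merge of two checkpoints typically boils down to a per-key weighted average of their state dicts. A rough sketch of the idea (not necessarily this repo's exact code; the file names, the `alpha` ratio, and the `"state_dict"` key are assumptions):

```python
import torch

# Hypothetical sketch: blend two checkpoints at ratio alpha, where 0.0
# keeps model A and 1.0 keeps model B. Assumes both files store their
# weights under a "state_dict" key.
alpha = 0.5
sd_a = torch.load("model_a.ckpt", map_location="cpu")["state_dict"]
sd_b = torch.load("model_b.ckpt", map_location="cpu")["state_dict"]

# Weighted average per parameter; keys missing from either model are skipped.
merged = {key: (1 - alpha) * sd_a[key] + alpha * sd_b[key]
          for key in sd_a if key in sd_b}

torch.save({"state_dict": merged}, "merged.ckpt")
```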

Issue when merging some models #2

Closed Delcos closed 2 years ago

Delcos commented 2 years ago

Getting this error only when merging some models; it works fine with others.

```
Or press [ENTER] for default [merged]:
Traceback (most recent call last):
  File "C:\Users\Devon\Documents\Stable Diffusion Super\merge-models-main\merge.py", line 15, in <module>
    model_1 = torch.load(args.model_1)
  File "C:\Users\Devon\Documents\Stable Diffusion Super\venv\lib\site-packages\torch\serialization.py", line 712, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "C:\Users\Devon\Documents\Stable Diffusion Super\venv\lib\site-packages\torch\serialization.py", line 1049, in _load
    result = unpickler.load()
  File "C:\Users\Devon\Documents\Stable Diffusion Super\venv\lib\site-packages\torch\serialization.py", line 1019, in persistent_load
    load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
  File "C:\Users\Devon\Documents\Stable Diffusion Super\venv\lib\site-packages\torch\serialization.py", line 1001, in load_tensor
    wrap_storage=restore_location(storage, location),
  File "C:\Users\Devon\Documents\Stable Diffusion Super\venv\lib\site-packages\torch\serialization.py", line 175, in default_restore_location
    result = fn(storage, location)
  File "C:\Users\Devon\Documents\Stable Diffusion Super\venv\lib\site-packages\torch\serialization.py", line 157, in _cuda_deserialize
    return obj.cuda(device)
  File "C:\Users\Devon\Documents\Stable Diffusion Super\venv\lib\site-packages\torch\_utils.py", line 78, in _cuda
    return torch.UntypedStorage(self.size(), device=torch.device('cuda')).copy_(self, non_blocking)
RuntimeError: CUDA out of memory. Tried to allocate 50.00 MiB (GPU 0; 11.00 GiB total capacity; 9.44 GiB already allocated; 0 bytes free; 9.64 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Press any key to continue . . .
```
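As an aside, the max_split_size_mb hint at the end of the error is set through the PYTORCH_CUDA_ALLOC_CONF environment variable; a hedged example (the 128 MiB value is arbitrary, and this only mitigates fragmentation rather than freeing up a genuinely full card):

```python
import os

# Must be set before CUDA is initialized; 128 MiB is an arbitrary example.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # imported after the variable is set
```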

eyriewow commented 2 years ago

The CUDA error seems to indicate that you ran out of VRAM while merging. That's curious, since for me, the models are always loaded into regular system RAM.

Could you please try restarting your computer and run it again after a fresh boot?

If that doesn't work, could you provide a few more details about your setup?

I have my suspicions as to what happened here. If I'm right, I should be able to implement a fix fairly quickly.

eyriewow commented 2 years ago

I have pushed an update that should resolve your issue.

It seems like I didn't take the default behavior of torch into consideration and relied on a specific environment setup - oops.
So in your case, the script tried to load the models into VRAM and merge them with CUDA. That's great if you have a ton of VRAM, but with the model sizes we are dealing with, it is going to fail on most consumer cards.
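Concretely, the pitfall is that torch.load with no map_location restores every tensor to the device it was saved from; a minimal sketch (the path is a placeholder):

```python
import torch

# No map_location: tensors saved on a GPU machine come back on cuda:0,
# which can exhaust an 11 GiB card before the merge even begins.
ckpt = torch.load("model.ckpt")
```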

I have changed the default behavior to make sure the merging is handled on the CPU and the models are loaded into system RAM. That's a bit slower, but for most users it will be the safer option.
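What the fix amounts to, in sketch form (map_location is standard torch; the file paths are placeholders and the variable names just mirror merge.py's traceback):

```python
import torch

# Force every tensor onto the CPU at load time so the merge happens in
# system RAM rather than VRAM.
model_1 = torch.load("model_a.ckpt", map_location=torch.device("cpu"))
model_2 = torch.load("model_b.ckpt", map_location=torch.device("cpu"))
```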

Please give the updated version a try and report back if it worked for you. Also, thanks for taking the time to report the issue.