carefree0910 / carefree-creator

AI magics meet Infinite draw board.
https://creator.nolibox.com/guest
MIT License
2.13k stars · 216 forks

What are the minimum ram requirements? #10

Open danielrh opened 1 year ago

danielrh commented 1 year ago

I have an 8 GB GPU... I suspect it's not enough, because I run into:

uvicorn apis.interface:app --host 0.0.0.0 --port 8123
ldm_sd_v1.5: 4.27GB [06:19, 11.2MB/s]                                           
ldm_sd_anime_nai: 4.27GB [06:19, 11.3MB/s]                                      
ldm.sd_inpainting: 4.27GB [06:17, 11.3MB/s]                                     
Traceback (most recent call last):
...
  File "/home/danielrh/dev/carefree-creator/cfe9/lib/python3.9/site-packages/torch/nn/modules/module.py", line 662, in _apply
    param_applied = fn(param)
  File "/home/danielrh/dev/carefree-creator/cfe9/lib/python3.9/site-packages/torch/nn/modules/module.py", line 985, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 30.00 MiB (GPU 0; 7.92 GiB total capacity; 7.36 GiB already allocated; 67.56 MiB free; 7.43 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Application startup failed. Exiting.

I was able to run vanilla Stable Diffusion; does this project require additional GPU RAM?
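One knob worth trying, which the OOM message itself suggests, is PyTorch's allocator config. A possible invocation (the 128 MB value is just an example, not something from this thread):

```shell
# Cap the allocator's split block size to reduce fragmentation
# (128 is illustrative; tune for your card)
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
# ...then relaunch as before: uvicorn apis.interface:app --host 0.0.0.0 --port 8123
```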

carefree0910 commented 1 year ago

Ah, my bad, I should put this in the README as well...

I've provided an option here to trade GPU RAM for CPU RAM: uncommenting this line will load the models into CPU RAM first, and move them to GPU RAM only when needed!
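The idea behind that option can be sketched roughly like this (a toy mock, not the project's actual code; `DummyModel` and `on_gpu` are made-up names, and in real PyTorch the move would be `model.to("cuda")` / `model.to("cpu")`):

```python
from contextlib import contextmanager

class DummyModel:
    """Stand-in for a torch module with a `.to(device)` method."""
    def __init__(self, name):
        self.name = name
        self.device = "cpu"   # models start out in CPU RAM

    def to(self, device):
        self.device = device
        return self

@contextmanager
def on_gpu(model):
    """Move a model to the GPU only for the duration of an inference call."""
    try:
        yield model.to("cuda")
    finally:
        model.to("cpu")       # give the GPU RAM back immediately afterwards

models = {name: DummyModel(name) for name in ["sd_v1.5", "sd_anime", "sd_inpainting"]}

with on_gpu(models["sd_v1.5"]) as m:
    assert m.device == "cuda"           # only this model occupies GPU RAM here
assert models["sd_v1.5"].device == "cpu"  # back in CPU RAM after the call
```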

carefree0910 commented 1 year ago

The reason this project requires more GPU RAM than vanilla SD is that it actually integrates FOUR different SD versions, and many other models as well 🤣.

BTW, if you only want the vanilla SD features, you can comment out the following lines, which will also reduce the GPU RAM usage!
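One way to picture the effect of commenting those lines out (purely illustrative; these model names and the `ENABLED_MODELS` registry are made up, not the repo's actual code):

```python
# Hypothetical registry: each entry costs GPU RAM at startup,
# so commenting out the ones you don't need trims the footprint.
ENABLED_MODELS = [
    "sd_v1.5",          # the vanilla SD features
    # "sd_anime",       # comment these out to save GPU RAM
    # "sd_inpainting",
    # "super_resolution",
]

def load_models(names):
    # In the real project this would instantiate actual models;
    # here we just record which ones would be loaded.
    return {name: f"<loaded {name}>" for name in names}

loaded = load_models(ENABLED_MODELS)
assert list(loaded) == ["sd_v1.5"]   # only one model's worth of GPU RAM is used
```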

danielrh commented 1 year ago

wow so cool! It seems to be loaded now! Thanks for the help! I'm using the OPT because I do want to see the features together, especially all the img2img-related features.

carefree0910 commented 1 year ago

That's great 🥳!

I did not turn on the OPT by default because it eats so much RAM that Google Colab cannot afford it 🤣.

aleph23 commented 1 year ago

@carefree0910 I have an 8 GB GPU (RTX 2070) & 16 GB RAM. At launch with the '--lazy' argument, I have 12.3 GB RAM available and 7.5 GB GPU RAM. GPU RAM usage increases to around 6500 MB (as reported by NVIDIA Inspector), and I then get:

lib\site-packages\torch\serialization.py", line 1112, in load_tensor
    storage = zip_file.get_storage_from_record(name, numel, torch.UntypedStorage)._typed_storage()._untyped_storage
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 3276800 bytes.

Application startup failed. Exiting.

There is minimal usage of CPU RAM during this process. Automatic1111 with several extensions runs fine. Any suggestions as to why CPU RAM doesn't seem to be used?

carefree0910 commented 1 year ago

@aleph23 Hi! This project has one major difference from Automatic1111: it launches MANY models at the same time, so it consumes far more resources.

There is a workaround though:

cfcreator serve --limit 1

This means only 1 model will be loaded at a time, with everything else left on disk. (In this case, it'll behave more like Automatic1111!)
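The effect of `--limit 1` can be sketched as a tiny LRU pool (again a toy, not cfcreator's real implementation):

```python
from collections import OrderedDict

class ModelPool:
    """Keep at most `limit` models in memory; evict the least recently used."""
    def __init__(self, limit=1):
        self.limit = limit
        self._loaded = OrderedDict()

    def get(self, name):
        if name in self._loaded:
            self._loaded.move_to_end(name)              # mark as most recently used
        else:
            if len(self._loaded) >= self.limit:
                self._loaded.popitem(last=False)        # drop the LRU model back "to disk"
            self._loaded[name] = f"<model {name}>"      # pretend to load it from disk
        return self._loaded[name]

pool = ModelPool(limit=1)
pool.get("sd_v1.5")
pool.get("sd_inpainting")                               # evicts sd_v1.5
assert list(pool._loaded) == ["sd_inpainting"]
```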

However, in my personal experience there are still some memory leaks around. I'm currently calling gc.collect(), but maybe I left some references to the models which stop Python from freeing the memory.
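That symptom is consistent with a stray strong reference; a minimal pure-Python illustration of why `gc.collect()` alone can't help while any reference survives:

```python
import gc
import weakref

class BigModel:
    """Stand-in for a large model object."""
    pass

model = BigModel()
probe = weakref.ref(model)     # lets us observe whether the object is really gone

cache = {"last_model": model}  # an easy-to-miss extra reference (e.g. a result cache)

del model
gc.collect()
assert probe() is not None     # still alive: `cache` keeps it pinned

cache.clear()                  # drop the stray reference...
gc.collect()
assert probe() is None         # ...and only now can Python free the memory
```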

usmanyousaaf commented 10 months ago

*how to run this bro :)

carefree0910 commented 10 months ago

> *how to run this bro :)

The Google Colab should be working now!