Evilu closed this issue 2 years ago.
Just tried basujindal's optimized img2img and it worked with --n_samples.
So a) we really want to implement it here, and b) we need to check why img2img is so much more expensive than txt2img.
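One plausible explanation (an assumption on my part, not something verified against this repo): img2img encodes the init image at its native resolution, and self-attention memory grows roughly with the square of the latent token count, so a large init image costs far more than a default txt2img run. A back-of-the-envelope sketch, using the 976x850 init image that comes up later in this thread:

```python
def attention_tokens(width, height, downscale=8):
    """Token count of the largest self-attention map in an SD v1-style
    U-Net; the latent is the image downscaled by a factor of 8."""
    return (width // downscale) * (height // downscale)

# 512x512 txt2img default vs. a 976x850 init image
for w, h in [(512, 512), (976, 850)]:
    n = attention_tokens(w, h)
    # the attention matrix has n*n entries, so memory grows quadratically
    print(f"{w}x{h}: {n} tokens, ~{n * n / 1e6:.0f}M attention entries")
```

By this estimate the 976x850 image needs roughly ten times the attention memory of a 512x512 one, which is consistent with the OOM reports below.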
What's the size of the image itself?
This is a bit concerning. There is active work going on right now on reducing memory utilisation by clearing the CUDA cache frequently, and I have my fingers crossed that this will reduce memory requirements further. You might want to try checking out pull request #122 to see if there is an improvement:
In the stable-diffusion directory:
git checkout -b BaristaLabs-clear-cuda-cache-after-each-image main
git pull https://github.com/BaristaLabs/stable-diffusion-dream.git clear-cuda-cache-after-each-image
conda env update -f environment.yaml # just in case
After testing, you can switch back to the main branch with "git checkout main".
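For context, the gist of that branch is roughly the following (a minimal sketch under my own assumptions; `generate_image` and `save_image` are hypothetical stand-ins, and the actual PR may differ):

```python
import gc
import torch

def render_all(prompts):
    for prompt in prompts:
        image = generate_image(prompt)   # hypothetical generation call
        save_image(image)                # hypothetical save helper
        # Drop Python-side references, then return unused cached
        # allocator blocks before starting the next image.
        gc.collect()
        torch.cuda.empty_cache()
```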
I believe this is a regression: large init images are being read in, and generation is attempted at their full dimensions. There should be a warning, or a constraint on -I, when the input image is above a certain size. Additionally, it doesn't appear that -W/-H are passed through when an init image is specified.
~The `sampler 'k_lms' is not yet supported. Using DDIM sampler` line in your output indicates that you're using a (relatively :D) old release.~
~You should probably update to at least 1.09.~
ugh derp
Possibly add an option to resize large images used as input for img2img... maybe use -W and -H as the resize values, and resize to those when specified? This would take care of using GFPGAN-processed images as input on a small GPU that can only handle 512x512.
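Something like this could work (a rough sketch, not code from this repo; `load_init_image` is a hypothetical helper, and note it does not preserve aspect ratio):

```python
from PIL import Image

def load_init_image(path, width=None, height=None):
    """Open an init image and, if -W/-H were given, resize to them,
    snapping both dimensions to multiples of 64 as the model requires."""
    img = Image.open(path).convert("RGB")
    w = width or img.width
    h = height or img.height
    # round down to the nearest multiple of 64, with a floor of 64
    w, h = max(64, (w // 64) * 64), max(64, (h // 64) * 64)
    if (w, h) != img.size:
        img = img.resize((w, h), resample=Image.LANCZOS)
    return img
```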
@tildebyte k_lms doesn't support img2img, so that warning is expected even on latest. The consensus is that since the sampler plays less of a role in img2img, supporting it there is less of a priority.
@mbaltais yeah, exactly
For those who asked, the image is 976x850.
850 is not divisible by 64 (neither is 976)... could be part of your problem. The model works on latents downsampled by a factor of 8, and the U-Net halves the resolution several more times, so both dimensions need to be multiples of 64.
I just completed extensive benchmarking of VRAM usage and think that things will be better now that pull request #162 is in. Note that there is still a bug when applying face touchup and upscaling to images generated with the batch option (e.g. -b2): only the last image in the batch is touched up. However, most people won't have enough VRAM to run more than one image per batch, so I thought it would be OK to release.
OK, so I've been working with this fork fine for the last few days. Prompts with the defaults -s50 -b1 -W512 -H512 -C7.5 -mk_lms work just fine on my 6GB dedicated GPU, but img2img doesn't.
dream> "Regular Frog" --init_img=./init-images/pepe.jpg --strength=0.5 -s100 -n4
Sampling:   0%| | 0/4 [00:00<?, ?it/s]
sampler 'k_lms' is not yet supported. Using DDIM sampler
loaded input image of size (976, 850) from ./init-images/pepe.jpg
Sampling:   0%| | 0/4 [00:00<?, ?it/s]
CUDA out of memory. Tried to allocate 390.00 MiB (GPU 0; 6.00 GiB total capacity; 3.17 GiB already allocated; 0 bytes free; 4.72 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Are you sure your system has an adequate NVIDIA GPU?
0 images generated in 0.10s
Outputs:
Any ideas why? Does img2img just need more GPU juice?
EDIT: btw, I've also tried a really tiny render,
dream> "Regular Frog" --init_img=./init-images/pepe.jpg -W5 -H5
dream returns the same message
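For what it's worth, the max_split_size_mb hint from the error message itself can be tested by setting the allocator config before torch touches CUDA (this is just the error message's own suggestion; it targets fragmentation and won't fix the oversized 976x850 init image, which looks like the real culprit here):

```python
import os

# Must be set before the first CUDA allocation is made.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # imported afterwards so the allocator sees the setting
```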