Sygil-Dev / stable-diffusion

GNU Affero General Public License v3.0
1.72k stars, 149 forks

Partial copy of Doggettx's minimal memory requirement improvements #286

Closed: swfsql closed this 2 years ago

swfsql commented 2 years ago

I removed the mask part because I don't know how to adapt it to the change.

Reference used: https://github.com/Doggettx/stable-diffusion/blob/8283bb5b84580487e7a9e25c37816484bf4ed42b/ldm/modules/attention.py#L170

For square sizes, I could previously only get to 384x384; with this change I can get to 704x704.
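For readers unfamiliar with the idea, here is a minimal pure-Python sketch of sliced attention, the general technique behind this kind of memory saving. It is not the code from the referenced commit; the function names (`attend`, `sliced_attention`) and the `slice_size` parameter are made up for illustration. As in this PR, there is no mask. The point is that only a `slice_size x len(k)` block of attention scores exists at any time, instead of the full `len(q) x len(k)` matrix, while the result stays the same.

```python
import math


def attend(q_block, k, v, scale):
    """Attention for a block of query rows (hypothetical helper).

    Builds a len(q_block) x len(k) score matrix, applies a row-wise
    softmax, and multiplies by v. Returns the output rows and the number
    of score elements that were held in memory for this block.
    """
    scores = [
        [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        for q_row in q_block
    ]
    out = []
    for row in scores:
        row_max = max(row)  # subtract the max for numerical stability
        exps = [math.exp(s - row_max) for s in row]
        total = sum(exps)
        weights = [e / total for e in exps]
        out.append([
            sum(w * v_row[j] for w, v_row in zip(weights, v))
            for j in range(len(v[0]))
        ])
    return out, len(scores) * len(k)


def sliced_attention(q, k, v, slice_size):
    """Process the queries slice_size rows at a time (no mask support).

    Returns the attention output and the peak number of score elements
    alive at once. slice_size=len(q) reproduces unsliced attention.
    """
    scale = len(q[0]) ** -0.5
    out, peak = [], 0
    for start in range(0, len(q), slice_size):
        block_out, n_scores = attend(q[start:start + slice_size], k, v, scale)
        out.extend(block_out)
        peak = max(peak, n_scores)
    return out, peak


# Toy data: 6 tokens, head dimension 4, value dimension 2.
q = [[0.1 * (i + j) for j in range(4)] for i in range(6)]
k = [[0.2 * (i - j) for j in range(4)] for i in range(6)]
v = [[float(i), float(i * i)] for i in range(6)]

full, full_peak = sliced_attention(q, k, v, slice_size=len(q))
sliced, sliced_peak = sliced_attention(q, k, v, slice_size=2)
```

Each query row's softmax only depends on that row's scores, so slicing along the query axis changes peak memory but not the output. The trade-off is more, smaller steps, which can cost some speed.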

TingTingin commented 2 years ago

How does this compare to optimized?

TingTingin commented 2 years ago

Also, is it possible to set a memory limit, either at a certain percentage below max memory or at a fixed number? I've noticed that generation, and the entire PC, can slow down when you're close to your card's max VRAM. It also makes recording videos with OBS very choppy. Is it possible to set a max limit to ensure you're always under it, at the cost of generation speed?

AscendedGravity commented 2 years ago

We'll want to make sure to check for conflicts or scrap https://github.com/hlky/stable-diffusion/pull/262 when the time comes.

swfsql commented 2 years ago

@TingTingin I don't know how the libraries work, so I can't really say. I just copied the other guy's changes, but I still believe it can be optimized even further. Maybe we could render something as big as we wanted, or something like that.

And yes, I do believe that improving this even further would make it possible to set memory caps, which would be pretty neat.
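As a rough illustration of what such a cap could look like, the sketch below picks how many query slices are needed so that the attention score matrix for one slice fits under a byte budget. Nothing here exists in this PR or in any library: `slices_for_budget` and all of its parameters are hypothetical, and it only accounts for the score matrix, not the model's other memory use.

```python
def slices_for_budget(batch_heads, n_queries, n_keys, budget_bytes,
                      bytes_per_element=4):
    """Number of query slices needed to keep the score matrix under budget.

    Hypothetical helper. Assumes the dominant allocation is a
    batch_heads x n_queries x n_keys score tensor and that slicing
    happens along the query axis.
    """
    if budget_bytes <= 0:
        raise ValueError("budget_bytes must be positive")
    required = batch_heads * n_queries * n_keys * bytes_per_element
    slices = max(1, -(-required // budget_bytes))  # integer ceiling division
    # One query row per slice is the finest split this scheme allows.
    return min(slices, n_queries)


# Illustrative numbers only: a 704x704 image, assuming the 8x latent
# downsampling of Stable Diffusion v1, 8 attention heads, fp32 scores,
# and a 512 MiB budget for the score matrix.
tokens = (704 // 8) ** 2
n_slices = slices_for_budget(8, tokens, tokens, 512 * 1024 ** 2)
rows_per_slice = -(-tokens // n_slices)
```

A lower budget means more slices and therefore slower generation, which matches the trade-off asked about above.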

AscendedGravity commented 2 years ago

I get around the same results with this PR compared to my testing on #262.

This PR - Took 79.4s total (79.4s per image) Peak memory usage: 6128 MiB / 8192 MiB / 74.803%

PR 262 - Took 77.45s total (77.45s per image) Peak memory usage: 5941 MiB / 8192 MiB / 72.521%

I haven't looked through the changes to compare them, so I couldn't say which is better.