AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

Implement the g-diffuser in/outpainting methods #309

Closed: ilikenwf closed this issue 1 year ago

ilikenwf commented 1 year ago

Current Implementation by @parlance-zz https://github.com/parlance-zz/g-diffuser-bot/tree/g-diffuser-bot-diffuserslib-beta

Examples: https://old.reddit.com/r/StableDiffusion/comments/xbjnnu/huge_outpainting_in_1_step_without_erased_colors/

The Author's Description:

Explanation: Getting good results in/out-painting with stable diffusion can be challenging. Although there are simpler effective solutions for in-painting, out-painting can be especially challenging because there is no color data in the masked area to help prompt the generator. Ideally, even for in-painting, we'd like to work effectively without that data as well. Provided here is my take on a potential solution to this problem.

By taking a Fourier transform of the masked source image we get a function that tells us the presence and orientation of each feature scale in the unmasked source. Shaping the init/seed noise for in/out-painting to the same distribution of feature scales, orientations, and positions increases output coherence by helping keep features aligned. This technique is applicable to any continuous generation task such as audio or video, each of which can be conceptualized as a series of out-painting steps where the last half of the input "frame" is erased. For multi-channel data such as color or stereo sound, the "color tone" or histogram of the seed noise can be matched to improve quality (using scikit-image currently). This method is quite robust and has the added benefit of being fast regardless of the size of the out-painted area. The effects of this method include things like helping the generator integrate the pre-existing view distance and camera angle.
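To make the idea above concrete, here is a minimal NumPy sketch of spectrum-shaped seed noise for a single-channel image. It is not taken from the g-diffuser code: the function name `shaped_noise`, its parameters, and the final range rescaling are assumptions made for illustration, and it matches only the Fourier magnitude of the masked source (feature scales and orientations), not positions. The per-channel histogram matching mentioned in the description is left out.

```python
import numpy as np

def shaped_noise(src, mask, seed=0):
    """Hypothetical helper: fill the erased area with spectrum-shaped noise.

    src  : float array (H, W), values in [0, 1]
    mask : float array (H, W), 1 where pixels are known, 0 where erased
    """
    rng = np.random.default_rng(seed)

    # Magnitude spectrum of the masked source: how strongly each
    # feature scale and orientation is present in the known pixels.
    src_mag = np.abs(np.fft.fft2(src * mask))

    # White Gaussian noise supplies the random phases.
    noise_fft = np.fft.fft2(rng.standard_normal(src.shape))

    # Reweight the noise spectrum by the source magnitude, then go back
    # to image space. The product is Hermitian, so the result is real
    # up to floating-point error.
    shaped = np.real(np.fft.ifft2(noise_fft * src_mag))

    # Assumption: rescale the noise to the value range of the known pixels.
    known = src[mask > 0]
    shaped = (shaped - shaped.min()) / (shaped.max() - shaped.min() + 1e-12)
    shaped = shaped * (known.max() - known.min()) + known.min()

    # Keep the known pixels and use shaped noise only in the erased area.
    return mask * src + (1.0 - mask) * shaped
```

The cost is two FFTs and one inverse FFT over the whole frame, which is consistent with the claim that speed does not depend on how large the erased area is. The result would then be passed to the generator as the init image in place of a blank or flat-colored masked area.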

ilikenwf commented 1 year ago

This and #301 are duplicates, I'll close this one.