BrutPitt / glslSmartDeNoise

Fast GLSL deNoise spatial filter with a circular Gaussian kernel, fully configurable
BSD 2-Clause "Simplified" License

Is this shader separable? #3

Open bschwind opened 3 years ago

bschwind commented 3 years ago

I noticed this shader is quite similar to a Gaussian blur, which is a separable filter: blurring along the X axis only, and then blurring that result along the Y axis only, gives the same result as a full 2D convolution.

Do you think this could be turned into a separable version? I'm attempting that now, but my math may be wrong and I'm not 100% sure it's even possible. If I come up with something I'll be sure to submit a PR, but I'm curious whether you've already evaluated this.
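
For reference, a minimal sketch of the separability property described above, written in plain Python rather than GLSL so it can be run without a GPU context. The function names, the test image, and the clamp-to-edge addressing are assumptions made for illustration; none of them come from the repository. It checks that two 1D Gaussian passes reproduce the full 2D convolution up to rounding.

```python
import math

def gauss1d(sigma, radius):
    """Normalized 1D Gaussian weights for integer offsets -radius..radius."""
    w = [math.exp(-(r * r) / (2.0 * sigma * sigma)) for r in range(-radius, radius + 1)]
    total = sum(w)
    return [v / total for v in w]

def clamp(i, n):
    """Clamp-to-edge addressing (an assumption; any per-axis rule would do)."""
    return max(0, min(n - 1, i))

def blur_2d(img, k):
    """Full 2D convolution with the kernel k[dy] * k[dx]."""
    h, w, rad = len(img), len(img[0]), len(k) // 2
    return [[sum(k[dy + rad] * k[dx + rad] * img[clamp(y + dy, h)][clamp(x + dx, w)]
                 for dy in range(-rad, rad + 1)
                 for dx in range(-rad, rad + 1))
             for x in range(w)]
            for y in range(h)]

def blur_1d(img, k, step):
    """One 1D pass; step is (1, 0) for horizontal or (0, 1) for vertical."""
    h, w, rad = len(img), len(img[0]), len(k) // 2
    sx, sy = step
    return [[sum(k[r + rad] * img[clamp(y + r * sy, h)][clamp(x + r * sx, w)]
                 for r in range(-rad, rad + 1))
             for x in range(w)]
            for y in range(h)]

def max_abs_diff(a, b):
    """Largest per-pixel absolute difference between two images."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

# Small deterministic test image (hypothetical data, values in [0, 1]).
image = [[((7 * x + 13 * y) % 11) / 10.0 for x in range(8)] for y in range(6)]
kernel = gauss1d(sigma=1.5, radius=3)

full = blur_2d(image, kernel)
two_pass = blur_1d(blur_1d(image, kernel, (1, 0)), kernel, (0, 1))
separable_error = max_abs_diff(full, two_pass)
print(separable_error)  # only floating-point rounding noise is expected
```

The two-pass version costs 2 * (2 * radius + 1) samples per pixel instead of (2 * radius + 1)^2, which is the motivation for asking whether the denoiser can be split the same way.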

BrutPitt commented 3 years ago

Hi, and thanks for this observation. Before replying, I wanted to review my earlier experiments with separability, so I apologize for the delay in answering.

My attempts to find a separable form have not been successful, but that doesn't mean the filter can't be separable. I think the trouble is the "walking" pixel (compared against the central pixel) within the spatial sigma "area":

            vec4 walkPx =  texture(origTexture, vTexCoord+d*invScreenSize);
            vec4 dC = walkPx-centrPx;
            float deltaFactor = exp( -dot(dC, dC) * invBSigmaSqx2) * invBSigmaxSqrt2PI * blurFactor;

But I would be glad if you could find a solution to this.
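
A minimal sketch of why the "walking" pixel term gets in the way, again in plain Python. It builds the per-offset weights around one central pixel and tests whether they form an outer product a[dy] * b[dx], which is what a separable kernel requires. The grayscale 3x3 patch, the function names, and the parameter values are assumptions for illustration (the shader works on vec4 colors and uses dot(dC, dC)); constant normalization factors are dropped because they do not affect separability.

```python
import math

def gauss(x, s):
    """Unnormalized Gaussian exp(-x^2 / (2 * s^2))."""
    return math.exp(-(x * x) / (2.0 * s * s))

def spatial_weights(radius, sigma):
    """Spatial term only: w[dy][dx] = G(dx) * G(dy)."""
    offs = range(-radius, radius + 1)
    return [[gauss(dx, sigma) * gauss(dy, sigma) for dx in offs] for dy in offs]

def denoise_weights(patch, sigma, threshold):
    """Spatial term times the range term exp(-dC^2 / (2 * threshold^2)),
    where dC is the walking pixel minus the central pixel of the patch."""
    radius = len(patch) // 2
    centre = patch[radius][radius]
    spatial = spatial_weights(radius, sigma)
    return [[spatial[j][i] * gauss(patch[j][i] - centre, threshold)
             for i in range(len(patch))]
            for j in range(len(patch))]

def is_separable(w, tol=1e-12):
    """A kernel is an outer product a[dy] * b[dx] exactly when every
    2x2 minor of its weight matrix vanishes (rank at most one)."""
    n = len(w)
    return all(abs(w[a][c] * w[b][d] - w[a][d] * w[b][c]) <= tol
               for a in range(n) for b in range(n)
               for c in range(n) for d in range(n))

# Flat grayscale patch with one bright pixel in a corner (hypothetical data).
patch = [[0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0],
         [0.0, 0.0, 1.0]]

gaussian_ok = is_separable(spatial_weights(1, sigma=1.0))
denoise_ok = is_separable(denoise_weights(patch, sigma=1.0, threshold=0.5))
print(gaussian_ok, denoise_ok)  # prints: True False
```

The single off-axis outlier is enough to break the outer-product structure, because its weight depends on both offsets at once. On a perfectly flat patch the range term is 1 everywhere and the weights fall back to the separable Gaussian.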

Some of my attempts at separability produce this effect: Screenshot at 2021-11-04 01-48-22

... or this: Screenshot at 2021-11-04 01-49-11

... depending on whether I start with the horizontal or the vertical pass.

This was my best (and correct?) attempt: Screenshot at 2021-11-05 01-10-20

This is my separable function, corresponding to the image above (left: original, right: processed):

// used multi-draw Framebuffer
layout (location = 0) out vec4 color;
layout (location = 1) out vec4 auxThres;
layout (location = 2) out vec4 auxDelta; // not used in this example

layout (binding=1) uniform sampler2D imageData; // main picture
layout (binding=2) uniform sampler2D blurPass;    // color output (1pass)
layout (binding=3) uniform sampler2D thresPass;  // auxThres output

in vec2 vTexCoord; // pass vertex coords [0,1] from vertexshader

// same as in the denoise shader
uniform float uSigma;
uniform float uThreshold;
uniform float uSlider;
uniform float uKSigma;
uniform vec2 wSize;

uniform int pass; // pass number

#define INV_SQRT_OF_2PI 0.39894228040143267793994605993439  // 1.0/SQRT_OF_2PI
#define INV_PI 0.31830988618379067153776752674503

// blurPass is color output of pass 1
void gPassC(sampler2D tex, vec2 direction, float sigma, float kSigma, float threshold, int pass)
{
    //compute the radius across the kernel
    float accumBlur   = pass == 1 ? 0.0 : texture(blurPass,vTexCoord).a;
    float accumThres = pass == 1 ? 0.0 : texture(thresPass,vTexCoord).r;
    //vec4 accumBlur   = vec4(0.0);
    vec4 accumBuff  = pass == 1 ? vec4(0.0) : texture(blurPass,vTexCoord);

    float radius = kSigma*sigma;
    float radQ = radius * radius;

    float invSigmaQx2 = .5 / (sigma * sigma);      // 1.0 / (2.0 * sigma^2)
    float invSigmaQx2PI = INV_PI * invSigmaQx2;    // 1.0 / (2.0 * PI * sigma^2)

    float invThresholdSqx2 = .5 / (threshold * threshold);     // 1.0 / (2.0 * threshold^2)
    float invThresholdSqrt2PI = INV_SQRT_OF_2PI / threshold;   // 1.0 / (sqrt(2.0 * PI) * threshold)

    vec4 centrPx = texture(imageData,vTexCoord);
    vec2 invScreenSize = 1.0/vec2(textureSize(imageData, 0));

    // separable Gaussian
    for( float r = -radius; r <= radius; r++) {
        vec2 dir = r * direction * invScreenSize;
        float blurFactor = exp( -(r*r) * invSigmaQx2 ) * invSigmaQx2PI;

        vec4 walkPx =  texture(imageData,vTexCoord+dir);
        vec4 walkPxBlur =  texture(tex,vTexCoord+dir);

        vec4 dC = walkPx-centrPx;
        float deltaFactor = exp( -dot(dC, dC) * invThresholdSqx2) * invThresholdSqrt2PI * blurFactor;

        accumBlur  +=  blurFactor;
        accumBuff  +=  deltaFactor*walkPx;
        accumThres +=  deltaFactor;
    }
    auxThres   = vec4(accumThres);
    color      = vec4(accumBuff.rgb, accumBlur); 
} 
// some code is left over from previous attempts and is not optimized (of course)

Calls:

// Pass1
            gPassC(imageData, vec2(0.0, 1.0),  uSigma, uKSigma, uThreshold, 1);
//Pass2
            gPassC(thresPass, vec2(1.0, 0.0),  uSigma, uKSigma, uThreshold, 2);

            color = vec4((color.rgb) / auxThres.r, 1.0);

I hope this helps streamline your work and your own attempts.

bschwind commented 3 years ago

Hi @BrutPitt! Thanks for the investigation into this. It's looking close to the correct version, though it gives an interesting "fabric"-esque pattern, which is not quite the desired result.

For some reason I didn't even think to do both separable passes in one shader call; that makes a lot more sense. I'm used to doing Gaussian blurs as two separate draw calls, so I started with that, but I ran into trouble preserving the accumulator buffers in intermediate textures. I'll try your approach with the code I currently have and see what comes of it.

I'll have more time to dedicate to this next week, but I'll update you if I come up with anything good!

BrutPitt commented 3 years ago

Hi again, @bschwind.

I'm sorry; I probably took the use of multiple shader passes for granted, and I wasn't clear enough when summarizing my calls. My main() function is this:

void main(void)
{
    float slide = uSlider *.5 + .5;
    float szSlide = .001;
    vec2 uv = vec2(gl_FragCoord.xy / wSize);

    vec4 c;

    switch(pass) {
        case 1:
            gPassC(imageData, vec2(0.0, 1.0),  uSigma, uKSigma, uThreshold, 1);
            break;
        case 2:
            gPassC(thresPass, vec2(1.0, 0.0),  uSigma, uKSigma, uThreshold, 2);
            color = vec4((color.rgb) / auxThres.r, 1.0);
            break;
    }
    color = uv.x<slide-szSlide  ? texture(imageData, vec2(uv.x,uv.y)) : (uv.x>slide+szSlide ? color :  vec4(1.0));
}

Obviously, I call the shader twice, with a different pass value each time (first = 1, second = 2).

Still on the subject of separability and the "pattern" in the image: I think the trouble is precisely the "walking" pixel (as I wrote), or rather the difference between the "walking" pixel and the central pixel. That difference needs to be evaluated over a 2D spatial area (circular or square), but with the separable version we only evaluate it along the two main lines (horizontal and vertical), hence the "cross" pattern in my last image.

I'm not a theoretical mathematician, just a software engineer, so to check my intuition I tried to implement it anyway (unsuccessfully, as you have seen). I'm not 100% sure of this statement either, and I don't rule out that there may be a way of doing this, perhaps a "non-canonical" one.
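
The argument above can be written out as a short formula sketch. The notation is assumed for illustration and does not come from the repository: p is the central pixel, d = (d_x, d_y) is the offset to the "walking" pixel, sigma stands for uSigma and t for uThreshold. Constant normalization factors are omitted because they cancel in the final division.

```latex
% Filter output: a normalized weighted sum over the offsets d in the kernel area.
\[
  \hat{I}(p) = \frac{\sum_{d} w_p(d)\, I(p+d)}{\sum_{d} w_p(d)},
  \qquad
  w_p(d) =
  \underbrace{e^{-\lVert d \rVert^{2} / 2\sigma^{2}}}_{\text{spatial term}}
  \;
  \underbrace{e^{-\lVert I(p+d) - I(p) \rVert^{2} / 2t^{2}}}_{\text{range term}}
\]

% The spatial term factors per axis; the range term depends on the image
% content at the joint offset (d_x, d_y) and in general does not factor.
\[
  e^{-\lVert d \rVert^{2} / 2\sigma^{2}}
  = e^{-d_x^{2} / 2\sigma^{2}} \, e^{-d_y^{2} / 2\sigma^{2}},
  \qquad
  e^{-\lVert I(p + (d_x, d_y)) - I(p) \rVert^{2} / 2t^{2}}
  \neq a(d_x)\, b(d_y) \quad \text{in general}
\]
```

Under this reading, a horizontal pass followed by a vertical pass only ever compares the central pixel with pixels on its own row and column, so off-axis differences never enter the weights. That is consistent with the "cross" pattern described above, though it is not a proof that no other decomposition exists.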

bschwind commented 2 years ago

Thanks again for the starter attempts! I still haven't had time to really take a crack at this but I'll be sure to update the thread when I do!