aki-null / GaussianBlurURP

Simple separable Gaussian blur function for URP
MIT License

Gaussian Blur for URP

Example output

Optimized Gaussian blur function with configurable sigma and radius.

  1. Play around with sigma and radius values to find the optimal parameters.
  2. Replace the sigma and radius parameters with literal values to let the Unity shader compiler produce efficient code.
  3. Add #define GAUSSIAN_BLUR_UNROLL 1 before including this file.
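When experimenting in step 1, it can help to see how much of the Gaussian a given radius actually covers. The sketch below is not part of GaussianBlur.hlsl. It assumes the usual discrete weights exp(-x^2 / (2 * sigma^2)) at integer texel offsets, and the helper names are made up for illustration.

```python
import math

def gaussian_weights(sigma, radius):
    """Unnormalized Gaussian weights at integer offsets -radius..radius."""
    return [math.exp(-(x * x) / (2.0 * sigma * sigma))
            for x in range(-radius, radius + 1)]

def coverage(sigma, radius, far=64):
    """Fraction of the discrete Gaussian weight that falls inside the radius.

    `far` is a cutoff standing in for "infinity"; it is assumed to be
    much larger than sigma.
    """
    inside = sum(gaussian_weights(sigma, radius))
    total = sum(gaussian_weights(sigma, max(radius, far)))
    return inside / total

# Compare a few radii for sigma = 1 to see where extra taps stop paying off.
for radius in (1, 2, 3, 4):
    print(radius, round(coverage(1.0, radius), 4))
```

If the coverage is already close to 1, a larger radius mostly adds texture samples without visibly changing the result.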

Usage

Include this shader file (path may vary):

// #define GAUSSIAN_BLUR_UNROLL 1
#include "Assets/Shaders/GaussianBlur.hlsl"

2 Pass

This variant is highly recommended if you have a temporary texture available for ping-pong buffering.

The number of texture samples needed for each pass is radius + 1.

For example, a 5x5 kernel has a radius of 2, which means 3 texture samples are needed for each pass.
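One common way to reach radius + 1 samples is to let the bilinear filter fetch two neighbouring texels at once. The sketch below assumes that technique, an even radius, and a sampler with bilinear filtering; `merged_taps` is a hypothetical helper, not a function from GaussianBlur.hlsl. For sigma = 1 and radius = 2 it yields the offset 1.1824255 that also appears in the 1 Pass listing under Example Code Generated.

```python
import math

def merged_taps(sigma, radius):
    """Return (offset, weight) pairs for one 1D pass.

    Assumption: radius is even, so taps 1..radius pair up as (1,2), (3,4), ...
    """
    if radius % 2 != 0:
        raise ValueError("this sketch only handles even radii")
    w = [math.exp(-(x * x) / (2.0 * sigma * sigma)) for x in range(radius + 1)]
    total = w[0] + 2.0 * sum(w[1:])
    w = [v / total for v in w]  # normalize so the full kernel sums to 1

    taps = [(0.0, w[0])]  # the centre texel is sampled on its own
    for i in range(1, radius, 2):
        pair = w[i] + w[i + 1]
        # Sampling between texels i and i+1 blends them in proportion
        # to their weights, so one fetch stands in for two taps.
        offset = (i * w[i] + (i + 1) * w[i + 1]) / pair
        taps.append((offset, pair))
        taps.append((-offset, pair))
    return taps

for offset, weight in merged_taps(1.0, 2):
    print(round(offset, 7), round(weight, 7))
```

Each offset is measured in texels, so in a shader it would be multiplied by the texel size before being added to the UV.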

First pass:

return GaussianBlurHorizontal(_SourceTex, sampler_SourceTex, _SourceTex_TexelSize, i.uv, 1, 3);

Second pass:

return GaussianBlurVertical(_SourceTex, sampler_SourceTex, _SourceTex_TexelSize, i.uv, 1, 3);
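Splitting the blur into a horizontal and a vertical pass works because a 2D Gaussian kernel is the product of two 1D kernels. The CPU-side sketch below checks this on a small image; it uses plain integer taps with clamped borders, and all function names are illustrative rather than taken from the shader.

```python
import math

def kernel_1d(sigma, radius):
    """Normalized 1D Gaussian weights at offsets -radius..radius."""
    w = [math.exp(-(x * x) / (2.0 * sigma * sigma))
         for x in range(-radius, radius + 1)]
    s = sum(w)
    return [v / s for v in w]

def blur_rows(img, k):
    """Convolve every row with k, clamping reads at the image border."""
    r = len(k) // 2
    width = len(img[0])
    out = []
    for row in img:
        out.append([sum(k[j + r] * row[min(max(x + j, 0), width - 1)]
                        for j in range(-r, r + 1))
                    for x in range(width)])
    return out

def transpose(img):
    return [list(col) for col in zip(*img)]

def blur_two_pass(img, k):
    """Horizontal pass, then vertical pass (done as rows of the transpose)."""
    horizontal = blur_rows(img, k)
    return transpose(blur_rows(transpose(horizontal), k))

def blur_single_pass(img, k):
    """Direct 2D convolution with the outer product of k with itself."""
    r = len(k) // 2
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(-r, r + 1):
                sy = min(max(y + dy, 0), h - 1)
                for dx in range(-r, r + 1):
                    sx = min(max(x + dx, 0), w - 1)
                    acc += k[dy + r] * k[dx + r] * img[sy][sx]
            out[y][x] = acc
    return out

img = [[float((x * 3 + y * 5) % 7) for x in range(6)] for y in range(5)]
k = kernel_1d(1.0, 2)
a, b = blur_two_pass(img, k), blur_single_pass(img, k)
print(max(abs(a[y][x] - b[y][x]) for y in range(5) for x in range(6)))
```

The printed difference should be at floating-point noise level, while the two-pass version reads far fewer texels per output pixel.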

1 Pass

This variant performs the Gaussian blur in a single pass. It is useful when preparing a temporary texture for the 2 pass approach is impossible or too difficult.

The number of texture samples needed is (radius + 1)^2.

For example, a 5x5 kernel has a radius of 2, which means 9 texture samples are needed in this case.

This is significantly fewer than the naive implementation, which requires (radius * 2 + 1)^2 samples, but it is still an O(n^2) algorithm in the radius.
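To put the three formulas quoted in this README side by side, the short sketch below just evaluates them for a few even radii. The function names are illustrative, and the 2 pass figure is the total over both passes.

```python
def samples_naive(radius):
    """One sample per kernel texel: (radius * 2 + 1)^2."""
    return (2 * radius + 1) ** 2

def samples_single_pass(radius):
    """Single-pass variant as stated above: (radius + 1)^2."""
    return (radius + 1) ** 2

def samples_two_pass(radius):
    """2 pass variant: radius + 1 per pass, two passes."""
    return 2 * (radius + 1)

for radius in (2, 4, 6):
    print(radius, samples_naive(radius),
          samples_single_pass(radius), samples_two_pass(radius))
# prints:
# 2 25 9 6
# 4 81 25 10
# 6 169 49 14
```

The gap between the single-pass and 2 pass columns widens quickly, which is why the 2 pass variant is preferred whenever a temporary texture is available.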

return GaussianBlurSingle(_SourceTex, sampler_SourceTex, _SourceTex_TexelSize, i.uv, 1, 2);

Why

I could no longer be bothered to write kernel calculation code on the CPU and pass parameters to the shader just to experiment with sigma and radius, so I wrote this to let the Unity shader compiler generate the code I want.

I also saw unoptimized Gaussian blur implementations in many places, which led me to write this.

Notes

Example Code Generated

2 Pass

u_xlat0.x = 0.0;
u_xlat0.y = _SourceTex_TexelSize.y;
u_xlat1 = vec4(-5.35180569, -5.35180569, -3.40398479, -3.40398479) * u_xlat0.xyxy + vs_TEXCOORD0.xyxy;
u_xlat16_2 = texture(_SourceTex, u_xlat1.zw);
u_xlat16_1 = texture(_SourceTex, u_xlat1.xy);
u_xlat2 = u_xlat16_2 * vec4(0.139440298, 0.139440298, 0.139440298, 0.139440298);
u_xlat1 = u_xlat16_1 * vec4(0.0527109653, 0.0527109653, 0.0527109653, 0.0527109653) + u_xlat2;
u_xlat2 = vec4(-1.45842957, -1.45842957, 1.45842957, 1.45842957) * u_xlat0.xyxy + vs_TEXCOORD0.xyxy;
u_xlat0 = vec4(3.40398479, 3.40398479, 5.35180569, 5.35180569) * u_xlat0.xyxy + vs_TEXCOORD0.xyxy;
u_xlat16_3 = texture(_SourceTex, u_xlat2.xy);
u_xlat16_2 = texture(_SourceTex, u_xlat2.zw);
u_xlat1 = u_xlat16_3 * vec4(0.239337295, 0.239337295, 0.239337295, 0.239337295) + u_xlat1;
u_xlat16_3 = texture(_SourceTex, vs_TEXCOORD0.xy);
u_xlat1 = u_xlat16_3 * vec4(0.137022808, 0.137022808, 0.137022808, 0.137022808) + u_xlat1;
u_xlat1 = u_xlat16_2 * vec4(0.239337295, 0.239337295, 0.239337295, 0.239337295) + u_xlat1;
u_xlat16_2 = texture(_SourceTex, u_xlat0.xy);
u_xlat16_0 = texture(_SourceTex, u_xlat0.zw);
u_xlat1 = u_xlat16_2 * vec4(0.139440298, 0.139440298, 0.139440298, 0.139440298) + u_xlat1;
u_xlat0 = u_xlat16_0 * vec4(0.0527109653, 0.0527109653, 0.0527109653, 0.0527109653) + u_xlat1;
SV_Target0 = u_xlat0;

1 Pass

u_xlat0 = _SourceTex_TexelSize.xyxy * vec4(-1.1824255, -1.1824255, 0.0, -1.1824255) + vs_TEXCOORD0.xyxy;
u_xlat16_1 = texture(_SourceTex, u_xlat0.zw);
u_xlat16_0 = texture(_SourceTex, u_xlat0.xy);
u_xlat1 = u_xlat16_1 * vec4(0.12025857, 0.12025857, 0.12025857, 0.12025857);
u_xlat0 = u_xlat16_0 * vec4(0.0892157406, 0.0892157406, 0.0892157406, 0.0892157406) + u_xlat1;
u_xlat1 = _SourceTex_TexelSize.xyxy * vec4(1.1824255, -1.1824255, -1.1824255, 0.0) + vs_TEXCOORD0.xyxy;
u_xlat16_2 = texture(_SourceTex, u_xlat1.xy);
u_xlat16_1 = texture(_SourceTex, u_xlat1.zw);
u_xlat0 = u_xlat16_2 * vec4(0.0892157406, 0.0892157406, 0.0892157406, 0.0892157406) + u_xlat0;
u_xlat0 = u_xlat16_1 * vec4(0.12025857, 0.12025857, 0.12025857, 0.12025857) + u_xlat0;
u_xlat16_1 = texture(_SourceTex, vs_TEXCOORD0.xy);
u_xlat0 = u_xlat16_1 * vec4(0.162102833, 0.162102833, 0.162102833, 0.162102833) + u_xlat0;
u_xlat1 = _SourceTex_TexelSize.xyxy * vec4(1.1824255, 0.0, -1.1824255, 1.1824255) + vs_TEXCOORD0.xyxy;
u_xlat16_2 = texture(_SourceTex, u_xlat1.xy);
u_xlat16_1 = texture(_SourceTex, u_xlat1.zw);
u_xlat0 = u_xlat16_2 * vec4(0.12025857, 0.12025857, 0.12025857, 0.12025857) + u_xlat0;
u_xlat0 = u_xlat16_1 * vec4(0.0892157406, 0.0892157406, 0.0892157406, 0.0892157406) + u_xlat0;
u_xlat1 = _SourceTex_TexelSize.xyxy * vec4(0.0, 1.1824255, 1.1824255, 1.1824255) + vs_TEXCOORD0.xyxy;
u_xlat16_2 = texture(_SourceTex, u_xlat1.xy);
u_xlat16_1 = texture(_SourceTex, u_xlat1.zw);
u_xlat0 = u_xlat16_2 * vec4(0.12025857, 0.12025857, 0.12025857, 0.12025857) + u_xlat0;
u_xlat0 = u_xlat16_1 * vec4(0.0892157406, 0.0892157406, 0.0892157406, 0.0892157406) + u_xlat0;
SV_Target0 = u_xlat0;
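Neither listing states which sigma and radius produced it. As a rough check, and assuming the first non-zero offset is the weighted position of taps 1 and 2 merged into one bilinear fetch, sigma can be recovered from that offset alone. The helper below is a hypothetical sketch built on that assumption.

```python
import math

def sigma_from_first_offset(offset):
    """Infer sigma from the merged offset of taps 1 and 2.

    Assumes offset = (1*w1 + 2*w2) / (w1 + w2)
    with w_x = exp(-x^2 / (2 * sigma^2)).
    """
    f = offset - 1.0       # equals w2 / (w1 + w2)
    ratio = f / (1.0 - f)  # equals w2 / w1 = exp(-3 / (2 * sigma^2))
    return math.sqrt(-3.0 / (2.0 * math.log(ratio)))

print(round(sigma_from_first_offset(1.45842957), 3))  # 2 Pass listing, prints 3.0
print(round(sigma_from_first_offset(1.1824255), 3))   # 1 Pass listing, prints 1.0
```

Under the same assumption, the 7 texture() calls in the 2 Pass listing would correspond to a radius of 6, and the 9 calls in the 1 Pass listing to a radius of 2, which matches the GaussianBlurSingle(..., 1, 2) call shown earlier.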