the approximation formula for screen-space derivatives duv/dxy

iszihan commented 2 years ago

Hello, I'm looking into the geometry shader code that outputs the screen-space barycentric derivatives, and am a little confused by the formulation. I'm wondering if anyone would be able to clarify this mathematical approximation a little bit? Some questions I had are

why do we multiply by w when computing the area ?
Why do we need to define u in the u/w / 1/w way to derive it? In this case, why is d(u/w)/dX constant -- shouldn't u change with X?
I am mainly puzzled by the formulation of the following, and how they eventually lead to the formulation of duv/dxy for each vertex.
float duwdx = w2 dudx; float dvwdx = w2 dvdx; float duvdx = w0 dudx + w1 dvdx; float duwdy = w2 dudy; float dvwdy = w2 dvdy; float duvdy = w0 dudy + w1 dvdy; ` // Outputs: // var_uvzw: (u,v,z,w) // var_db: (du/dx,du/dy,dv/dx,dv/dy)

// Set up geometry shader. Calculation of per-pixel bary differentials is based on: // u = (u/w) / (1/w) // --> du/dX = d((u/w) / (1/w))/dX // --> du/dX = [d(u/w)/dX - ud(1/w)/dX] w // and we know both d(u/w)/dX and d(1/w)/dX are constant over triangle.compileGLShader(NVDR_CTX_PARAMS, &s.glGeometryShader, GL_GEOMETRY_SHADER,

 "#version 430\n"
        STRINGIFY_SHADER_SOURCE(
            layout(triangles) in;
            layout(triangle_strip, max_vertices=3) out;
            layout(location = 0) uniform vec2 vp_scale;
            in int v_layer[];
            in int v_offset[];
            out vec4 var_uvzw;
            out vec4 var_db;
            void main()
            {
                // Plane equations for bary differentials.
                float w0 = gl_in[0].gl_Position.w;
                float w1 = gl_in[1].gl_Position.w;
                float w2 = gl_in[2].gl_Position.w;
                vec2 p0 = gl_in[0].gl_Position.xy;
                vec2 p1 = gl_in[1].gl_Position.xy;
                vec2 p2 = gl_in[2].gl_Position.xy;
                vec2 e0 = p0*w2 - p2*w0;
                vec2 e1 = p1*w2 - p2*w1;
                float a = e0.x*e1.y - e0.y*e1.x; 

                // Clamp area to an epsilon to avoid arbitrarily high bary differentials.
                float eps = 1e-6f; // ~1 pixel in 1k x 1k image.
                float ca = (abs(a) >= eps) ? a : (a < 0.f) ? -eps : eps; // Clamp with sign.
                float ia = 1.f / ca; // Inverse area.

                vec2 ascl = ia * vp_scale;

                float dudx =  e1.y * ascl.x; 
                float dudy = -e1.x * ascl.y;
                float dvdx = -e0.y * ascl.x; 
                float dvdy =  e0.x * ascl.y;

                float duwdx = w2 * dudx; 
                float dvwdx = w2 * dvdx;
                float duvdx = w0 * dudx + w1 * dvdx;
                float duwdy = w2 * dudy;
                float dvwdy = w2 * dvdy;
                float duvdy = w0 * dudy + w1 * dvdy;

                vec4 db0 = vec4(duvdx - dvwdx, duvdy - dvwdy, dvwdx, dvwdy);
                vec4 db1 = vec4(duwdx, duwdy, duvdx - duwdx, duvdy - duwdy);
                vec4 db2 = vec4(duwdx, duwdy, dvwdx, dvwdy);

                int layer_id = v_layer[0];
                int prim_id = gl_PrimitiveIDIn + v_offset[0];

                gl_Layer = layer_id; gl_PrimitiveID = prim_id; gl_Position = vec4(gl_in[0].gl_Position.x, gl_in[0].gl_Position.y, gl_in[0].gl_Position.z, gl_in[0].gl_Position.w); var_uvzw = vec4(1.f, 0.f, gl_in[0].gl_Position.z, gl_in[0].gl_Position.w); var_db = db0; EmitVertex();
                gl_Layer = layer_id; gl_PrimitiveID = prim_id; gl_Position = vec4(gl_in[1].gl_Position.x, gl_in[1].gl_Position.y, gl_in[1].gl_Position.z, gl_in[1].gl_Position.w); var_uvzw = vec4(0.f, 1.f, gl_in[1].gl_Position.z, gl_in[1].gl_Position.w); var_db = db1; EmitVertex();
                gl_Layer = layer_id; gl_PrimitiveID = prim_id; gl_Position = vec4(gl_in[2].gl_Position.x, gl_in[2].gl_Position.y, gl_in[2].gl_Position.z, gl_in[2].gl_Position.w); var_uvzw = vec4(0.f, 0.f, gl_in[2].gl_Position.z, gl_in[2].gl_Position.w); var_db = db2; EmitVertex();
            }
        )
    );

`

s-laine commented 2 years ago

Hi @iszihan. This computation isn't really approximative, but it yields the exact analytic screen-space derivatives of the barycentrics that are sampled at pixel centers at the fragment processing stage. The only approximation is clamping the area before taking the reciprocal to prevent arbitrarily large values near the horizon, as they will in turn produce arbitrarily large gradients that aren't great for optimization algorithms.

1 & 3. The intermediate variables are scaled by different combinations of w0, w1, w2 to avoid unnecessary divisions. This is not essential; you could start by dividing each point's (x, y) coordinates by the corresponding w to get NDC coordinates, compute area in that scale, and continue from there, but you will need a bunch of divisions by w at the end too. All these divisions can be eliminated by keeping track of how much of each w is implicitly multiplied into the intermediate values. For example, e0 has a scale of w0*w2 compared to the viewport-space value, a has a scale of w0*w1*w2*w2, etc., and ultimately duwdx and friends will end up with scales of 1/w0, 1/w1 and 1/w2. This is presumably a bit more efficient and precise than doing several divisions, although I haven't done any comparisons. The more readable approach would probably work just as well.

2. After perspective projection of a plane in 3D, any affine function f on the plane will be rational in screen-space in a way that f/w is precisely linear in screen space. From this it follows that u/w and 1/w are linear in screen space. I unfortunately don't have a great reference for this - in Paul Heckbert's master's thesis on texture mapping it is exploited in Section 2.3.5 and there may be a justification somewhere earlier in the thesis. There also appears to be a related derivation here in the context of vertex attribute interpolation.

iszihan commented 2 years ago

Thank you so much for the detailed response and references, would not have known this perspective correct interpolation otherwise! I'm trying to work out the formula for p2, and am still quite stuck of getting the formula in the shader. If you have time, would it be possible to write out the d(u/w) / d(1/w) formula w.r.t this pre-multiply by w's framework just for one of the vertices?

More detailed question for p2, du/dx = (d(u/w)/dx - u d(1/w)/dx) w2 = d(u/w)/dx w2, In the shader the output is set to the duwdx = dudx w2, yet shouldn't dudx have a ratio of 1/w0w2 (as e1 has a multiple of w1w2 and 1/a has a ratio of 1/(w0w1w2w2), and multiplying by w2 would still leave a ratio of 1/w0 here.

Please let me know if I am thinking of this in a wrong way. Thank you in advance!

s-laine commented 2 years ago

Here's the geometry shader code with divisions by w done at the beginning so that the edge functions and area are computed in NDC space. I have checked that this gives the same output as the code in the repo (modulo floating point rounding), at least for triangles that don't clip the camera plane, so hopefully this helps you forward.

void main()
{
    // Plane equations for bary differentials.
    float w0 = gl_in[0].gl_Position.w;
    float w1 = gl_in[1].gl_Position.w;
    float w2 = gl_in[2].gl_Position.w;
    vec2 p0 = gl_in[0].gl_Position.xy / w0;
    vec2 p1 = gl_in[1].gl_Position.xy / w1;
    vec2 p2 = gl_in[2].gl_Position.xy / w2;
    vec2 e0 = p0 - p2;
    vec2 e1 = p1 - p2;
    float a = e0.x*e1.y - e0.y*e1.x;

    // Clamp area to an epsilon to avoid arbitrarily high bary differentials.
    float eps = 1e-6f; // ~1 pixel in 1k x 1k image.
    float ca = (abs(a) >= eps) ? a : (a < 0.f) ? -eps : eps; // Clamp with sign.
    float ia = 1.f / ca; // Inverse area.

    vec2 ascl = ia * vp_scale;
    float dudx =  e1.y * ascl.x;
    float dudy = -e1.x * ascl.y;
    float dvdx = -e0.y * ascl.x;
    float dvdy =  e0.x * ascl.y;

    float duwdx = dudx / w0;
    float dvwdx = dvdx / w1;
    float duvdx = (dudx + dvdx) / w2;
    float duwdy = dudy / w0;
    float dvwdy = dvdy / w1;
    float duvdy = (dudy + dvdy) / w2;

    vec4 db0 = vec4(duvdx - dvwdx, duvdy - dvwdy, dvwdx, dvwdy);
    vec4 db1 = vec4(duwdx, duwdy, duvdx - duwdx, duvdy - duwdy);
    vec4 db2 = vec4(duwdx, duwdy, dvwdx, dvwdy);

    int layer_id = v_layer[0];
    int prim_id = gl_PrimitiveIDIn + v_offset[0];

    gl_Layer = layer_id; gl_PrimitiveID = prim_id; gl_Position = vec4(gl_in[0].gl_Position.x, gl_in[0].gl_Position.y, gl_in[0].gl_Position.z, gl_in[0].gl_Position.w); var_uvzw = vec4(1.f, 0.f, gl_in[0].gl_Position.z, gl_in[0].gl_Position.w); var_db = db0; EmitVertex();
    gl_Layer = layer_id; gl_PrimitiveID = prim_id; gl_Position = vec4(gl_in[1].gl_Position.x, gl_in[1].gl_Position.y, gl_in[1].gl_Position.z, gl_in[1].gl_Position.w); var_uvzw = vec4(0.f, 1.f, gl_in[1].gl_Position.z, gl_in[1].gl_Position.w); var_db = db1; EmitVertex();
    gl_Layer = layer_id; gl_PrimitiveID = prim_id; gl_Position = vec4(gl_in[2].gl_Position.x, gl_in[2].gl_Position.y, gl_in[2].gl_Position.z, gl_in[2].gl_Position.w); var_uvzw = vec4(0.f, 0.f, gl_in[2].gl_Position.z, gl_in[2].gl_Position.w); var_db = db2; EmitVertex();
}

Also, thanks to this exercise I noted that the clamping threshold is somewhat broken in the released version, as the 1e-6f epsilon assumes that the area is computed in NDC space (as is it here). I think I'll have to correct this in the next release.

iszihan commented 2 years ago

This clarifies things a lot. Thank you!!

minxuanjun commented 2 years ago

This clarifies things a lot. Thank you!! @s-laine Hello, I have difficulties understanding the following code about the barycentric differential interpolation,
" vec4 db0 = vec4(duvdx - dvwdx, duvdy - dvwdy, dvwdx, dvwdy); vec4 db1 = vec4(duwdx, duwdy, duvdx - duwdx, duvdy - duwdy); vec4 db2 = vec4(duwdx, duwdy, dvwdx, dvwdy); " could you help me out? Thank you!

NVlabs / nvdiffrast

the approximation formula for screen-space derivatives duv/dxy #49