AcademySoftwareFoundation / OpenShadingLanguage

Advanced shading language for production GI renderers
BSD 3-Clause "New" or "Revised" License
2.09k stars 356 forks source link

Implement new oren nayar model from OpenPBR #1817

Closed fpsunflower closed 4 months ago

fpsunflower commented 4 months ago

Description

This implements the "EON" energy conserving flavor of Oren-Nayar proposed to be part of OpenPBR and MaterialX. I've added it to the existing closure for now, but it would be easy to split it off as its own closure (the amount of shared logic is not that much, and we can easily use the same parameter struct). This is mainly to get the ball rolling on implementation.

Tests

I've tweaked the existing furnace test to test both variants. As expected, the new closure passes the furnace test prefectly while the old one had some issues.

Checklist:

fpsunflower commented 4 months ago

@jstone-lucasfilm and @portsmouth may be interested in this.

jstone-lucasfilm commented 4 months ago

This is great timing, thanks @fpsunflower, and I'm posting a link to my initial implementation of EON Diffuse in GLSL:

https://github.com/AcademySoftwareFoundation/MaterialX/pull/1822

Let's compare notes on these two implementations in upcoming days, and I'm happy to borrow elements from your version where we can improve alignment.

lgritz commented 4 months ago

Needs clang-format but otherwise LGTM. I'm leaving it for others to check the math.

Oh, maybe update the docs where the material closures are described: src/doc/stdlib.md

jstone-lucasfilm commented 4 months ago

@fpsunflower Two initial notes on differences between the current GLSL and OSL, where we should discuss the best approach on which to harmonize:

fpsunflower commented 4 months ago

energy_compensation sounds better. I've updated.

For the A/B terms, I'm following Jamie's recommendation to switch to the Fujii terms as well.

It is worth considering if it wouldn't just be simpler to have a second BSDF since what we are doing here is really more than just compensating the previous one.

portsmouth commented 4 months ago

I agree with "energy compensation" over "energy conservation".

As strictly speaking, the latter just means that energy is not gained. While the model actually ensures that energy is not lost in the white albedo case (i.e. energy preservation) -- via the compensation.

portsmouth commented 4 months ago

Note that we're working on improved importance sampling, using an LTC lobe. We get some decent reduction in variance from that, though we're still testing (and talking with @shill-lucasfilm about possible further improvements):

image

A self-contained GLSL implementation of the sampling function is below, if you want to try it. Of course, since this is WIP, for this PR you probably will want to just use regular cosine-weighted sampling and leave this improvement for later. Just thought it worth mentioning.

vec4 sample_cosine_lobe(inout uint rndSeed)
{
    float rh = sqrt(rand(rndSeed));
    float phi_h = 2.0 * PI * rand(rndSeed);
    float x = rh * cos(phi_h); float y = rh * sin(phi_h);
    float z = sqrt(max(0.0, 1.0 - x*x - y*y));
    return vec4(x, y, z, abs(z) / PI);
}

vec4 sample_EON(in vec3 wo, float roughness, inout uint rndSeed)
{
    vec4 wh = sample_cosine_lobe(rndSeed);         // wh sample
    float sinO = sqrt(1.0 - wo.z*wo.z);            // for LTC elements
    float a = roughness*pow(sinO, 50.0)*0.5 + 1.0; // LTC M element a
    float b = roughness*pow(sinO, 4.0)*sqrt(wo.z); // LTC M element b
    vec3 wi = vec3(a*wh.x + b*wh.z, a*wh.y, wh.z); // wi = M wh
    float l = 1.0/length(wi);                      // 1/|M wh| = |M^-1 wi|
    wi *= l;                                       // normalize wi
    float inv_det = 1.0/(a*a);                     // |M^-1|
    float J = inv_det / (l*l*l);                   // |M^-1| / |M^-1 wi|^3
    float pdf_i = wh.w * J;                        // PDF D(wi), Eqn.1 LTC paper
    float phi_o = atan(wo.y, wo.x);                // find azimuth of wo plane
    float C = cos(phi_o); float S = sin(phi_o);    // 2d rot matrix
    mat2 R = mat2(C, S, -S, C);                    // 2d rot matrix
    wi = vec3(R*wi.xz, wi.z);                      // rotate wi
    return vec4(wi, pdf_i);
}
jstone-lucasfilm commented 4 months ago

@fpsunflower @portsmouth I'm open to the idea that EON is a completely distinct diffuse model, especially if it uses an independent set of A and B terms, though I'd like to work through the additional infrastructure that this would require of renderers.

If we maintain independent A and B terms for EON and QON, I'm imagining that this would require independent directional albedo lookups for the two models as well (e.g. in the context of pre-integrated irradiance maps for real-time shading).

portsmouth commented 4 months ago

@fpsunflower @portsmouth I'm open to the idea that EON is a completely distinct diffuse model, especially if it uses an independent set of A and B terms, though I'd like to work through the additional infrastructure that this would require of renderers.

👍

If we maintain independent A and B terms for EON and QON, I'm imagining that this would require independent directional albedo lookups for the two models as well (e.g. in the context of pre-integrated irradiance maps for real-time shading).

Note that the directional albedo for EON is analytical, as given below in GLSL, so should be pretty easy to add:


const float pi = 3.14159265f;
const float constant1_FON = 0.5f - 2.0f / (3.0f * pi);
const float constant2_FON = 2.0f / 3.0f - 28.0f / (15.0f * pi);

  // FON directional albedo (approx.)
  float E_FON_approx(float mu, float roughness)
  {
      float sigma = roughness; // FON sigma prime
      float mucomp = 1.0f - mu;
      float mucomp2 = mucomp * mucomp;
      const mat2 Gcoeffs = mat2(0.0571085289f, -0.332181442f,
                                0.491881867f,   0.0714429953f);
      float GoverPi = dot(Gcoeffs * vec2(mucomp, mucomp2), vec2(1.0f, mucomp2));
      return (1.0f + sigma * GoverPi) / (1.0f + constant1_FON * sigma);
  }

  // EON directional albedo (approx)
  vec3 E_EON(vec3 rho, float roughness, vec3 wi_local)
  {
      float sigma = roughness;                           // FON sigma prime
      float mu_i = wi_local.z;                           // input angle cos
      float AF = 1.0f / (1.0f + constant1_FON * sigma);  // FON A coeff.
      float EF = E_FON_approx(mu_i, sigma);              // FON wi albedo (approx)
      float avgEF = AF * (1.0f + constant2_FON * sigma); // average albedo
      vec3 rho_ms = (rho * rho) * avgEF / (vec3(1.0f) - rho * (1.0f - avgEF));
      return rho * EF + rho_ms * (1.0f - EF);
  }
jstone-lucasfilm commented 4 months ago

@portsmouth This sounds reasonable to me, and I'll give this a try in MaterialX.

From a naming perspective, I'd lean towards continuing to use the name oren_nayar_diffuse_bsdf for this shading model, following in the tradition of continuing to use the name generalized_schlick_bsdf when Naty's F82 improvements were included.

I'd be interested in thoughts from others on the thread, and I'm open to counter-proposals for naming conventions if you have a compelling suggestion.

portsmouth commented 4 months ago

If it's going to be a single BSDF with non-energy-preserving (legacy) and energy-preserving modes, I think it's fine to keep the name as oren_nayar_diffuse_bsdf (as all we're doing is tweaking/improving the original BSDF really, and the name is a famous one so no need to reinvent it).

If you want to make it a separate BSDF, then some other name is needed, but I think that's more confusing.

fpsunflower commented 4 months ago

Maybe we should simply call it portsmouth_diffuse_brdf? or portsmouth_fujii_diffuse_brdf?

I would argue that tweaking fresnel is a different case than modifying the shape of the lobe. Also, IIRC the F82 change was mostly "backwards compatible" in that a default of white for F82 didn't change the look that much.

Here, there are basically 3 improvements we are taking:

In my implementation there weren't that many lines of code that could be shared in practice (even less so if we derive a custom sampling routine for this bsdf). So I think it probably would be cleanest to just make it a new brdf.

lgritz commented 4 months ago

Aside: I like how everybody seems to be allowed to have just one BSDF named after them per lifetime. Choose wisely!

jstone-lucasfilm commented 4 months ago

@fpsunflower @portsmouth I've updated the MaterialX GLSL implementation to use Fujii's constants and fixes when energy_compensation is enabled. Take a look when you have a chance, and we can continue comparing notes on these two initial implementations of EON.

etheory commented 4 months ago

Hi all, I tried to look for the Slack channel where this discussion started (only just became aware of it in the OSL TSC) but here was a spreadsheet I made a while back that does the same thing, not sure if it's the same, but it's also closed-form and very simple: https://www.desmos.com/calculator/extaj6cnzr (and reciprocal).

fpsunflower commented 4 months ago

@etheory it looks like your math aligns with what @portsmouth wrote up. Very cool :)

The MaterialX change was just merged, I will make a few final tweaks to handle rho correctly (had a TODO about this in the code) and merge this one as well.

etheory commented 2 weeks ago

Note that we're working on improved importance sampling, using an LTC lobe. We get some decent reduction in variance from that, though we're still testing (and talking with @shill-lucasfilm about possible further improvements):

image

A self-contained GLSL implementation of the sampling function is below, if you want to try it. Of course, since this is WIP, for this PR you probably will want to just use regular cosine-weighted sampling and leave this improvement for later. Just thought it worth mentioning.

vec4 sample_cosine_lobe(inout uint rndSeed)
{
    float rh = sqrt(rand(rndSeed));
    float phi_h = 2.0 * PI * rand(rndSeed);
    float x = rh * cos(phi_h); float y = rh * sin(phi_h);
    float z = sqrt(max(0.0, 1.0 - x*x - y*y));
    return vec4(x, y, z, abs(z) / PI);
}

vec4 sample_EON(in vec3 wo, float roughness, inout uint rndSeed)
{
    vec4 wh = sample_cosine_lobe(rndSeed);         // wh sample
    float sinO = sqrt(1.0 - wo.z*wo.z);            // for LTC elements
    float a = roughness*pow(sinO, 50.0)*0.5 + 1.0; // LTC M element a
    float b = roughness*pow(sinO, 4.0)*sqrt(wo.z); // LTC M element b
    vec3 wi = vec3(a*wh.x + b*wh.z, a*wh.y, wh.z); // wi = M wh
    float l = 1.0/length(wi);                      // 1/|M wh| = |M^-1 wi|
    wi *= l;                                       // normalize wi
    float inv_det = 1.0/(a*a);                     // |M^-1|
    float J = inv_det / (l*l*l);                   // |M^-1| / |M^-1 wi|^3
    float pdf_i = wh.w * J;                        // PDF D(wi), Eqn.1 LTC paper
    float phi_o = atan(wo.y, wo.x);                // find azimuth of wo plane
    float C = cos(phi_o); float S = sin(phi_o);    // 2d rot matrix
    mat2 R = mat2(C, S, -S, C);                    // 2d rot matrix
    wi = vec3(R*wi.xz, wi.z);                      // rotate wi
    return vec4(wi, pdf_i);
}

Am I right in thinking that there is an error in this example and that the line: wi = vec3(R*wi.xz, wi.z); // rotate wi should actually be: wi = vec3(R*wi.xy, wi.z); // rotate wi ? Desmos certainly thinks so: https://www.desmos.com/3d/smnslbwb5j Amongst seemingly other issues with the orientation.

etheory commented 2 weeks ago

OK, the second issue is that rotation matrix should be mat2(C, -S, S, C) instead, then it behaves as I'd expect.

I put the corrected code below:

vec4 sample_EON(in vec3 wo, float roughness, inout uint rndSeed)
{
    vec4 wh = sample_cosine_lobe(rndSeed);         // wh sample
    float sinO = sqrt(1.0 - wo.z*wo.z);            // for LTC elements
    float a = roughness*pow(sinO, 50.0)*0.5 + 1.0; // LTC M element a
    float b = roughness*pow(sinO, 4.0)*sqrt(wo.z); // LTC M element b
    vec3 wi = vec3(a*wh.x + b*wh.z, a*wh.y, wh.z); // wi = M wh
    float l = 1.0/length(wi);                      // 1/|M wh| = |M^-1 wi|
    wi *= l;                                       // normalize wi
    float inv_det = 1.0/(a*a);                     // |M^-1|
    float J = inv_det / (l*l*l);                   // |M^-1| / |M^-1 wi|^3
    float pdf_i = wh.w * J;                        // PDF D(wi), Eqn.1 LTC paper
    float phi_o = atan(wo.y, wo.x);                // find azimuth of wo plane
    float C = cos(phi_o); float S = sin(phi_o);    // 2d rot matrix
    mat2 R = mat2(C, -S, S, C);                    // 2d rot matrix
    wi = vec3(R*wi.xy, wi.z);                      // rotate wi
    return vec4(wi, pdf_i);
}

Here is a corrected Desmos spreadsheet that validates it: https://www.desmos.com/3d/grcc05isat

Now you can see that the model is now retroreflective, whereas previously the outgoing direction was incorrect.

etheory commented 2 weeks ago

I've additionally added the sampling PDF (blue) vs the BRDF (purple) in a new spreadsheet to make clear where the fit has issues:

https://www.desmos.com/3d/cv0qraxtzj

Super interesting thanks @portsmouth!

etheory commented 2 weeks ago

Here are some suggested changes to reduce the computational complexity:

vec4 sample_EON(in vec3 wo, float roughness, inout uint rndSeed)
{
    vec4 wh = sample_cosine_lobe(rndSeed);         // wh sample
    float cosO = wo.z;                                             // for LTC elements
    float one_minus_cosO2 = 1.f - sqr(cosO);
    float a = roughness*pow(one_minus_cosO2, 25.0)*0.5 + 1.0; // LTC M element a
    float b = roughness*sqr(one_minus_cosO2)*sqrt(wo.z); // LTC M element b
    vec3 wi = vec3(a*wh.x + b*wh.z, a*wh.y, wh.z); // wi = M wh
    float l = 1.0/length(wi);                                      // 1/|M wh| = |M^-1 wi|
    wi *= l;                                                               // normalize wi
    float inv_det = 1.0/(a*a);                                   // |M^-1|
    float J = inv_det / (l*l*l);                                    // |M^-1| / |M^-1 wi|^3
    float pdf_i = wh.w * J;                                       // PDF D(wi), Eqn.1 LTC paper
    float C, S;
    if (!(wo.x == 0.f && wo.y == 0.f))
    {
      float normxy = 1.f / sqrt(sqr(wo.x) + sqr(wo.y));
      C = normxy*wo.x; S = normxy*wo.y;              // 2d rot matrix
    }
    else
    {
        C = 1.f; S = 0.f;
    }
    mat2 R = mat2(C, -S, S, C);                    // 2d rot matrix
    wi = vec3(R*wi.xy, wi.z);                      // rotate wi
    return vec4(wi, pdf_i);
}

The relations I used are cos(Phi) == normxy * wo.x and sin(Phi) == normxy * wo.y Then pow(sinO, 50.0) is identical to pow(1.f - sqr(cosO), 25.0) hence pow(sinO, 4.0) is identical to sqr(1.f - sqr(cosO))

The result is identical but now you remove one square root, and remove a sin and cos calculation and a pow calculation. So it runs much much faster.

Next I want to fit better a and b values to the data that minimize the variance. Your fits are a good start but I think it should be possible to do better.

Updated desmos spreadsheet including the above changes: https://www.desmos.com/3d/4jprpxqznf

Also for anyone interested for eval, here is the way to compute the pdf from wo and wi, which completes the requirement for a full shader:

    void compute_a_and_b(const float cosWo, float& a, float& b) const
    {
      // const float sinWo = cos2sin(cosWo); // which is sinWo = sqrt(1.f - sqr(cosWo));
      const float one_minus_cosWo2 = max(0.f, 1.f - sqr(cosWo));
      // NOTE: pow(one_minus_cosWo2, 25.f) == pow(sinWo, 50.f)
      //  and: sqr(one_minus_cosWo2) == pow(one_minus_cosWo2, 2.f) == pow(sinWo, 4.f)
      a = roughness * 0.5f * pow(one_minus_cosWo2, 25.f) + 1.f;
      b = roughness * sqr(one_minus_cosWo2) * sqrt(cosWo);
    }

    // reverse engineer what the original wh was to get this wi
    // then proceed as we would have in the forwards direction
    // this is based on the Linearly Transformed Cosines paper: https://eheitzresearch.wordpress.com/415-2/
    // there it is used with a best fit in the a and b stretch directions to fit the Oren Nayar shape
    float pdfInternal(const float a, const float b, const Vec3f& wi) const
    {
      // transform wi into tangent space (depends on your renderer....)
      // inverse distort wi vector by LTC stretch values a, b
      const float inv_a = 1.f / a;
      // compute what the original wh would have been given wo (wrapped up in a and b) and wi
      const Vec3f wh = Vec3f(inv_a * (wi.x - b * wi.z), inv_a * wi.y, wi.z);
      // compute LTC jacobian
      const float l = length(wh);
      const float J = sqr(inv_a) / (l * l * l);
      // compute final pdf (wi.z = cosTheta)
      return INV_PI * wi.z * J;
    }

Which I derived by running the technique for sampling and pdf generation from @portsmouth backwards and solving the equations in reverse.

shill-lucasfilm commented 2 weeks ago

@etheory Indeed, I'd pointed out the .xz bug privately, and other improvements you'd mentioned (replacing trig for rotation, which also came up in the context of sheen; fusing the pow and sqr).

However, we've since come up with a different LTC fit and an overall improved importance sampling strategy. It's currently under review for publication as part of an "EON" writeup. Stay tuned!