Closed fpsunflower closed 4 months ago
@jstone-lucasfilm and @portsmouth may be interested in this.
This is great timing, thanks @fpsunflower, and I'm posting a link to my initial implementation of EON Diffuse in GLSL:
https://github.com/AcademySoftwareFoundation/MaterialX/pull/1822
Let's compare notes on these two implementations in upcoming days, and I'm happy to borrow elements from your version where we can improve alignment.
Needs clang-format but otherwise LGTM. I'm leaving it for others to check the math.
Oh, maybe update the docs where the material closures are described: src/doc/stdlib.md
@fpsunflower Two initial notes on differences between the current GLSL and OSL, where we should discuss the best approach on which to harmonize:
energy_compensation
, based on the naming convention I've seen for GGX energy compensation flags in some production renderers, while in OSL the new option is named energy_conservation
. I don't have a strong opinion on this, and am open to preferences from the group.energy_compensation sounds better. I've updated.
For the A/B terms, I'm following Jamie's recommendation to switch to the Fujii terms as well.
It is worth considering if it wouldn't just be simpler to have a second BSDF since what we are doing here is really more than just compensating the previous one.
I agree with "energy compensation" over "energy conservation".
As strictly speaking, the latter just means that energy is not gained. While the model actually ensures that energy is not lost in the white albedo case (i.e. energy preservation) -- via the compensation.
Note that we're working on improved importance sampling, using an LTC lobe. We get some decent reduction in variance from that, though we're still testing (and talking with @shill-lucasfilm about possible further improvements):
A self-contained GLSL implementation of the sampling function is below, if you want to try it. Of course, since this is WIP, for this PR you probably will want to just use regular cosine-weighted sampling and leave this improvement for later. Just thought it worth mentioning.
vec4 sample_cosine_lobe(inout uint rndSeed)
{
float rh = sqrt(rand(rndSeed));
float phi_h = 2.0 * PI * rand(rndSeed);
float x = rh * cos(phi_h); float y = rh * sin(phi_h);
float z = sqrt(max(0.0, 1.0 - x*x - y*y));
return vec4(x, y, z, abs(z) / PI);
}
vec4 sample_EON(in vec3 wo, float roughness, inout uint rndSeed)
{
vec4 wh = sample_cosine_lobe(rndSeed); // wh sample
float sinO = sqrt(1.0 - wo.z*wo.z); // for LTC elements
float a = roughness*pow(sinO, 50.0)*0.5 + 1.0; // LTC M element a
float b = roughness*pow(sinO, 4.0)*sqrt(wo.z); // LTC M element b
vec3 wi = vec3(a*wh.x + b*wh.z, a*wh.y, wh.z); // wi = M wh
float l = 1.0/length(wi); // 1/|M wh| = |M^-1 wi|
wi *= l; // normalize wi
float inv_det = 1.0/(a*a); // |M^-1|
float J = inv_det / (l*l*l); // |M^-1| / |M^-1 wi|^3
float pdf_i = wh.w * J; // PDF D(wi), Eqn.1 LTC paper
float phi_o = atan(wo.y, wo.x); // find azimuth of wo plane
float C = cos(phi_o); float S = sin(phi_o); // 2d rot matrix
mat2 R = mat2(C, S, -S, C); // 2d rot matrix
wi = vec3(R*wi.xz, wi.z); // rotate wi
return vec4(wi, pdf_i);
}
@fpsunflower @portsmouth I'm open to the idea that EON is a completely distinct diffuse model, especially if it uses an independent set of A and B terms, though I'd like to work through the additional infrastructure that this would require of renderers.
If we maintain independent A and B terms for EON and QON, I'm imagining that this would require independent directional albedo lookups for the two models as well (e.g. in the context of pre-integrated irradiance maps for real-time shading).
@fpsunflower @portsmouth I'm open to the idea that EON is a completely distinct diffuse model, especially if it uses an independent set of A and B terms, though I'd like to work through the additional infrastructure that this would require of renderers.
👍
If we maintain independent A and B terms for EON and QON, I'm imagining that this would require independent directional albedo lookups for the two models as well (e.g. in the context of pre-integrated irradiance maps for real-time shading).
Note that the directional albedo for EON is analytical, as given below in GLSL, so should be pretty easy to add:
const float pi = 3.14159265f;
const float constant1_FON = 0.5f - 2.0f / (3.0f * pi);
const float constant2_FON = 2.0f / 3.0f - 28.0f / (15.0f * pi);
// FON directional albedo (approx.)
float E_FON_approx(float mu, float roughness)
{
float sigma = roughness; // FON sigma prime
float mucomp = 1.0f - mu;
float mucomp2 = mucomp * mucomp;
const mat2 Gcoeffs = mat2(0.0571085289f, -0.332181442f,
0.491881867f, 0.0714429953f);
float GoverPi = dot(Gcoeffs * vec2(mucomp, mucomp2), vec2(1.0f, mucomp2));
return (1.0f + sigma * GoverPi) / (1.0f + constant1_FON * sigma);
}
// EON directional albedo (approx)
vec3 E_EON(vec3 rho, float roughness, vec3 wi_local)
{
float sigma = roughness; // FON sigma prime
float mu_i = wi_local.z; // input angle cos
float AF = 1.0f / (1.0f + constant1_FON * sigma); // FON A coeff.
float EF = E_FON_approx(mu_i, sigma); // FON wi albedo (approx)
float avgEF = AF * (1.0f + constant2_FON * sigma); // average albedo
vec3 rho_ms = (rho * rho) * avgEF / (vec3(1.0f) - rho * (1.0f - avgEF));
return rho * EF + rho_ms * (1.0f - EF);
}
@portsmouth This sounds reasonable to me, and I'll give this a try in MaterialX.
From a naming perspective, I'd lean towards continuing to use the name oren_nayar_diffuse_bsdf
for this shading model, following in the tradition of continuing to use the name generalized_schlick_bsdf
when Naty's F82 improvements were included.
I'd be interested in thoughts from others on the thread, and I'm open to counter-proposals for naming conventions if you have a compelling suggestion.
If it's going to be a single BSDF with non-energy-preserving (legacy) and energy-preserving modes, I think it's fine to keep the name as oren_nayar_diffuse_bsdf
(as all we're doing is tweaking/improving the original BSDF really, and the name is a famous one so no need to reinvent it).
If you want to make it a separate BSDF, then some other name is needed, but I think that's more confusing.
Maybe we should simply call it portsmouth_diffuse_brdf
? or portsmouth_fujii_diffuse_brdf
?
I would argue that tweaking fresnel is a different case than modifying the shape of the lobe. Also, IIRC the F82 change was mostly "backwards compatible" in that a default of white for F82 didn't change the look that much.
Here, there are basically 3 improvements we are taking:
In my implementation there weren't that many lines of code that could be shared in practice (even less so if we derive a custom sampling routine for this bsdf). So I think it probably would be cleanest to just make it a new brdf.
Aside: I like how everybody seems to be allowed to have just one BSDF named after them per lifetime. Choose wisely!
@fpsunflower @portsmouth I've updated the MaterialX GLSL implementation to use Fujii's constants and fixes when energy_compensation
is enabled. Take a look when you have a chance, and we can continue comparing notes on these two initial implementations of EON.
Hi all, I tried to look for the Slack channel where this discussion started (only just became aware of it in the OSL TSC) but here was a spreadsheet I made a while back that does the same thing, not sure if it's the same, but it's also closed-form and very simple: https://www.desmos.com/calculator/extaj6cnzr (and reciprocal).
@etheory it looks like your math aligns with what @portsmouth wrote up. Very cool :)
The MaterialX change was just merged, I will make a few final tweaks to handle rho
correctly (had a TODO about this in the code) and merge this one as well.
Note that we're working on improved importance sampling, using an LTC lobe. We get some decent reduction in variance from that, though we're still testing (and talking with @shill-lucasfilm about possible further improvements):
A self-contained GLSL implementation of the sampling function is below, if you want to try it. Of course, since this is WIP, for this PR you probably will want to just use regular cosine-weighted sampling and leave this improvement for later. Just thought it worth mentioning.
vec4 sample_cosine_lobe(inout uint rndSeed) { float rh = sqrt(rand(rndSeed)); float phi_h = 2.0 * PI * rand(rndSeed); float x = rh * cos(phi_h); float y = rh * sin(phi_h); float z = sqrt(max(0.0, 1.0 - x*x - y*y)); return vec4(x, y, z, abs(z) / PI); } vec4 sample_EON(in vec3 wo, float roughness, inout uint rndSeed) { vec4 wh = sample_cosine_lobe(rndSeed); // wh sample float sinO = sqrt(1.0 - wo.z*wo.z); // for LTC elements float a = roughness*pow(sinO, 50.0)*0.5 + 1.0; // LTC M element a float b = roughness*pow(sinO, 4.0)*sqrt(wo.z); // LTC M element b vec3 wi = vec3(a*wh.x + b*wh.z, a*wh.y, wh.z); // wi = M wh float l = 1.0/length(wi); // 1/|M wh| = |M^-1 wi| wi *= l; // normalize wi float inv_det = 1.0/(a*a); // |M^-1| float J = inv_det / (l*l*l); // |M^-1| / |M^-1 wi|^3 float pdf_i = wh.w * J; // PDF D(wi), Eqn.1 LTC paper float phi_o = atan(wo.y, wo.x); // find azimuth of wo plane float C = cos(phi_o); float S = sin(phi_o); // 2d rot matrix mat2 R = mat2(C, S, -S, C); // 2d rot matrix wi = vec3(R*wi.xz, wi.z); // rotate wi return vec4(wi, pdf_i); }
Am I right in thinking that there is an error in this example and that the line:
wi = vec3(R*wi.xz, wi.z); // rotate wi
should actually be:
wi = vec3(R*wi.xy, wi.z); // rotate wi
?
Desmos certainly thinks so: https://www.desmos.com/3d/smnslbwb5j
Amongst seemingly other issues with the orientation.
OK, the second issue is that rotation matrix should be mat2(C, -S, S, C) instead, then it behaves as I'd expect.
I put the corrected code below:
vec4 sample_EON(in vec3 wo, float roughness, inout uint rndSeed)
{
vec4 wh = sample_cosine_lobe(rndSeed); // wh sample
float sinO = sqrt(1.0 - wo.z*wo.z); // for LTC elements
float a = roughness*pow(sinO, 50.0)*0.5 + 1.0; // LTC M element a
float b = roughness*pow(sinO, 4.0)*sqrt(wo.z); // LTC M element b
vec3 wi = vec3(a*wh.x + b*wh.z, a*wh.y, wh.z); // wi = M wh
float l = 1.0/length(wi); // 1/|M wh| = |M^-1 wi|
wi *= l; // normalize wi
float inv_det = 1.0/(a*a); // |M^-1|
float J = inv_det / (l*l*l); // |M^-1| / |M^-1 wi|^3
float pdf_i = wh.w * J; // PDF D(wi), Eqn.1 LTC paper
float phi_o = atan(wo.y, wo.x); // find azimuth of wo plane
float C = cos(phi_o); float S = sin(phi_o); // 2d rot matrix
mat2 R = mat2(C, -S, S, C); // 2d rot matrix
wi = vec3(R*wi.xy, wi.z); // rotate wi
return vec4(wi, pdf_i);
}
Here is a corrected Desmos spreadsheet that validates it: https://www.desmos.com/3d/grcc05isat
Now you can see that the model is now retroreflective, whereas previously the outgoing direction was incorrect.
I've additionally added the sampling PDF (blue) vs the BRDF (purple) in a new spreadsheet to make clear where the fit has issues:
https://www.desmos.com/3d/cv0qraxtzj
Super interesting thanks @portsmouth!
Here are some suggested changes to reduce the computational complexity:
vec4 sample_EON(in vec3 wo, float roughness, inout uint rndSeed)
{
vec4 wh = sample_cosine_lobe(rndSeed); // wh sample
float cosO = wo.z; // for LTC elements
float one_minus_cosO2 = 1.f - sqr(cosO);
float a = roughness*pow(one_minus_cosO2, 25.0)*0.5 + 1.0; // LTC M element a
float b = roughness*sqr(one_minus_cosO2)*sqrt(wo.z); // LTC M element b
vec3 wi = vec3(a*wh.x + b*wh.z, a*wh.y, wh.z); // wi = M wh
float l = 1.0/length(wi); // 1/|M wh| = |M^-1 wi|
wi *= l; // normalize wi
float inv_det = 1.0/(a*a); // |M^-1|
float J = inv_det / (l*l*l); // |M^-1| / |M^-1 wi|^3
float pdf_i = wh.w * J; // PDF D(wi), Eqn.1 LTC paper
float C, S;
if (!(wo.x == 0.f && wo.y == 0.f))
{
float normxy = 1.f / sqrt(sqr(wo.x) + sqr(wo.y));
C = normxy*wo.x; S = normxy*wo.y; // 2d rot matrix
}
else
{
C = 1.f; S = 0.f;
}
mat2 R = mat2(C, -S, S, C); // 2d rot matrix
wi = vec3(R*wi.xy, wi.z); // rotate wi
return vec4(wi, pdf_i);
}
The relations I used are cos(Phi) == normxy * wo.x
and sin(Phi) == normxy * wo.y
Then pow(sinO, 50.0)
is identical to pow(1.f - sqr(cosO), 25.0)
hence
pow(sinO, 4.0)
is identical to sqr(1.f - sqr(cosO))
The result is identical but now you remove one square root, and remove a sin and cos calculation and a pow calculation. So it runs much much faster.
Next I want to fit better a and b values to the data that minimize the variance. Your fits are a good start but I think it should be possible to do better.
Updated desmos spreadsheet including the above changes: https://www.desmos.com/3d/4jprpxqznf
Also for anyone interested for eval, here is the way to compute the pdf from wo and wi, which completes the requirement for a full shader:
void compute_a_and_b(const float cosWo, float& a, float& b) const
{
// const float sinWo = cos2sin(cosWo); // which is sinWo = sqrt(1.f - sqr(cosWo));
const float one_minus_cosWo2 = max(0.f, 1.f - sqr(cosWo));
// NOTE: pow(one_minus_cosWo2, 25.f) == pow(sinWo, 50.f)
// and: sqr(one_minus_cosWo2) == pow(one_minus_cosWo2, 2.f) == pow(sinWo, 4.f)
a = roughness * 0.5f * pow(one_minus_cosWo2, 25.f) + 1.f;
b = roughness * sqr(one_minus_cosWo2) * sqrt(cosWo);
}
// reverse engineer what the original wh was to get this wi
// then proceed as we would have in the forwards direction
// this is based on the Linearly Transformed Cosines paper: https://eheitzresearch.wordpress.com/415-2/
// there it is used with a best fit in the a and b stretch directions to fit the Oren Nayar shape
float pdfInternal(const float a, const float b, const Vec3f& wi) const
{
// transform wi into tangent space (depends on your renderer....)
// inverse distort wi vector by LTC stretch values a, b
const float inv_a = 1.f / a;
// compute what the original wh would have been given wo (wrapped up in a and b) and wi
const Vec3f wh = Vec3f(inv_a * (wi.x - b * wi.z), inv_a * wi.y, wi.z);
// compute LTC jacobian
const float l = length(wh);
const float J = sqr(inv_a) / (l * l * l);
// compute final pdf (wi.z = cosTheta)
return INV_PI * wi.z * J;
}
Which I derived by running the technique for sampling and pdf generation from @portsmouth backwards and solving the equations in reverse.
@etheory Indeed, I'd pointed out the .xz
bug privately, and other improvements you'd mentioned (replacing trig for rotation, which also came up in the context of sheen; fusing the pow
and sqr
).
However, we've since come up with a different LTC fit and an overall improved importance sampling strategy. It's currently under review for publication as part of an "EON" writeup. Stay tuned!
Description
This implements the "EON" energy conserving flavor of Oren-Nayar proposed to be part of OpenPBR and MaterialX. I've added it to the existing closure for now, but it would be easy to split it off as its own closure (the amount of shared logic is not that much, and we can easily use the same parameter struct). This is mainly to get the ball rolling on implementation.
Tests
I've tweaked the existing furnace test to test both variants. As expected, the new closure passes the furnace test prefectly while the old one had some issues.
Checklist: