godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License
91.04k stars 21.18k forks source link

Simplex noise in the Spatial shader has an incorrect result #77325

Closed David-DiGioia closed 1 year ago

David-DiGioia commented 1 year ago

Godot version

4.0.3 stable

System information

Windows 10, gtx 1080, driver version 531.79 Vulkan backend.

Issue description

I am implementing 3d simplex noise in a spatial shader using this implementation https://github.com/ashima/webgl-noise/blob/master/src/noise3Dgrad.glsl

In shadertoy the implementation produces the expected result: image

But using almost identical code in a Godot shader (only modified to give functions unique names since Godot doesn't support shader function overloading) produces an incorrect result with splotches of the noise falling well outside the [-1, 1] range it is suppose to be in, some even falling outside [-10, 10]: image

Stepping through the shader code in shadertoy and Godot, the values begin to diverge at the line

  vec4 p = permute( permute( permute( 
             i.z + vec4(0.0, i1.z, i2.z, 1.0 ))
           + i.y + vec4(0.0, i1.y, i2.y, 1.0 )) 
           + i.x + vec4(0.0, i1.x, i2.x, 1.0 ));

Steps to reproduce

Create a new 3D scene, add a MeshInstance3D node as a child of the root. Give mesh instance node a new PlaneMesh (or any other mesh) and under Surface Material Override, add a new material by selecting New ShaderMaterial. Open the text editor of the shader material and paste the following shader:

shader_type spatial;
render_mode unshaded;

vec3 mod289(vec3 x) {
  return x - floor(x * (1.0 / 289.0)) * 289.0;
}

vec4 mod289vec4(vec4 x) {
  return x - floor(x * (1.0 / 289.0)) * 289.0;
}

vec4 permute(vec4 x) {
     return mod289vec4(((x*34.0)+10.0)*x);
}

vec4 taylorInvSqrt(vec4 r)
{
  return 1.79284291400159 - 0.85373472095314 * r;
}

float snoise(vec3 v, out vec3 gradient)
{
  const vec2  C = vec2(1.0/6.0, 1.0/3.0) ;
  const vec4  D = vec4(0.0, 0.5, 1.0, 2.0);

// First corner
  vec3 i  = floor(v + dot(v, C.yyy) );
  vec3 x0 =   v - i + dot(i, C.xxx) ;

// Other corners
  vec3 g = step(x0.yzx, x0.xyz);
  vec3 l = 1.0 - g;
  vec3 i1 = min( g.xyz, l.zxy );
  vec3 i2 = max( g.xyz, l.zxy );

  //   x0 = x0 - 0.0 + 0.0 * C.xxx;
  //   x1 = x0 - i1  + 1.0 * C.xxx;
  //   x2 = x0 - i2  + 2.0 * C.xxx;
  //   x3 = x0 - 1.0 + 3.0 * C.xxx;
  vec3 x1 = x0 - i1 + C.xxx;
  vec3 x2 = x0 - i2 + C.yyy; // 2.0*C.x = 1/3 = C.y
  vec3 x3 = x0 - D.yyy;      // -1.0+3.0*C.x = -0.5 = -D.y

// Permutations
  i = mod289(i); 
  vec4 p = permute( permute( permute( 
             i.z + vec4(0.0, i1.z, i2.z, 1.0 ))
           + i.y + vec4(0.0, i1.y, i2.y, 1.0 )) 
           + i.x + vec4(0.0, i1.x, i2.x, 1.0 ));

// Gradients: 7x7 points over a square, mapped onto an octahedron.
// The ring size 17*17 = 289 is close to a multiple of 49 (49*6 = 294)
  float n_ = 0.142857142857; // 1.0/7.0
  vec3  ns = n_ * D.wyz - D.xzx;

  vec4 j = p - 49.0 * floor(p * ns.z * ns.z);  //  mod(p,7*7)

  vec4 x_ = floor(j * ns.z);
  vec4 y_ = floor(j - 7.0 * x_ );    // mod(j,N)

  vec4 x = x_ *ns.x + ns.yyyy;
  vec4 y = y_ *ns.x + ns.yyyy;
  vec4 h = 1.0 - abs(x) - abs(y);

  vec4 b0 = vec4( x.xy, y.xy );
  vec4 b1 = vec4( x.zw, y.zw );

  //vec4 s0 = vec4(lessThan(b0,0.0))*2.0 - 1.0;
  //vec4 s1 = vec4(lessThan(b1,0.0))*2.0 - 1.0;
  vec4 s0 = floor(b0)*2.0 + 1.0;
  vec4 s1 = floor(b1)*2.0 + 1.0;
  vec4 sh = -step(h, vec4(0.0));

  vec4 a0 = b0.xzyw + s0.xzyw*sh.xxyy ;
  vec4 a1 = b1.xzyw + s1.xzyw*sh.zzww ;

  vec3 p0 = vec3(a0.xy,h.x);
  vec3 p1 = vec3(a0.zw,h.y);
  vec3 p2 = vec3(a1.xy,h.z);
  vec3 p3 = vec3(a1.zw,h.w);

//Normalise gradients
  vec4 norm = taylorInvSqrt(vec4(dot(p0,p0), dot(p1,p1), dot(p2, p2), dot(p3,p3)));
  p0 *= norm.x;
  p1 *= norm.y;
  p2 *= norm.z;
  p3 *= norm.w;

// Mix final noise value
  vec4 m = max(0.5 - vec4(dot(x0,x0), dot(x1,x1), dot(x2,x2), dot(x3,x3)), 0.0);
  vec4 m2 = m * m;
  vec4 m4 = m2 * m2;
  vec4 pdotx = vec4(dot(p0,x0), dot(p1,x1), dot(p2,x2), dot(p3,x3));

// Determine noise gradient
  vec4 temp = m2 * m * pdotx;
  gradient = -8.0 * (temp.x * x0 + temp.y * x1 + temp.z * x2 + temp.w * x3);
  gradient += m4.x * p0 + m4.y * p1 + m4.z * p2 + m4.w * p3;
  gradient *= 105.0;

  return 105.0 * dot(m4, pdotx);
}

float snoise01(vec3 position) {
    vec3 grad;
    float n = snoise(position, grad);
    return n * 0.5 + 0.5;
}

void fragment() {
    float n = snoise01(vec3(UV * 10.0, 0));
    ALBEDO = vec3(n);
}

To see the shadertoy comparison, go to https://www.shadertoy.com/new and paste the following code:

vec3 mod289(vec3 x) {
  return x - floor(x * (1.0 / 289.0)) * 289.0;
}

vec4 mod289(vec4 x) {
  return x - floor(x * (1.0 / 289.0)) * 289.0;
}

vec4 permute(vec4 x) {
     return mod289(((x*34.0)+10.0)*x);
}

vec4 taylorInvSqrt(vec4 r)
{
  return 1.79284291400159 - 0.85373472095314 * r;
}

float snoise(vec3 v, out vec3 gradient)
{
  const vec2  C = vec2(1.0/6.0, 1.0/3.0) ;
  const vec4  D = vec4(0.0, 0.5, 1.0, 2.0);

// First corner
  vec3 i  = floor(v + dot(v, C.yyy) );
  vec3 x0 =   v - i + dot(i, C.xxx) ;

// Other corners
  vec3 g = step(x0.yzx, x0.xyz);
  vec3 l = 1.0 - g;
  vec3 i1 = min( g.xyz, l.zxy );
  vec3 i2 = max( g.xyz, l.zxy );

  //   x0 = x0 - 0.0 + 0.0 * C.xxx;
  //   x1 = x0 - i1  + 1.0 * C.xxx;
  //   x2 = x0 - i2  + 2.0 * C.xxx;
  //   x3 = x0 - 1.0 + 3.0 * C.xxx;
  vec3 x1 = x0 - i1 + C.xxx;
  vec3 x2 = x0 - i2 + C.yyy; // 2.0*C.x = 1/3 = C.y
  vec3 x3 = x0 - D.yyy;      // -1.0+3.0*C.x = -0.5 = -D.y

// Permutations
  i = mod289(i); 
  vec4 p = permute( permute( permute( 
             i.z + vec4(0.0, i1.z, i2.z, 1.0 ))
           + i.y + vec4(0.0, i1.y, i2.y, 1.0 )) 
           + i.x + vec4(0.0, i1.x, i2.x, 1.0 ));

// Gradients: 7x7 points over a square, mapped onto an octahedron.
// The ring size 17*17 = 289 is close to a multiple of 49 (49*6 = 294)
  float n_ = 0.142857142857; // 1.0/7.0
  vec3  ns = n_ * D.wyz - D.xzx;

  vec4 j = p - 49.0 * floor(p * ns.z * ns.z);  //  mod(p,7*7)

  vec4 x_ = floor(j * ns.z);
  vec4 y_ = floor(j - 7.0 * x_ );    // mod(j,N)

  vec4 x = x_ *ns.x + ns.yyyy;
  vec4 y = y_ *ns.x + ns.yyyy;
  vec4 h = 1.0 - abs(x) - abs(y);

  vec4 b0 = vec4( x.xy, y.xy );
  vec4 b1 = vec4( x.zw, y.zw );

  //vec4 s0 = vec4(lessThan(b0,0.0))*2.0 - 1.0;
  //vec4 s1 = vec4(lessThan(b1,0.0))*2.0 - 1.0;
  vec4 s0 = floor(b0)*2.0 + 1.0;
  vec4 s1 = floor(b1)*2.0 + 1.0;
  vec4 sh = -step(h, vec4(0.0));

  vec4 a0 = b0.xzyw + s0.xzyw*sh.xxyy ;
  vec4 a1 = b1.xzyw + s1.xzyw*sh.zzww ;

  vec3 p0 = vec3(a0.xy,h.x);
  vec3 p1 = vec3(a0.zw,h.y);
  vec3 p2 = vec3(a1.xy,h.z);
  vec3 p3 = vec3(a1.zw,h.w);

//Normalise gradients
  vec4 norm = taylorInvSqrt(vec4(dot(p0,p0), dot(p1,p1), dot(p2, p2), dot(p3,p3)));
  p0 *= norm.x;
  p1 *= norm.y;
  p2 *= norm.z;
  p3 *= norm.w;

// Mix final noise value
  vec4 m = max(0.5 - vec4(dot(x0,x0), dot(x1,x1), dot(x2,x2), dot(x3,x3)), 0.0);
  vec4 m2 = m * m;
  vec4 m4 = m2 * m2;
  vec4 pdotx = vec4(dot(p0,x0), dot(p1,x1), dot(p2,x2), dot(p3,x3));

// Determine noise gradient
  vec4 temp = m2 * m * pdotx;
  gradient = -8.0 * (temp.x * x0 + temp.y * x1 + temp.z * x2 + temp.w * x3);
  gradient += m4.x * p0 + m4.y * p1 + m4.z * p2 + m4.w * p3;
  gradient *= 105.0;

  return 105.0 * dot(m4, pdotx);
}

float snoise01(vec3 position) {
    vec3 grad;
    float n = snoise(position, grad);
    return n * 0.5 + 0.5;
}

void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
    // Normalized pixel coordinates (from 0 to 1)
    vec2 uv = fragCoord/iResolution.xy;

    float n = snoise01(vec3(uv * 10.0, 0));

    fragColor = vec4(vec3(n), 1.0);
}

Minimal reproduction project

shader_bug.zip

Calinou commented 1 year ago

Shadertoy code runs in WebGL, while Godot shader code runs in native GLSL (on Vulkan or OpenGL). This means some differences are expected, especially with regards to floating-point precision.

David-DiGioia commented 1 year ago

I considered that but differences in floating point precision should not produce such stark differences in output. The splotches fall well outside the [-1, 1] range and even getting outside the [-10, 10] range, an order of magnitude difference from what's expected.

But there's no reason to speculate, I ran the same shader code in a custom Vulkan app and the result matches with the shadertoy result. Though I am using a different GPU right now (RX 6800) so when I get home I'll test it with my GTX 1080 to rule out it being an NVIDIA driver bug (possibly related to #67150). image

David-DiGioia commented 1 year ago

I ran the shader in the custom Vulkan app on my GTX 1080, and still get the expected result matching shadertoy. This suggest the problem is on Godot's side. image

celyk commented 1 year ago

I found that substituting n_ with 1.0/7.0 allowed for the correct result. Looking at the generated GLSL via "Inspect Native Shader Code" shows each float literal is being rounded prematurely.

generated GLSL ``` vec4 m_taylorInvSqrt(vec4 m_r) { return (1.79284-(0.853735*m_r)); } float m_snoise(vec3 m_v, out vec3 m_gradient) { const vec2 m_C=vec2((1.0/6.0), (1.0/3.0)); const vec4 m_D=vec4(0.0,0.5,1.0,2.0); vec3 m_i=floor((m_v+dot(m_v, m_C.yyy))); vec3 m_x0=((m_v-m_i)+dot(m_i, m_C.xxx)); vec3 m_g=step(m_x0.yzx, m_x0.xyz); vec3 m_l=(1.0-m_g); vec3 m_i1=min(m_g.xyz, m_l.zxy); vec3 m_i2=max(m_g.xyz, m_l.zxy); vec3 m_x1=((m_x0-m_i1)+m_C.xxx); vec3 m_x2=((m_x0-m_i2)+m_C.yyy); vec3 m_x3=(m_x0-m_D.yyy); m_i=m_mod289(m_i); vec4 m_p=m_permute(((m_permute(((m_permute((m_i.z+vec4(0.0, m_i1.z, m_i2.z, 1.0)))+m_i.y)+vec4(0.0, m_i1.y, m_i2.y, 1.0)))+m_i.x)+vec4(0.0, m_i1.x, m_i2.x, 1.0))); float m_n_=0.142857; vec3 m_ns=((m_n_*m_D.wyz)-m_D.xzx); vec4 m_j=(m_p-(49.0*floor(((m_p*m_ns.z)*m_ns.z)))); vec4 m_x_=floor((m_j*m_ns.z)); vec4 m_y_=floor((m_j-(7.0*m_x_))); vec4 m_x=((m_x_*m_ns.x)+m_ns.yyyy); vec4 m_y=((m_y_*m_ns.x)+m_ns.yyyy); vec4 m_h=((1.0-abs(m_x))-abs(m_y)); vec4 m_b0=vec4(m_x.xy, m_y.xy); vec4 m_b1=vec4(m_x.zw, m_y.zw); vec4 m_s0=((floor(m_b0)*2.0)+1.0); vec4 m_s1=((floor(m_b1)*2.0)+1.0); vec4 m_sh=-step(m_h, vec4(0.0,0.0,0.0,0.0)); vec4 m_a0=(m_b0.xzyw+(m_s0.xzyw*m_sh.xxyy)); vec4 m_a1=(m_b1.xzyw+(m_s1.xzyw*m_sh.zzww)); vec3 m_p0=vec3(m_a0.xy, m_h.x); vec3 m_p1=vec3(m_a0.zw, m_h.y); vec3 m_p2=vec3(m_a1.xy, m_h.z); vec3 m_p3=vec3(m_a1.zw, m_h.w); vec4 m_norm=m_taylorInvSqrt(vec4(dot(m_p0, m_p0), dot(m_p1, m_p1), dot(m_p2, m_p2), dot(m_p3, m_p3))); m_p0*=m_norm.x; m_p1*=m_norm.y; m_p2*=m_norm.z; m_p3*=m_norm.w; vec4 m_m=max((0.5-vec4(dot(m_x0, m_x0), dot(m_x1, m_x1), dot(m_x2, m_x2), dot(m_x3, m_x3))), 0.0); vec4 m_m2=(m_m*m_m); vec4 m_m4=(m_m2*m_m2); vec4 m_pdotx=vec4(dot(m_p0, m_x0), dot(m_p1, m_x1), dot(m_p2, m_x2), dot(m_p3, m_x3)); vec4 m_temp=((m_m2*m_m)*m_pdotx); m_gradient=(-8.0*((((m_temp.x*m_x0)+(m_temp.y*m_x1))+(m_temp.z*m_x2))+(m_temp.w*m_x3))); m_gradient+=((((m_m4.x*m_p0)+(m_m4.y*m_p1))+(m_m4.z*m_p2))+(m_m4.w*m_p3)); m_gradient*=105.0; return (105.0*dot(m_m4, m_pdotx)); } ```
AThousandShips commented 1 year ago

Yes float constants in the shader language are parsed as floats, not doubles or string constants, should be the source of the problem

https://github.com/godotengine/godot/blob/71ee65dc5701a0675ae6b1879a694a28c7206a63/servers/rendering/shader_language.h#L481-L488

David-DiGioia commented 1 year ago

@celyk Thanks for the workaround, that's really helpful! Inspecting native shader code seems very useful, where is the option to do that?

So surely this is a bug and should be marked as such right?

Zireael07 commented 1 year ago

@David-DiGioia I don't think it's a bug, I think it's the GLSL spec not being very clear

AThousandShips commented 1 year ago

From the spec floats are single precision, not sure where things go weird but it shouldn't by just rounding the values, though there is some situations with constants that might get different results when compiled due to instruction combining, could be improved by adding our own constant folding to shader compiler

David-DiGioia commented 1 year ago

@Zireael07 I disagree. The GLSL spec says

The precision of highp single- and double-precision floating-point variables is defined by the IEEE 754 standard for 32-bit and 64-bit floating-point numbers.

and the Godot documentation says that highp is the default precision level. Then the precision of n_ is highp and is defined by the IEEE 754 standard for 32-bit floating-point numbers, and according to this standard

0.142857142857 has the hex representation 0x3e124925 0.142857 has the hex representation 0x3e12491b

which do not match. Then it is Godot that is rounding the number and losing precision, not a limitation of 32 bit floats. And technicalities aside, it's a very poor user experience to paste GLSL code and get a different result in Godot than everywhere else that runs that code.

AThousandShips commented 1 year ago

Then that is the issue here, needs a fix somewhere for printing floating point numbers, will look into it

And technicalities aside

Now this is a technicality issue, we can't do more than follow the specification

David-DiGioia commented 1 year ago

This is not an issue of following the specification though, this is an issue with Godot's process of converting the Godot shader into GLSL.

The process as I understand it is: 1) I write a decimal representation of a float in Godot's shader (0.142857142857) 2) Godot parses this and converts it to a float (0x3e124925) 3) Godot creates a GLSL file where it prints this float as a truncated decimal constant (0.142857) 4) The GLSL compiler converts this decimal value into a binary representation (0x3e12491b)

This process of converting back and forth between decimal and binary loses precision and is made worse by the fact Godot's function to print the float is truncating it at 6 decimal places. Ideally Godot would just store the string "0.142857142857" and paste this into GLSL as is. But if not this then at least Godot should be printing maximum precision of the float it has stored.

AThousandShips commented 1 year ago

What I'm saying is that the technicalities are what matters, as I said there is an issue and I've identified it and will look into a fix for it, relating generally to the representation of floats

The reason for this is that rtoss prints 6 decimals only

Now just storing the string might not make sense or work, as it is not just a translator but does various other things afik

The issue is that the way floats are formatted is built in and kind of core and needs a bit of a larger approach and discussion