mrdoob / three.js

JavaScript 3D Library.
https://threejs.org/
MIT License

Depth pass unnecessarily low precision on Apple iPad Air 2 #9092

Closed bhouston closed 4 years ago

bhouston commented 8 years ago
Description of the problem

The issue is that the depth pass has unnecessarily low precision, even with RGBA packing, on some mobile devices, in particular the Apple iPad Air 2.

In this example, set it to use the depth RGBA output:

https://threejs.org/examples/#webgl_materials_channels

This is what it should look like (all channels are used):

image

This is what it ends up looking like on the Apple iPad Air 2 device (only the blue channel and, I think, the alpha channel are used):

img_0011

So basically, on lowp/mediump devices the depth RGBA pass, which is used by shadows and post effects, does not function properly.

Hardware

Apple iPad Air 2.

bhouston commented 8 years ago

Interesting thing, if I use the old mod approach to packing depth, I get a bunch more precision (at least a couple more bits):

const highp vec4 bit_shift = vec4( 256.0 * 256.0 * 256.0, 256.0 * 256.0, 256.0, 1.0 );
const highp vec4 bit_mask = vec4( 0.0, 1.0 / 256.0, 1.0 / 256.0, 1.0 / 256.0 );
highp vec4 res = mod( v * bit_shift * vec4( 255.0 ), vec4( 256.0 ) ) / vec4( 255.0 );
res -= res.xxyz * bit_mask;
return res;

The result when using the above packing method:

img_0012

mrdoob commented 8 years ago

/ping @tschw

tschw commented 8 years ago

highp is not guaranteed to be available and may cause a compile error.

Interesting thing, if I use the old mod approach to packing depth, I get a bunch more precision (at least a couple more bits):

Seems to be about numeric error. Unlike the decoder, the encoder wasn't broken, so I guess it's a good idea to do this revert, as the old implementation appears to be more robust under limited precision. However, I wonder whether there is a way to express it in a more readable way, somehow... Maybe a comment about 255 vs 256 will do - it's rather subtle...

cnspaha commented 6 years ago

any update on this?

bhouston commented 6 years ago

@cnspaha My recommendation is to not use RGBA encoded depth on iOS. Even the revert doesn't actually fix the issue.

cnspaha commented 6 years ago

Yeah, but then the whole SAOPass isn't usable as-is on iOS, is it? So the "solution" would be to copy it and modify the depth part of it until it works?

bhouston commented 6 years ago

I wrote the SAOPass, and yeah, we disable it by default on iOS. The reason is twofold: the depth RGBA encoding precision issues, and also that it is very slow.

bhouston commented 6 years ago

I believe the issue is that the shading calculations are done in FP16 on iOS, while our RGBA depth encoding assumes FP32 precision or similar. If you can rewrite the math to work at FP16 precision, it may work. I suspect it will not fill all RGBA channels, but you may get two bytes of depth precision. I believe the z-buffer on iOS is only 16 or 24 bits unsigned by default.

I look forward to further development here. It would be great to have a solution.

bhouston commented 6 years ago

@cnspaha Apparently SAO on Sketchfab works well on iOS: https://skfb.ly/6wMJ8 I wonder how they achieve accuracy in the depth pass?

Mugen87 commented 6 years ago

The Firefox shader console shows the following encode/decode functions. Could these be helpful for this issue?

float decodeFloatRGBA( vec4 rgba ) {
    return dot( rgba, vec4(1.0, 1.0/255.0, 1.0/65025.0, 1.0/16581375.0) );
}

vec4 encodeFloatRGBA( float v ) {
    vec4 enc = vec4(1.0, 255.0, 65025.0, 16581375.0) * v;
    enc = fract(enc);
    enc -= enc.yzww * vec4(1.0/255.0,1.0/255.0,1.0/255.0,0.0);
    return enc;
}

vec2 decodeHalfFloatRGBA( vec4 rgba ) {
    return vec2(rgba.x + (rgba.y / 255.0), rgba.z + (rgba.w / 255.0));
}

vec4 encodeHalfFloatRGBA( vec2 v ) {
    const vec2 bias = vec2(1.0 / 255.0, 0.0);
    vec4 enc;
    enc.xy = vec2(v.x, fract(v.x * 255.0));
    enc.xy = enc.xy - (enc.yy * bias);

    enc.zw = vec2(v.y, fract(v.y * 255.0));
    enc.zw = enc.zw - (enc.ww * bias);
    return enc;
}
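
These routines can be sanity-checked outside the shader. Below is a JavaScript sketch (not part of three.js; the names simply mirror the GLSL above) of encodeFloatRGBA/decodeFloatRGBA, with each channel additionally quantized to 8 bits the way an UnsignedByteType render target would store it:

```javascript
// JavaScript model of the GLSL encode/decode above; each channel is
// quantized to 8 bits to mimic storage in an UnsignedByteType target.
function encodeFloatRGBA( v ) {
  const scale = [ 1.0, 255.0, 65025.0, 16581375.0 ];
  const enc = scale.map( s => ( v * s ) % 1.0 ); // fract()
  // carry off the part the next channel stores (matches enc -= enc.yzww * ...)
  for ( let i = 0; i < 3; i ++ ) enc[ i ] -= enc[ i + 1 ] / 255.0;
  // an 8-bit render target rounds each channel to a multiple of 1/255
  return enc.map( c => Math.round( c * 255.0 ) / 255.0 );
}

function decodeFloatRGBA( rgba ) {
  const inv = [ 1.0, 1.0 / 255.0, 1.0 / 65025.0, 1.0 / 16581375.0 ];
  return rgba.reduce( ( sum, c, i ) => sum + c * inv[ i ], 0.0 );
}

// The round trip recovers the value far more accurately than 1/255:
const v = 0.123456789;
const err = Math.abs( decodeFloatRGBA( encodeFloatRGBA( v ) ) - v );
console.log( err < 1e-8 ); // true
```

The carry subtraction leaves the first three channels as exact multiples of 1/255, so only the last channel loses anything to 8-bit quantization.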

bhouston commented 6 years ago

Interesting. So I guess we have the option of FP32 into 4 bytes, FP24 into 3 bytes and FP16 into 2 bytes. I guess we should have the option to pick which precision we want? I wonder if we can make them compatible with each other, so that a value encoded at FP16 and unpacked at FP24 is still valid, just with reduced precision.

Mugen87 commented 6 years ago

decodeFloatRGBA() and encodeFloatRGBA() are similar to our unpackRGBAToDepth() and packDepthToRGBA() functions and I think I understand the parametrization. But why this signature?

vec4 encodeHalfFloatRGBA( vec2 v )

The given half float value is a vec2? Why not just a float?

bhouston commented 6 years ago

It is encoding two fp16 floats at once into a single vec4, and vice versa: (vec2 of fp16) <-> vec4.

Mugen87 commented 6 years ago

Ah, makes sense 😊. Thx!

bhouston commented 6 years ago

Here is some GLSL that will encode either 1, 2, 3 or 4 bytes of depth information (properly rounded to the requested precision), with a decode that is compatible with any of these precisions.


#define DEPTH_PACK_BYTES 2

float unpackRGBAToDepth( const in highp vec4 v ) {

  const highp vec4 unpackFactor = 1.0 / vec4( 1.0, 255.0, 255.0*255.0, 255.0*255.0*255.0 );
  highp float depth = dot( v.wzyx, unpackFactor );

  return depth;
}

vec4 packDepthToRGBA( in highp float v ) {

  const highp vec4 packFactor = vec4( 1.0, 255.0, 255.0*255.0, 255.0*255.0*255.0 );

  // round to the precision of the requested byte count
  #if DEPTH_PACK_BYTES == 4
    v += 0.5 / ( 255.0*255.0*255.0*255.0 );
  #elif DEPTH_PACK_BYTES == 3
    v += 0.5 / ( 255.0*255.0*255.0 );
  #elif DEPTH_PACK_BYTES == 2
    v += 0.5 / ( 255.0*255.0 );
  #elif DEPTH_PACK_BYTES == 1
    v += 0.5 / ( 255.0 );
  #endif

  highp vec4 res = fract( v * packFactor );
  res.xyz -= res.yzw * ( 1.0 / 255.0 );

  // zero out the unused channels
  #if DEPTH_PACK_BYTES == 3
    res.w = 0.0;
  #elif DEPTH_PACK_BYTES == 2
    res.zw = vec2( 0.0 );
  #elif DEPTH_PACK_BYTES == 1
    res.yzw = vec3( 0.0 );
  #endif

  res.xyzw = res.wzyx;

  return res;

}
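
As a numeric sanity check of the scheme above, here is a JavaScript model (the helper names are mine, not three.js API) of packing a depth value in [0, 1) into N base-255 digits, with a rounding offset of 0.5 / 255^N applied first:

```javascript
// Model of packing a depth value in [0, 1) into `bytes` base-255 digits,
// with a rounding offset of 0.5 / 255^bytes applied first.
function packDepth( v, bytes ) {
  v += 0.5 / Math.pow( 255, bytes ); // round to nearest at the chosen width
  const out = [];
  for ( let i = 0; i < bytes; i ++ ) {
    const digit = Math.floor( v * 255.0 ); // next base-255 digit
    out.push( digit );
    v = v * 255.0 - digit;                 // keep the remainder
  }
  return out;
}

// Decoding works for any digit count, just with reduced precision.
function unpackDepth( digits ) {
  let v = 0.0;
  for ( let i = digits.length - 1; i >= 0; i -- ) v = ( v + digits[ i ] ) / 255.0;
  return v;
}

// Two bytes recover the depth to within half a step of 1/255^2:
const d = 0.37;
const err = Math.abs( unpackDepth( packDepth( d, 2 ) ) - d );
console.log( err <= 0.5 / ( 255 * 255 ) ); // true
```

Because the decoder simply ignores missing low-order digits, a value packed at 2 bytes and unpacked with the 3- or 4-byte decoder is still valid, just coarser.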

Funny thing: I tested it on desktop and it produces acceptable results in SAO with 2, 3 and 4 bytes of depth precision. 1 byte of precision produces bad results.

But 2 bytes of precision on iOS produces artifacts. This suggests the issue is more complex: either something is overflowing during the packing, or an input value to the packing already has insufficient precision, or it is a precision issue inside SAO itself and this whole depth RGBA packing is a wild goose chase.

bhouston commented 6 years ago

I can confirm via the example webgl_materials_channels.html that the resulting depth buffer RGBA packing appears to be identical on iOS and desktop. This leads me to believe the issue isn't in the RGBA packing, but either in the precision of the input value (although this example suggests it is at least roughly right) or in the precision of something it is being compared against.

Currently we are encoding gl_FragCoord.z. Could there be precision issues with that? Maybe we could encode something else?

The issues seem to be more severe further away from the camera, less so close up.

I think gl_FragCoord.z is normalized between 0 and 1, and that its precision is biased towards the near plane. Maybe instead of this term we could take the distance between the near and far planes and normalize that directly, rather than using gl_FragCoord.z.

Basically ( ( vPosition.z - near ) / ( far - near ) ) = depth.
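
The difference between the two mappings is easy to see numerically. Here is a small JavaScript sketch (the helper names are mine, not three.js API) comparing standard perspective window depth with the linear mapping above:

```javascript
// Window-space depth produced by a standard perspective projection,
// for a positive view-space distance z and the given near/far planes.
function perspectiveDepth( z, near, far ) {
  const ndc = ( ( far + near ) - 2.0 * far * near / z ) / ( far - near ); // in [-1, 1]
  return ndc * 0.5 + 0.5; // in [0, 1]
}

// The linear alternative suggested above.
function linearDepth( z, near, far ) {
  return ( z - near ) / ( far - near );
}

const near = 0.1, far = 100.0;
const mid = ( near + far ) / 2.0; // halfway through the view volume

// Perspective depth has already used up nearly the whole [0, 1] range
// by the midpoint, leaving very little precision for the far half:
console.log( perspectiveDepth( mid, near, far ).toFixed( 4 ) ); // "0.9990"
console.log( linearDepth( mid, near, far ).toFixed( 4 ) );      // "0.5000"
```

This matches the observation above that the artifacts are more severe further from the camera.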

bhouston commented 6 years ago

The other solution is to use an adaptive bias in the SAO calculation that is less sensitive away from the camera to match the precision of the gl_FragCoord.z.

bhouston commented 6 years ago

I think that is now the core issue: a lack of agreement between the bias (which is constant across the near-far range) and gl_FragCoord.z (which has significantly more precision at the near plane). Thus there are two ways to fix it: make the bias adaptive, favoring the near plane, or make the precision of our depth buffer constant across the near-far range. Mathematically, it is easier to make the depth buffer precision fairly constant across the near-far range, so I'll try that.

bhouston commented 6 years ago

I've run out of time to explore this further, my apologies. I think the depth packing routines pasted above are pretty decent and I'd recommend their adoption in three.js. Maybe just optimize out that swizzle I do.

bhouston commented 6 years ago

I did some quick tests and found that the depth precision I am seeing on iOS is 1 full byte plus 3 bits of the next byte, i.e. 11 bits in total. The significand of an fp16 happens to be 11 bits: https://en.wikipedia.org/wiki/Half-precision_floating-point_format I am not sure if that is a coincidence, or whether it indicates that the core issue is that the gl_FragCoord.z value being written to the depth RGBA buffer is of limited precision.
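
The 11-bit figure can be reproduced with a quick fp16 model. This JavaScript sketch is a crude quantizer (it ignores subnormals, overflow and special values) that rounds a value to 10 explicit mantissa bits, i.e. 11 significant bits:

```javascript
// Crude fp16 quantizer: round x to 10 explicit mantissa bits
// (plus the implicit leading bit = 11 significant bits).
// Ignores subnormals, overflow and special values.
function toFloat16( x ) {
  if ( x === 0 ) return 0;
  const e = Math.floor( Math.log2( x ) );
  const scale = Math.pow( 2, 10 - e );
  return Math.round( x * scale ) / scale;
}

// Depth values in [0.5, 1) land on a grid with spacing 2^-11,
// which matches the "1 byte + 3 bits" precision observed on iOS:
const step = toFloat16( 0.87654321 + Math.pow( 2, -11 ) ) - toFloat16( 0.87654321 );
console.log( step === Math.pow( 2, -11 ) ); // true
```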

bhouston commented 6 years ago

According to the WebGL 1.0 cheat sheet here:

https://www.khronos.org/files/webgl/webgl-reference-card-1_0.pdf

The precision of gl_FragCoord is "mediump vec4 gl_FragCoord;", which would explain the issue: mediump on iOS is 16 bits.

Mugen87 commented 6 years ago

https://github.com/mrdoob/three.js/blob/707b44a18c30161efdd2023247af88a4bde8d302/src/renderers/shaders/ShaderLib/depth_frag.glsl#L37-L41

So instead of using gl_FragCoord.z at this place (which is of limited precision), we would calculate a custom (high precision) depth value, right?

bhouston commented 6 years ago

Well, if we can calculate the exact equivalent of gl_FragCoord.z using uniforms that are highp in the depth material, and use that manually calculated value instead of gl_FragCoord.z, it would fix our issue.

I believe we can create it like this:

// depth_vert.glsl
#include <common>
#include <packing>
#include <uv_pars_vertex>
#include <displacementmap_pars_vertex>
#include <morphtarget_pars_vertex>
#include <skinning_pars_vertex>
#include <logdepthbuf_pars_vertex>
#include <clipping_planes_pars_vertex>

varying vec4 clipSpacePosition;

void main() {

    #include <uv_vertex>

    #include <skinbase_vertex>

    #include <begin_vertex>
    #include <displacementmap_vertex>
    #include <morphtarget_vertex>
    #include <skinning_vertex>
    #include <project_vertex>
    #include <logdepthbuf_vertex>
    #include <clipping_planes_vertex>

    // https://stackoverflow.com/a/12904072
    //vec4 eye_space_pos = gl_ModelViewMatrix * /*something*/
    //vec4 clip_space_pos = gl_ProjectionMatrix * eye_space_pos;
    clipSpacePosition = projectionMatrix * mvPosition;
}
// depth_fragment.glsl
precision highp float;

#if DEPTH_PACKING == 3200

    uniform float opacity;

#endif

#include <common>
#include <packing>
#include <uv_pars_fragment>
#include <map_pars_fragment>
#include <alphamap_pars_fragment>
#include <logdepthbuf_pars_fragment>
#include <clipping_planes_pars_fragment>

varying vec4 clipSpacePosition;

void main() {

    #include <clipping_planes_fragment>

    vec4 diffuseColor = vec4( 1.0 );

    #if DEPTH_PACKING == 3200

        diffuseColor.a = opacity;

    #endif

    #include <map_fragment>
    #include <alphamap_fragment>
    #include <alphatest_fragment>

    #include <logdepthbuf_fragment>

    // https://stackoverflow.com/a/12904072
    float far = gl_DepthRange.far;
    float near = gl_DepthRange.near;
    float ndc_depth = clipSpacePosition.z / clipSpacePosition.w;

    float fragCoordZ = (((far-near) * ndc_depth) + near + far) / 2.0;

    #if DEPTH_PACKING == 3200

        gl_FragColor = vec4( vec3( fragCoordZ ), opacity );

    #elif DEPTH_PACKING == 3201

        gl_FragColor = packDepthToRGBA( fragCoordZ );

    #endif

}

If you can check my math, it would be appreciated.
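
One quick numeric check: with the default gl_DepthRange of [0, 1], the window-space formula in the fragment shader above reduces to the standard viewport transform ndc * 0.5 + 0.5 (windowDepth below is a hypothetical helper mirroring that shader line, not three.js API):

```javascript
// The mapping used in the proposed depth_fragment.glsl:
// fragCoordZ = ( ( far - near ) * ndc_depth + near + far ) / 2.0
function windowDepth( ndc, near, far ) {
  return ( ( far - near ) * ndc + near + far ) / 2.0;
}

// With the default gl_DepthRange of [0, 1] this is ndc * 0.5 + 0.5:
console.log( windowDepth( -1.0, 0.0, 1.0 ) ); // 0   (near plane)
console.log( windowDepth(  0.0, 0.0, 1.0 ) ); // 0.5
console.log( windowDepth(  1.0, 0.0, 1.0 ) ); // 1   (far plane)
```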

mrdoob commented 6 years ago

/cc @WestLangley

bhouston commented 6 years ago

Here is what the depth channel now looks like with my proposed fix above -- basically perfect at first glance:

unnamed 1

But it isn't actually perfect on iOS. I've created an interactive test which shows that the issue is mostly fixed, except for a few discontinuities in the depth render target on iOS.

I am sort of stumped. I created a branch here with my fix above: https://github.com/bhouston/three.js/tree/depth_precision_test

But this example demonstrates the new discontinuity issue. To make it very clear, I reconstruct the normals from the derivatives of the depth map, and on iOS you can clearly see the discontinuities.

@WestLangley @tschw are either of you up for a challenge? Anyone else?

https://rawgit.com/bhouston/three.js/depth_precision_test/examples/webgl_materials_depth_precision.html

Perfect results on desktop and all Android devices:

image

But on iOS (iPad Air 2) you can see the discontinuity as a line in the reconstructed normals from the depth render target (there are actually a few lines, at different depths):

img_0069

bhouston commented 6 years ago

If I change the depth encoding from the original base 256 to a modified version I created that is base 255, the artifact changes location on the mesh on iOS. In the code you can switch between them via the "USE_DEPTH_PACK_256" define in packing.glsl.

This leads me to think that our encoder/decoder is slightly incorrect, or at least suffers from precision issues.

bhouston commented 6 years ago

If I turn up the sensitivity to discontinuities, I see them everywhere in iOS's depth map:

unnamed 2

unnamed 3

And no sign of discontinuities on desktop at all.

bhouston commented 6 years ago

I took a quick look at Sketchfab, and they have similar banding artifacts in their SAO implementation on iOS, which I suspect have the same origin: imprecision in the depth map:

image

mrdoob commented 6 years ago

Thanks for investigating this @bhouston!

characteranimators commented 6 years ago

I'm having an issue that I'm not sure is related to this - check out https://stackoverflow.com/questions/51465267/threejs-shadow-artifact-ios . The confusing part is that the currently online three.js examples work for me on iPad (with shadows), but my code isn't working on iPad (desktop is fine). I'm using three.js r90.

mrdoob commented 6 years ago

@characteranimators Please, only post when you're sure the issue is related. For help requests use the forum instead.

characteranimators commented 6 years ago

@mrdoob sure, no problem. One thing: was this original issue from 2016 resolved? Or rather, have there been any code changes around depth packing / shadow maps etc. for iOS devices? This would give me some direction for further investigation. (Maybe this is a question for @bhouston?)

mrdoob commented 6 years ago

@characteranimators please, use the forum for help.

DanielSturk commented 5 years ago

I likely found the error causing the lines in this image:

36545831-d3a2995a-17b7-11e8-9733-6dd5a42b1a8b

Not coincidentally, I was working on SAO and got the same problem on iOS:

IMG_0009

After a lot of debugging, I found that on iOS (iPad 1), the values read from the type:UnsignedByteType buffer I'm writing to / reading from had an error of about 10% / 255. All I had to do was inject v = round( v * 255. ) / 255.; at the beginning of the depth unpacking to fix this (round(x) = floor(x + .5)):

IMG_0010

iOS uses 16-bit depth buffers, so they probably also use 16-bit floats when texture2D() converts from UnsignedByteType to float. 16-bit floats only have 10 stored bits of precision, or about 2 spare bits beyond the 8 in a channel, so it's not surprising these values are slightly off.
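
The effect of the suggested round() correction is easy to reproduce outside the shader. Assuming, per the description above, a readback error of roughly 10% of one 8-bit step (the helper name below is mine, not three.js API):

```javascript
// Snap a channel value back onto the 8-bit grid, as in the
// suggested fix: v = round( v * 255. ) / 255.
function snapToByteGrid( v ) {
  return Math.round( v * 255.0 ) / 255.0;
}

// Simulate a texture fetch that returns the stored byte value with
// ~10% of one step of error, as observed on the iPad:
const stored = 137.0 / 255.0;
const fetched = stored + 0.1 / 255.0;

console.log( fetched === stored );                   // false
console.log( snapToByteGrid( fetched ) === stored ); // true
```

Since the true values are always exact multiples of 1/255, any error smaller than half a step is removed entirely by the rounding.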

munrocket commented 5 years ago

Hi guys, I did some investigation of the same issue in #17935 and realized that this "tidy" algorithm is not robust under cyclic GPGPU calculations. In this codepen https://codepen.io/munrocket/pen/vYYvXaE?editors=0010 I created a float16 modification of packDepthToRGBA, and as you can see we get artifacts on the target devices. But with the "+0.5" approach I can solve this problem. Below I take the "tidy" algorithm that converts [0..1] -> RGBA and apply the "+0.5" approach to fix it.

“tidy” algorithm with "0.5" approach: https://munrocket.github.io/three.js/examples/webgl_materials_channels.html https://munrocket.github.io/three.js/examples/webgl_postprocessing_dof.html https://munrocket.github.io/three.js/examples/webgl_postprocessing_sao.html

Original version: https://threejs.org/examples/webgl_materials_channels.html https://threejs.org/examples/webgl_postprocessing_dof.html https://threejs.org/examples/webgl_postprocessing_sao.html

P.S. @bhouston can you check your examples? It seems we also need to create a one-dimensional GPGPU example.

P.P.S. Looks like @DanielSturk suggested the same fix, and the lines are gone.

munrocket commented 5 years ago

I reverted all the fixes with 0.5 rounding, because I don't have real evidence from the sao/dof/channels examples on my iPhone (GPGPU was fixed for me, but VSM was not). If you see line artifacts in examples that use packing, please share, and we will check this approach again.

munrocket commented 4 years ago

Can somebody solve this problem properly? Can we announce a prize for whoever solves the packing/unpacking problem? This question is important for GPGPU, machine learning and 3D physics.

Mugen87 commented 4 years ago

Fixed via #18696.