AmbientRun / Ambient

The multiplayer game engine
https://ambient.run
Apache License 2.0
3.8k stars 124 forks source link

Golden image test: basics/decals fails randomly #457

Open ghost opened 1 year ago

ghost commented 1 year ago

Expected image:

screenshot

This test usually passes, but every once in a while this image gets rendered instead:

fail_screenshot

This started happening after #440, but only on Github Actions. On M2 Max the decals example works as expected.

FredrikNoren commented 1 year ago

@VinhTruongAmbient Hm that's so strange with the weird grey box in the background in the second screenshot. Seems like it's failing almost every time on main now

ghost commented 1 year ago

I was able to reproduce this on my local Docker container which runs LLVMPipe. Investigating.

ghost commented 1 year ago

As a test I removed all entities, except for the camera and the decal. The result looks like this:

fail_screenshot

The gray box looks to be oriented by the decal, which makes sense. Removing orientation yields:

fail_screenshot

ghost commented 1 year ago

There seem to be two separate issues:

  1. Our decals use CubeMesh to draw the context in which the decal is rendered. This CubeMesh is actually twice as large in every dimension. The shader expects a cube with min=[-0.5,-0.5,-0.5], max=[0.5,0.5,0.5] but our CubeMesh is actually min=[-1.0,-1.0,-1.0], max=[1.0,1.0,1.0]. 1.1. The above statement is incorrect because the transparent green cube uses the same cube as the decal. Something else however does double the size of the decal volume. 1.2. Actually, both extent=1 and extent=2 cubes are in play here. The green cube is the cube() component, which is extent=1 or the "unit cube". However, the decals code uses the regular "cube", which defaults to the extent=2 cube, but thinks it's using the extent=1 cube. This confusion of cubes caused the mismatch between the two.
  2. get_decal() in decal shader calls discard; if the projected point does not land inside the unit cube. This is incorrect because in this case the fragment never gets written. So the gray background just happens to be some uninitialized render target data.

Working on a fix.

ghost commented 1 year ago

The uninitialized render target explains the random behavior we have seen in the CI.

ghost commented 1 year ago

Looks like the discard theory doesn't tell the whole story. This bug actually happens on my Mac too, you just had to change the clear color from black to something else:

image

We should change our default clear color to something else than pitch black. Helps reveal bugs like this.

ghost commented 1 year ago

Fixed in https://github.com/AmbientRun/Ambient/pull/473.

A summary of the problem: