mrdoob / three.js

JavaScript 3D Library.
https://threejs.org/
MIT License
101.79k stars 35.31k forks source link

Version 168 accessing undefined variable in onContextRestore #29434

Open andrewhodel opened 2 days ago

andrewhodel commented 2 days ago

Description

https://github.com/mrdoob/three.js/blob/111670613f1c475ed6e9c2d15056f960f0ad6f1b/build/three.module.js#L29320

In three.module.min.js this code is minified and as shown in this screenshot:

Screenshot 2024-09-17 at 6 03 55 PM

info.autoReset or tt.autoReset is not accessible as info or tt is undefined.

Reproduction steps

  1. Add a group of meshes.
  2. Remove group.
  3. Readd group with slightly modified meshes, more until the context is taken.

Code

This is only a group of meshes displayed and removed repeatedly.

Once the WebGL context is lost, this error happens.

Live example

Screenshots

No response

Version

168

Device

Mobile

Browser

Safari

OS

iOS

andrewhodel commented 2 days ago

This is the whole console output when this happens, I can repeat it.

Screenshot 2024-09-17 at 6 34 57 PM

This is the minified code of the other error where a property of null is being accessed.

Screenshot 2024-09-17 at 6 35 44 PM
Mugen87 commented 2 days ago

I suspect the error happens because your system produces a WebGL context loss in the first place. The engine automatically tries to restore the context ( onContextRestore() ) but that fails since for some reasons the gl reference becomes null on your system. That means getShaderPrecisionFormat() fails which breaks the creation of a new info object. And that leads to the runtime error from your first screenshot in onContextRestore().

What console output do you get with the following fiddle? https://jsfiddle.net/jkod6qz2/

andrewhodel commented 1 day ago

@Mugen87 why are you suspecting it? I hate to hear that because people in the area I live say that about any thing and are always talking about how they persecute people through the processes of the product's work, then say they are suspecting it.

The real truth the majority of the time is that they are eventually unable to explain the difference.

You may not be that way, but the truth of this issue is that when the context is lost the three.js module should not continue trying to access variables and properties that are not available.

If there is a WebGL context loss, any running loop, event, interval or callback that attempts to access that info object should check with if (info === null) {}.

Why would you not fix that problem, the system is allowed to lose the WebGL context and the three.js module should not fail or lock the cpu when the WebGL context is lost.

Mugen87 commented 1 day ago

Would you please answer my question and tell us what console output you are getting on your system?

three.js accesses the WebGL context so many times that checks for null or undefined at every occasion would only slow down performance. However, if the gl reference is null right after the context lost, we maybe can throw an exception in onContextRestore().

andrewhodel commented 1 day ago

Would you please answer my question and tell us what console output you are getting on your system?

Next time it happens and the Safari Web Inspector doesn't close by itself before then, sure. It's a cable connected iPad and the console is on a laptop sitting next to it. There is no console on the iPad itself and the "remote" one closes without reporting why and preserve logs doesn't do anything if the iPad safari refreshes before I look at it.

three.js accesses the WebGL context so many times that checks for null or undefined at every occasion would only slow down performance. However, if the gl reference is null right after the context lost, we maybe can throw an exception in onContextRestore().

The checks for null and undefined are not a waste of time, the performance slow down that is real is why you put a try catch in the animation loop and that solves the problem but breaks performance while the context has not been lost.

That is why you need to use === instead of == because it does a type check first and that is very fast.

It's really about mapping out each possibility and testing for it, because that can be known but is much work.

I think you are only considering the word performance and not the actual amount of time taken by a === check, the check time difference is not measurable, even when using performance.now() in Javascript.

It would be a different truth if you were using typeof or Object.keys(object).length, but that is not what is happening here.

andrewhodel commented 1 day ago

Even if you Cross-origin isolate your site using the [Cross-Origin-Opener-Policy](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cross-Origin-Opener-Policy) and [Cross-Origin-Embedder-Policy](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cross-Origin-Embedder-Policy) headers to have a resolution of 5 microseconds with DOMHighResTimeStamp, you still don't have a measurable impact with a === check.

This you learn when you try to get hardware instruction counters and find out that they don't exist in Javascript. Javascript's performance.now() can return the same value from different invocations/executions. That is true of any monotonic clock by the definition of monotonic clock and it's why you need hardware instruction counters that are from assembly when you want a measurement or number that is between in time or of in count of instructions.

https://man7.org/linux/man-pages/man2/perf_event_open.2.html

andrewhodel commented 1 day ago

@Mugen87

I suspect the error happens because your system produces a WebGL context loss in the first place. The engine automatically tries to restore the context ( onContextRestore() ) but that fails since for some reasons the gl reference becomes null on your system. That means getShaderPrecisionFormat() fails which breaks the creation of a new info object. And that leads to the runtime error from your first screenshot in onContextRestore().

What console output do you get with the following fiddle? https://jsfiddle.net/jkod6qz2/

That fiddle does not remove and re-add renderer.domElement, does not modify the displayed meshes at regular intervals and the meshes don't have the density of the ones I am displaying. The fiddle does not cause an error.

The 2 invalid variable access errors I showed you already remain to happen when the three.js code removes and re-adds renderer.domElement, modifies the displayed meshes at regular intervals and the meshes have the density.

andrewhodel commented 1 day ago

@Mugen87

This is really all that's needed.

    function getMaxPrecision( precision ) {

                 if (gl === null) {
                   return null;
                 }

        if ( precision === 'highp' ) {

            if ( gl.getShaderPrecisionFormat( gl.VERTEX_SHADER, gl.HIGH_FLOAT ).precision > 0 &&
                gl.getShaderPrecisionFormat( gl.FRAGMENT_SHADER, gl.HIGH_FLOAT ).precision > 0 ) {

                return 'highp';

            }

            precision = 'mediump';

        }

        if ( precision === 'mediump' ) {

            if ( gl.getShaderPrecisionFormat( gl.VERTEX_SHADER, gl.MEDIUM_FLOAT ).precision > 0 &&
                gl.getShaderPrecisionFormat( gl.FRAGMENT_SHADER, gl.MEDIUM_FLOAT ).precision > 0 ) {

                return 'mediump';

            }

        }

        return 'lowp';

    }

getMaxPrecision is only used:

  1. https://github.com/mrdoob/three.js/blob/01359e00034058c3c1ab037daae97c670f8228ac/build/three.module.js#L15879
  2. https://github.com/mrdoob/three.js/blob/01359e00034058c3c1ab037daae97c670f8228ac/build/three.module.js#L20538

A return value of null is supported in both those code blocks.

andrewhodel commented 1 day ago

I made those changes to getMaxPrecision() in my local repository.

Once some days of testing pass, I'll update you with the results @Mugen87.

It is only happening on the iPad in Safari with the latest iOS 18 per the hardware I have available for testing.

andrewhodel commented 11 hours ago

@Mugen87

Into the second day now without refreshing the document, it sure has not happened again with the if (gl === null) {return null;} check.