Samsung / GearVRf

The GearVR framework(GearVRf) is an Open Source VR rendering library for application development on VR-supported Android devices.
http://www.gearvrf.org
Apache License 2.0
407 stars 217 forks source link

GVRAsynchronousResourceLoader: async loader can end up in bad states #405

Closed danke-sra closed 8 years ago

danke-sra commented 8 years ago

Some future tasks (such as textures) do not exit from the normal exit points (loaded or failed). They result in undefined program behavior, corrupted throttler states, hanging threads and so on. And when that happens, new texture cannot be loaded any more. We need to fix the async loader, patch the issues and make it robust/recoverable.

mingsong commented 8 years ago

This is a good point as we have seen some problematic textures before. I will be working on this for the following time.

danke-sra commented 8 years ago

Ok. Please let me know when it is fixed. I am working on another texture-related issue. My patch will depend on this fix, and once this is done I can start to verify my fix.

One issue with the async loader is that it does not wait for the texture to be loaded when you call get() on a Future. I guess that needs to be fixed first. This is in the following class:

class FutureResource<T extends GVRHybridObject> {
    ...
        @Override
        public T get() throws InterruptedException, ExecutionException {
            return get(0);
        }
    ...
}

The get(0) does not return because a callback to the CancelableCallback<T> callback object is not guaranteed. I think the 0 may mean infinity time, but we need to either guarantee it gets a callback within a time limit or it times out by itself for robustness. Since the get() function does not return, the texture is forever unset (green). If I load the texture again, another hanging thread will be added to the thread pool. Please see if there is a way to fix that.

You can reproduce this issue using the DebugServer. It can be enabled by adding gvrContext.startDebugServer() to the onInit(...) method of any sample. Then, you can telnet <IP> 1645 to get the debug shell (cygwin / ubuntu is best, but windows works too). After that, you can do the following to load a model, and see if the async loader works:

gvrf> js
js> s = gvrf.getMainScene()
js> so = gvrf.loadModelFromURL("https://raw.githubusercontent.com/dankex/3dmodels/master/characters/R2D2/R2D2_png.dae")
js> s.addSceneObject(so)

Then, you can see the outline of the model (a robot) but texture is green. If needed, you can remove the model and repeat the above. The debug shell should be quite helpful in this case. :+1:

js> s.removeSceneObject(so)
mingsong commented 8 years ago

Ok, thanks for the suggestions, I will sync up with you whenever I need to verify the texture.

mingsong commented 8 years ago

The problem has pretty much been isolated in AsyncBitmapTexture::loadResource()->decodeStream, where the DecodeHelper may not be properly created based on a regular FileInputStream (e.g, through network TCP/IP connection rather than a AssetsInputStream which results in a OutOfMemoryError when trying to decode the bitmap through the inputStream. The loading thread is thus failed and the separate setting thread is been blocked forever without getting the decoded texture. So I am going to dig out and fix

  1. the DecodeHelper that failed the bitmap decoding and loading
  2. Handling of the blocking on a failed loading thread.
mingsong commented 8 years ago

Possible No 3 is to remove future wrapper completely of a texture without using a dedicate thread to wait until it is ready, and simply use the valid bit of texture in lower layer renderRenderData() to decide whether to render that texture. That is dependent on the texture constructor that if we can create an empty texture first with proper memory allocated, set it and return immediately, and then the loading process in another thread simply load into that memory area rather than create a new one, and finally the loading process will notify the rendering process that the texture finishes loading.

danke-sra commented 8 years ago

The Future class affects many kinds of resources. It is a bigger change that may need more time and investigation. If we can fix the decoder that will be good enough for this issue.

danke-sra commented 8 years ago

Is there a fix that can be tested?

liaxim commented 8 years ago

@danke-sra @mingsong Can we close this now?

mingsong commented 8 years ago

I think so.

danke-sra commented 8 years ago

Yeah, we can close it for now.

I think the issue doesn't happen so often after the fix, but is still possible. However, other issues can be handled separately.