medic / cht-android

A native Android container for Community Health Toolkit (CHT) applications
GNU Affero General Public License v3.0
25 stars 49 forks source link

Android wrapper crash logging in as Upgrade-Muso-Mali user #65

Open kennsippell opened 5 years ago

kennsippell commented 5 years ago
  1. Run android wrapper on Samsung S5 Neo. Custom build including https://github.com/medic/medic-android/commit/5893f24cdd5b024f4fb67e5108aa582a323d6703. Android 6.
  2. Configure to run with upgrade-muso-mali
  3. Login as user coumbadi

Expected: Login as user Observed: Crash

01-16 16:28:06.966  3310  3310 I chromium: [INFO:CONSOLE(0)] "Application Cache Progress event (32 of 32) ", source: https://upgrade-muso-mali.XXXX/medic/_design/medic/_rewrite/ (0)
01-16 16:28:06.967  3310  3310 I chromium: [INFO:CONSOLE(0)] "Application Cache Cached event", source: https://upgrade-muso-mali.XXXX/medic/_design/medic/_rewrite/ (0)
01-16 16:29:52.154  3310  3355 E chromium: [ERROR:connection.cc(1894)] Quota sqlite error 14, errno 0: unable to open database file, sql: UPDATE OriginInfoTable SET last_modified_time = ? WHERE origin = ? AND type = ?
01-16 16:29:52.166  3310  3384 D         : NULL == surf->write_back_color_buffer
01-16 16:29:52.168  3310  3384 D         : NULL == surf->write_back_color_buffer
01-16 16:29:52.168  3310  3384 E chromium: [ERROR:gles2_cmd_decoder.cc(13477)] Context lost because SwapBuffers failed.
01-16 16:29:52.168  3310  3384 E chromium: [ERROR:gles2_cmd_decoder.cc(5019)] Error: 5 for Command kSwapBuffers
01-16 16:29:52.172  3310  3384 E chromium: [ERROR:texture_manager.cc(2971)] [.Offscreen-MainThread-0xcd87f400.CmdBufferImageTransportFactory-0xca1ab700]GL ERROR :GL_OUT_OF_MEMORY : glTexImage2D: <- error from previous GL command
01-16 16:29:52.289  3310  3384 D mali_winsys: EGLint new_window_surface(egl_winsys_display*, void*, EGLSurface, EGLConfig, egl_winsys_surface**, egl_color_buffer_format*, EGLBoolean) returns 0x3000,  [1080x1848]-format:1
01-16 16:29:52.298  3310  3310 E chromium: [ERROR:compositor_impl_android.cc(585)] Failed to init OutputSurface for compositor.
01-16 16:29:52.298  3310  3310 E chromium: [ERROR:compositor_impl_android.cc(585)] Failed to init OutputSurface for compositor.
01-16 16:29:52.301  3310  3310 E chromium: Failed to read /proc/self/maps
01-16 16:29:52.301  3310  3310 F chromium: [FATAL:compositor_impl_android.cc(586)] Too many context creation failures. Giving up... 
01-16 16:29:52.301  3310  3310 F chromium: 

@derickl also has a repro of this

derickl commented 5 years ago
  1. Run (almost) latest container on Nokia 2.1 (Android 8.1 - Quad-core 1.4 GHz Cortex-A53, 1GB RAM, 8GB storage)
  2. Configure to run with upgrade-muso-mal
  3. Login as user coumbadi

Android Wrapper version 0.4.14 (almost certain) snapshot from GH

I get two crashes (one of which kills the container)

Crash 1 (similar to the original report, causes the app to exit)

01-16 20:05:26.812  1676  2560 E Parcel  : fcntl(F_DUPFD_CLOEXEC) failed in Parcel::read, i is 0, fds[i] is -1, fd_count is 1, error: Too many open files
01-16 20:05:26.857  1676  2560 W Adreno-EGLSUB: <DequeueBuffer:782>: dequeue native buffer fail: Invalid argument, buffer=0x0, handle=0x0
01-16 20:05:26.860  1676  2560 W Adreno-EGL: <qeglDrvAPI_eglSwapBuffers:3833>: EGL_BAD_SURFACE
01-16 20:05:26.872  1676  2560 E chromium: [ERROR:gles2_cmd_decoder.cc(13477)] Context lost because SwapBuffers failed.
01-16 20:05:26.875  1676  2560 E chromium: [ERROR:gles2_cmd_decoder.cc(5019)] Error: 5 for Command kSwapBuffers
01-16 20:05:26.879  1676  2560 E chromium: [ERROR:gles2_cmd_decoder.cc(4009)]   GLES2DecoderImpl: Trying to make lost context current.
01-16 20:05:27.596  1676  2560 W Adreno-EGL: <qeglDrvAPI_eglGetConfigAttrib:612>: EGL_BAD_ATTRIBUTE
01-16 20:05:27.795  1676  2560 W Adreno-ES20: <rb_perfcounter_gpu_disjoint_status:487>: open failed: errno 13
01-16 20:05:27.795  1676  2560 W Adreno-ES20: <rb_perfcounter_gpu_disjoint_status:487>: open failed: errno 13
01-16 20:05:27.792  1676  1676 W Chrome_InProcGp: type=1400 audit(0.0:13134): avc: denied { search } for name="kgsl-3d0" dev="sysfs" ino=21503 scontext=u:r:untrusted_app_25:s0:c512,c768 tcontext=u:object_r:sysfs_kgsl:s0 tclass=dir permissive=0
01-16 20:05:27.794  1676  1676 W Chrome_InProcGp: type=1400 audit(0.0:13135): avc: denied { search } for name="kgsl-3d0" dev="sysfs" ino=21503 scontext=u:r:untrusted_app_25:s0:c512,c768 tcontext=u:object_r:sysfs_kgsl:s0 tclass=dir permissive=0
01-16 20:05:27.815  1676  2560 W Adreno-ES20: <rb_perfcounter_gpu_disjoint_status:487>: open failed: errno 13
01-16 20:05:27.812  1676  1676 W Chrome_InProcGp: type=1400 audit(0.0:13136): avc: denied { search } for name="kgsl-3d0" dev="sysfs" ino=21503 scontext=u:r:untrusted_app_25:s0:c512,c768 tcontext=u:object_r:sysfs_kgsl:s0 tclass=dir permissive=0
01-16 20:05:27.829  1676  1676 E chromium: [ERROR:compositor_impl_android.cc(585)] Failed to init OutputSurface for compositor.
01-16 20:05:27.829  1676  1676 E chromium: [ERROR:compositor_impl_android.cc(585)] Failed to init OutputSurface for compositor.
01-16 20:05:27.848  1676  1676 E chromium: Failed to read /proc/self/maps
01-16 20:05:27.848  1676  1676 F chromium: [FATAL:compositor_impl_android.cc(586)] Too many context creation failures. Giving up...
01-16 20:05:27.848  1676  1676 F chromium:
01-16 20:05:27.987  1676  2499 E chromium: [ERROR:connection.cc(1894)] Quota sqlite error 14, errno 0: unable to open database file, sql: UPDATE OriginInfoTable SET used_count = ?, last_access_time = ? WHERE origin = ? AND type = ?
01-16 20:05:27.987  1676  2499 E chromium: [ERROR:connection.cc(1894)] Quota sqlite error 14, errno 0: unable to open database file, sql: UPDATE OriginInfoTable SET last_modified_time = ? WHERE origin = ? AND type = ?

Crash 2

01-16 19:55:52.412  1676  2560 W Adreno-GSL: <ioctl_kgsl_syncobj_create:3979>: (30, 8, 90492) fail 24 Too many open files
01-16 19:55:52.412  1676  2560 W Adreno-EGLSUB: <SwapBuffers:1433>: gsl_device_3d_add_fence_event failed
01-16 19:55:52.412  1676  2560 W Adreno-EGL: <qeglDrvAPI_eglSwapBuffers:3938>: EGL_BAD_SURFACE
01-16 19:55:52.429  1676  2560 W Adreno-GSL: <gsl_ldd_control:541>: ioctl fd 47 code 0xc0140933 (IOCTL_KGSL_TIMESTAMP_EVENT) failed: errno 24 Too many open files
01-16 19:55:52.429  1676  2560 W Adreno-GSL: <ioctl_kgsl_syncobj_create:3979>: (30, 8, 90497) fail 24 Too many open files
01-16 19:55:52.429  1676  2560 W Adreno-EGLSUB: <SwapBuffers:1433>: gsl_device_3d_add_fence_event failed
01-16 19:55:52.429  1676  2560 W Adreno-EGL: <qeglDrvAPI_eglSwapBuffers:3938>: EGL_BAD_SURFACE
01-16 19:55:52.933  1676  2560 E Surface : queueBuffer: error queuing buffer to SurfaceTexture, -22
01-16 19:55:52.933  1676  2560 W Adreno-EGLSUB: <SwapBuffers:1439>: failed to queueBuffer
01-16 19:55:52.933  1676  2560 W Adreno-EGL: <qeglDrvAPI_eglSwapBuffers:3938>: EGL_BAD_SURFACE
01-16 19:55:52.948  1676  2560 E Surface : queueBuffer: error queuing buffer to SurfaceTexture, -22
01-16 19:55:52.948  1676  2560 W Adreno-EGLSUB: <SwapBuffers:1439>: failed to queueBuffer
01-16 19:55:52.948  1676  2560 W Adreno-EGL: <qeglDrvAPI_eglSwapBuffers:3938>: EGL_BAD_SURFACE
01-16 19:55:52.963  1676  2560 W Adreno-GSL: <gsl_ldd_control:541>: ioctl fd 47 code 0xc0140933 (IOCTL_KGSL_TIMESTAMP_EVENT) failed: errno 24 Too many open files
01-16 19:55:52.963  1676  2560 W Adreno-GSL: <ioctl_kgsl_syncobj_create:3979>: (30, 8, 90657) fail 24 Too many open files
01-16 19:55:52.963  1676  2560 W Adreno-EGLSUB: <SwapBuffers:1433>: gsl_device_3d_add_fence_event failed
01-16 19:55:52.963  1676  2560 W Adreno-EGL: <qeglDrvAPI_eglSwapBuffers:3938>: EGL_BAD_SURFACE
01-16 19:55:52.981  1676  2560 W Adreno-GSL: <gsl_ldd_control:541>: ioctl fd 47 code 0xc0140933 (IOCTL_KGSL_TIMESTAMP_EVENT) failed: errno 24 Too many open files
01-16 19:55:52.981  1676  2560 W Adreno-GSL: <ioctl_kgsl_syncobj_create:3979>: (30, 8, 90662) fail 24 Too many open files
01-16 19:55:52.981  1676  2560 W Adreno-EGLSUB: <SwapBuffers:1433>: gsl_device_3d_add_fence_event failed
01-16 19:55:52.981  1676  2560 W Adreno-EGL: <qeglDrvAPI_eglSwapBuffers:3938>: EGL_BAD_SURFACE
01-16 19:55:54.271  1676  1676 I chromium: [INFO:CONSOLE(3)] "Error fetching ddoc from remote server", source: https://upgrade-muso-mali.XXXX/medic/_design/medic/_rewrite/js/inbox.js (3)

What was interesting is that the app picks up from where it left off a couple of minutes after Crash 2 but thereafter crashes and fully exits.

I'll attach the detailed log file on slack.

abbyad commented 5 years ago

What is the actual version of the Android wrapper app medic-android?

dianabarsan commented 5 years ago

I got a similar error back in November when we were doing perf investigations:

11-16 12:39:27.732 4371-4572/org.medicmobile.webapp.mobile E/[EGL-ERROR]: void __egl_platform_dequeue_buffer(egl_surface*):1780: failed to dequeue buffer from native window 0x5a0daca8; err = -16, buf = 0x0,max_allowed_dequeued_buffers 3
11-16 12:39:27.740 4371-4572/org.medicmobile.webapp.mobile E/chromium: [ERROR:gles2_cmd_decoder.cc(13477)] Context lost because SwapBuffers failed.
11-16 12:39:27.742 4371-4572/org.medicmobile.webapp.mobile E/chromium: [ERROR:gles2_cmd_decoder.cc(5019)] Error: 5 for Command kSwapBuffers
11-16 12:39:27.793 4371-4572/org.medicmobile.webapp.mobile E/Surface: dequeueBuffer: IGraphicBufferProducer::requestBuffer failed: -2147483646
11-16 12:39:27.793 4371-4572/org.medicmobile.webapp.mobile E/[EGL-ERROR]: void __egl_platform_dequeue_buffer(egl_surface*):1780: failed to dequeue buffer from native window 0x5a0daca8; err = -2147483646, buf = 0x0,max_allowed_dequeued_buffers 3
11-16 12:39:27.802 4371-4572/org.medicmobile.webapp.mobile E/libEGL: eglMakeCurrent:775 error 3003 (EGL_BAD_ALLOC)
11-16 12:39:27.802 4371-4572/org.medicmobile.webapp.mobile E/chromium: [ERROR:gpu_command_buffer_stub.cc(577)] Failed to make context current.
11-16 12:39:27.808 4371-4371/org.medicmobile.webapp.mobile E/chromium: [ERROR:compositor_impl_android.cc(585)] Failed to init OutputSurface for compositor.
11-16 12:39:27.815 4371-4572/org.medicmobile.webapp.mobile E/Surface: dequeueBuffer: IGraphicBufferProducer::requestBuffer failed: -2147483646
11-16 12:39:27.817 145-12849/? E/BufferQueue: [SurfaceView](this:0xb8b14d50,id:131,api:1,p:4371,c:145) setBufferCount: client owns some buffers
11-16 12:39:27.817 4371-4572/org.medicmobile.webapp.mobile E/Surface: IGraphicBufferProducer::setBufferCount(4) returned Invalid argument
11-16 12:39:27.825 4371-4572/org.medicmobile.webapp.mobile E/Surface: dequeueBuffer: IGraphicBufferProducer::requestBuffer failed: -2147483646
11-16 12:39:27.825 4371-4572/org.medicmobile.webapp.mobile E/[EGL-ERROR]: void __egl_platform_dequeue_buffer(egl_surface*):1780: failed to dequeue buffer from native window 0x5a0daca8; err = -2147483646, buf = 0x0,max_allowed_dequeued_buffers 3
11-16 12:39:27.835 4371-4572/org.medicmobile.webapp.mobile E/libEGL: eglMakeCurrent:775 error 3003 (EGL_BAD_ALLOC)
11-16 12:39:27.835 4371-4572/org.medicmobile.webapp.mobile E/chromium: [ERROR:gpu_command_buffer_stub.cc(577)] Failed to make context current.
11-16 12:39:27.838 4371-4371/org.medicmobile.webapp.mobile E/chromium: [ERROR:compositor_impl_android.cc(585)] Failed to init OutputSurface for compositor.
11-16 12:39:27.940 4371-4371/org.medicmobile.webapp.mobile A/chromium: [FATAL:compositor_impl_android.cc(586)] Too many context creation failures. Giving up... 
11-16 12:39:27.941 4371-4371/org.medicmobile.webapp.mobile A/libc: Fatal signal 6 (SIGABRT) at 0x00001113 (code=-6), thread 4371 (e.webapp.mobile)
11-16 12:39:30.386 776-888/system_process E/NativeCrashListener: Exception dealing with report
    libcore.io.ErrnoException: read failed: EAGAIN (Try again)
        at libcore.io.Posix.readBytes(Native Method)
        at libcore.io.Posix.read(Posix.java:128)
        at libcore.io.BlockGuardOs.read(BlockGuardOs.java:149)
        at com.android.server.am.NativeCrashListener.consumeNativeCrashData(NativeCrashListener.java:240)
        at com.android.server.am.NativeCrashListener.run(NativeCrashListener.java:138)
11-16 12:39:39.754 776-811/system_process E/InputDispatcher: channel '4229bf40 org.medicmobile.webapp.mobile/org.medicmobile.webapp.mobile.EmbeddedBrowserActivity (server)' ~ Channel is unrecoverably broken and will be disposed!

My app, however, never recovered from this error. I had to clear the data and start all over again. The error happened repeatedly and I tried repeatedly to resuscitate, to no avail.

The app was running on the Tecno Y4 (Android 4.4.2). I can't say what version of the wrapper I was running, I had built it myself from master some time before, but I don't know exactly when.

derickl commented 5 years ago

Updated the my comment to include the wrapper version number.

kennsippell commented 5 years ago

@abbyad My build included everything up to https://github.com/medic/medic-android/commit/5893f24cdd5b024f4fb67e5108aa582a323d6703. So latest HEAD -1 commit. Updated original bug report above.

garethbowen commented 5 years ago

My analysis thus far...

The android errors that Derick reported seem to have their root cause in Too many open files. I haven't seen this before and the only relevant mention I can find online is to do with SD cards which are apparently limited to 1024 open file handles system wide. Derick mentioned he was using an SD card so this is my main suspect. The workaround is to restart until it works, or don't use an SD card. I'm not too worried about this because I'm not aware of many projects using SD cards.

Derick went on to report javascript errors from the webapp which I summarise here:

01-16 22:54:42.640  3677  3677 I chromium: [INFO:CONSOLE(3)] "Uncaught InvalidStateError: Failed to execute 'index' on 'IDBObjectStore': The transaction has finished.", source: https://upgrade-muso-mali.app.medicmobile.org/medic/_design/medic/_rewrite/js/inbox.js (3)
  --- around 200 more errors just the same
01-16 22:54:44.258  3677  3677 I chromium: [INFO:CONSOLE(3)] "Database has a global failure", source: https://upgrade-muso-mali.app.medicmobile.org/medic/_design/medic/_rewrite/js/inbox.js (3)
...
01-16 22:54:44.605  3677  3677 I chromium: [INFO:CONSOLE(3)] "Uncaught (in promise) InvalidStateError: Failed to execute 'transaction' on 'IDBDatabase': The database connection is closing.", source: https://upgrade-muso-mali.app.medicmobile.org/medic/_design/medic/_rewrite/js/inbox.js (3)

The root of this appears to be the transaction aborting for some unknown reason. Our code doesn't abort so it's either a side effect or something the browser is doing. Unfortunately there's very little information about the underlying cause. Derick is attempting a refresh and will report back if that works around this issue.

derickl commented 5 years ago

Logging exited 5 min after I signed out for the day... (It hard to tell whether it went smoothly or recovered gracefully through it all)

garethbowen commented 2 years ago

This is not a blocker for 1.0.0