rust-mobile / ndk

Rust bindings to the Android NDK
Apache License 2.0
1.11k stars 110 forks source link

Callback safety #464

Open MarijnS95 opened 8 months ago

MarijnS95 commented 8 months ago

[!IMPORTANT] This report/analysis is unfinished.

In order to properly represent the semantics around callbacks registered to Android via NDK bindings, there are a few things we need to know in order to map them safely, and unfortunately the NDK documentation only barely ever specifies them to us. I've reported this upstream (https://issuetracker.google.com/issues/318944941) and will backlink to this issue, which I'll use to track what we need to know, and what we know for which callbacks.

This issue also serves as a guideline when implementing future APIs that contain callbacks.

What

Which callbacks

[!CAUTION] A bunch of obvious things are not yet filled out in this table. I want to post this before (trivially) testing a few things and adding more callback registration functions.

Name Thread Times called NULL clears callback Repeated call overwrites previous _delete() invalidates callback Cleared on failure Resulting type Track Box with object
AMediaCodec_setAsyncNotifyCallback() One NDK internal thread >0 Yes Probably? Yes Manual check: yes FnMut<Send> Yes
AMediaCodec_setOnFrameRenderedCallback() [^missing] One NDK internal thread ? Yes Probably? Yes ? FnMut<Send> Yes
AAudioStreamBuilder_setDataCallback() and AAudioStreamBuilder_setErrorCallback() "real-time thread, never simultaneously" ? ? ? ? - FnMut<Send> ?
AndroidBitmap_compress() Current thread, while function is called >1 - - - - FnMut No, drop immediately after call
ALooper_addFd() with a callback Looper thread (inferred) ? No [^removeFd] Yes for same FD No [^looperRefcount] ? FnMut for ThreadLooper, FnMut<Send> for ForeignLooper Can't [^looperRefcount]
AImageReader_setBufferRemovedListener() and AImageReader_setImageListener() Manual test: One internal thread >0 Manual test: yes Yes (manual test: race conditions?) Manual test: race conditions in concurrently-received callback! ? FnMut<Send> Probably?
AThermal_registerThermalStatusListener() Manual test: internal thread. Code: callback is guarded by a mutex >0 No; there's an unregister() call No; adding same callback+userdata is an error Yes, behind mutex No FnMut<Send> Probably, to free on Drop if user forgets to unregister?
AChoreographer_postFrameCallback(Delayed)(64)() Looper (current!) thread (don't make Choreographer Send/Sync) 0-1 ? No; each registration fires individually No No failure FnOnce No
AChoreographer_postVsyncCallback() Looper (current!) thread (don't make Choreographer Send/Sync) 0-1 ? No; each registration fires individually No No failure FnOnce No
AChoreographer_registerRefreshRateCallback() Looper (current) thread >0 ? No, assuming fn/userdata ptr changes? Our instance is just a reference No failure FnMut No, Choreographer is just a reference
ASurfaceTransaction_setOnComplete() (One?) NDK internal thread 0-1 ? No, each registration fires individually No No failure FnOnce<Send> No
ASurfaceTransaction_setOnCommit() (One?) NDK internal thread 0-1 ? No, each registration fires individually No No failure FnOnce<Send> No

[^missing]: Missing from the NDK crate [^removeFd]: NULL callback will register a regular non-callback event. It can be cleared via ALooper_removeFd(), but the callback might still run after removal under specific circumstances. [^looperRefcount]: Looper is refcounted (but what if the owning thread disappears or never calls poll()?). It can also be registered from any thread, even if poll() and callback invocations will only happen on the main thread.

ImageReader

This is a strange beast. No docs and it's very easy to trip up by deleting the object or changing the callback at runtime, while a callback may be coming in. I've already locally moved the overwriting of Boxes until after the setxxxListener() call so that we can't drop them too early, but it's still finicky. Not to mention some inconsistencies when dropping the NativeWindow, which we've previously acquired an extra reference on (docs say _release()ing should not be done, but no comment on first acquiring it...).

Also note that we don't mark the type as Send/Sync, but the both callbacks reference a to ImageReader, so it should be thead-safe?

References

454

455