MobileNativeFoundation / Store

A Kotlin Multiplatform library for building network-resilient applications
https://mobilenativefoundation.github.io/Store/
Apache License 2.0
3.18k stars 202 forks source link

[BUG] ANR due to ConcurrentHashMap lock in `RealCache` #182

Open chrisbanes opened 4 years ago

chrisbanes commented 4 years ago

See the following thread stack. I'm using a pretty vanilla setup for Room + Store and streaming in ViewModels (i.e. collecting on the main thread):

viewModelScope.launch {
    store.stream(StoreRequest.cached(showId, refresh = false))
        .collect { ... }
}

This might be a guidance issue, where we tell people to stream(...).flowOn(Dispatcher.Default)

Log:

"main" prio=5 tid=1 Runnable
  | group="main" sCount=1 dsCount=0 flags=9 obj=0x717a02f8 self=0xb4000070b9eec380
  | sysTid=27737 nice=0 cgrp=default sched=0/0 handle=0x71e08094f8
  | state=R schedstat=( 45250255527 4245629451 61014 ) utm=4253 stm=271 core=4 HZ=100
  | stack=0x7ff1d62000-0x7ff1d64000 stackSize=8192KB
  | held mutexes= "mutator lock"(shared held)
  at java.util.concurrent.ConcurrentHashMap.replaceNode (ConcurrentHashMap.java:1116)
  at java.util.concurrent.ConcurrentHashMap.remove (ConcurrentHashMap.java:1107)
  at com.dropbox.android.external.cache4.RealCache.evictEntries (RealCache.java:228)
  at com.dropbox.android.external.cache4.RealCache.put (RealCache.java:151)
  at com.dropbox.android.external.store4.impl.RealStore$stream$2.invokeSuspend (RealStore.java:116)
  at com.dropbox.android.external.store4.impl.RealStore$stream$2.invoke (RealStore.java)
  at kotlinx.coroutines.flow.FlowKt__TransformKt$onEach$$inlined$unsafeTransform$1$2.emit (FlowKt__TransformKt.java:134)
  at kotlinx.coroutines.flow.internal.SafeCollectorKt$emitFun$1.invoke (SafeCollectorKt.java:15)
  at kotlinx.coroutines.flow.internal.SafeCollectorKt$emitFun$1.invoke (SafeCollectorKt.java:1)
  at kotlinx.coroutines.flow.internal.SafeCollector.emit (SafeCollector.java:73)
  at kotlinx.coroutines.flow.internal.SafeCollector.emit (SafeCollector.java:55)
  at kotlinx.coroutines.flow.internal.SafeCollectorKt$emitFun$1.invoke (SafeCollectorKt.java:15)
  at kotlinx.coroutines.flow.internal.SafeCollectorKt$emitFun$1.invoke (SafeCollectorKt.java:1)
  at kotlinx.coroutines.flow.internal.SafeCollector.emit (SafeCollector.java:73)
  at kotlinx.coroutines.flow.internal.SafeCollector.emit (SafeCollector.java:55)
  at com.dropbox.android.external.store4.impl.RealStore$diskNetworkCombined$$inlined$transform$1$1.emit (RealStore.java:152)
  at kotlinx.coroutines.flow.FlowKt__ChannelsKt.emitAllImpl$FlowKt__ChannelsKt (FlowKt__ChannelsKt.java:58)
  at kotlinx.coroutines.flow.FlowKt__ChannelsKt$emitAllImpl$1.invokeSuspend (FlowKt__ChannelsKt.java)
  at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith (BaseContinuationImpl.java:33)
  at kotlinx.coroutines.DispatchedTask.run (DispatchedTask.java:56)
  at kotlinx.coroutines.EventLoop.processUnconfinedEvent (EventLoop.java:69)
  at kotlinx.coroutines.CancellableContinuationImpl.parentCancelled$kotlinx_coroutines_core (CancellableContinuationImpl.java:184)
  at kotlinx.coroutines.DispatchedTaskKt.resumeUnconfined (DispatchedTaskKt.java:184)
  at kotlinx.coroutines.DispatchedTaskKt.dispatch (DispatchedTaskKt.java:108)
  at kotlinx.coroutines.CancellableContinuationImpl.dispatchResume (CancellableContinuationImpl.java:308)
  at kotlinx.coroutines.CancellableContinuationImpl.completeResume (CancellableContinuationImpl.java:395)
  at kotlinx.coroutines.channels.AbstractChannel$ReceiveElement.completeResumeReceive (AbstractChannel.java:872)
  at kotlinx.coroutines.channels.ArrayChannel.offerInternal (ArrayChannel.java:83)
  at kotlinx.coroutines.channels.AbstractSendChannel.send (AbstractSendChannel.java:133)
  at kotlinx.coroutines.channels.ChannelCoroutine.send$suspendImpl (ChannelCoroutine.java:1)
  at kotlinx.coroutines.channels.ChannelCoroutine.send (ChannelCoroutine.java:1)
  at kotlinx.coroutines.flow.internal.SendingCollector.emit (SendingCollector.java:19)
  at kotlinx.coroutines.flow.internal.SafeCollectorKt$emitFun$1.invoke (SafeCollectorKt.java:15)
  at kotlinx.coroutines.flow.internal.SafeCollectorKt$emitFun$1.invoke (SafeCollectorKt.java:1)
  at kotlinx.coroutines.flow.internal.SafeCollector.emit (SafeCollector.java:73)
  at kotlinx.coroutines.flow.internal.SafeCollector.emit (SafeCollector.java:55)
  at com.dropbox.android.external.store4.impl.SourceOfTruthWithBarrier$reader$1$invokeSuspend$$inlined$flatMapLatest$1$lambda$1$1.emit (SourceOfTruthWithBarrier.java:153)
  at kotlinx.coroutines.flow.internal.SafeCollectorKt$emitFun$1.invoke (SafeCollectorKt.java:15)
  at kotlinx.coroutines.flow.internal.SafeCollectorKt$emitFun$1.invoke (SafeCollectorKt.java:1)
  at kotlinx.coroutines.flow.internal.SafeCollector.emit (SafeCollector.java:73)
  at kotlinx.coroutines.flow.internal.SafeCollector.emit (SafeCollector.java:55)
  at androidx.room.CoroutinesRoom$Companion$createFlow$1$1$1.invokeSuspend (CoroutinesRoom.java:82)
  at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith (BaseContinuationImpl.java:33)
  at kotlinx.coroutines.DispatchedTask.run (DispatchedTask.java:56)
  at android.os.Handler.handleCallback (Handler.java:938)
  at android.os.Handler.dispatchMessage (Handler.java:99)
  at android.os.Looper.loop (Looper.java:223)
  at android.app.ActivityThread.main (ActivityThread.java:7656)
  at java.lang.reflect.Method.invoke (Method.java)
  at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run (RuntimeInit.java:592)
  at com.android.internal.os.ZygoteInit.main (ZygoteInit.java:947)

Smartphone:

digitalbuddha commented 4 years ago

Cc @ychescale9 any thoughts?

ychescale9 commented 4 years ago

The cache module is not aware of Android's main thread and the operations are blocking not suspended as the library doesn't have kotlinx.coroutines dependency.

If we want to avoid asking user to add a flowOn(coroutineContext) to the generated Flow, perhaps the RealStore could take an additional cacheDispatcher: CoroutineDispatacher = Dispatchers.Default constructor param so we can wrap memCache operations in the RealStore with withContext(cacheDispatcher).

We could also launch a new coroutine using the injected scope which by default is GlobalScope but I think that scope is dedicated for the Multicaster?

@yigit what do you think?

digitalbuddha commented 4 years ago

I think this is a documentation issue. My expectation is that a store is synchronous/blocking. We should document properly that a flowOn is needed if using from a main dispatcher. Thank you for the report Chris!

eyalgu commented 4 years ago

I'm not sure if this will solve the issue (as I'm not sure if this is a true deadlock or long running eviction triggering the ANR) but cache.put shouldn't be done on the user's scope but the store's scope as it's an internal store maintenance operation.

eyalgu commented 4 years ago

ah I forgot how hard flow would make this. I'll take a stab at it but it may not be worth the squeeze

digitalbuddha commented 4 years ago

@chrisbanes what is your expectation here? Do you expect store to always run on background and return result on caller thread?

Coming from rxworld I'd expect a withContext(iO()){store.stream(key)} if the execution needs to be on a certain context

eyalgu commented 4 years ago

I'm not sure I 100% agree here. Flow's context preservation + the face that store takes in a scope means we can move all our operations off the user's scope without requiring that the user modify the dispatcher

On Sun, Aug 9, 2020, 2:42 PM Mike Nakhimovich notifications@github.com wrote:

@chrisbanes https://github.com/chrisbanes what is your expectation here? Do you expect store to always run on background and return result on caller thread?

Coming from rxworld I'd expect a withContext(iO()){store.stream(key)} if the execution needs to be on a certain context

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/dropbox/Store/issues/182#issuecomment-671104452, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAMUYKL7K7UJG7OONQFZHJDR74J3HANCNFSM4OFTWDLA .

chrisbanes commented 4 years ago

I can see either way tbh. I worry a little bit about adding more magic config to each generated Store. I think just having some clear examples in the docs would be enough.