DataDog / dd-sdk-android

Datadog SDK for Android (Compatible with Kotlin and Java)
Apache License 2.0
140 stars, 56 forks

ANR on `Rum.enable()` #1926

Open fchristysen opened 3 months ago

fchristysen commented 3 months ago

Stack trace

  at com.datadog.android.core.internal.persistence.file.advanced.FeatureFileOrchestrator.<init> (FeatureFileOrchestrator.java:47)
  at com.datadog.android.core.internal.SdkFeature.createFileStorage (SdkFeature.java:262)
  at com.datadog.android.core.internal.SdkFeature.prepareStorage (SdkFeature.java:235)
  at com.datadog.android.core.internal.SdkFeature.initialize (SdkFeature.java:80)
  at com.datadog.android.core.DatadogCore.registerFeature (DatadogCore.kt:134)
  at com.datadog.android.rum.Rum.enable (Rum.java:61)
  at com.datadog.android.rum.Rum.enable$default (Rum.java:35)
  at com.traveloka.android.datadog.DatadogUtil.initializeRUM (DatadogUtil.kt:106)
...

Reproduction steps

Initialize Datadog RUM by calling `Rum.enable(config)`.

Volume

0.9%

Affected SDK versions

2.5.1

Latest working SDK version

2.5.1

Does the crash manifest in the latest SDK version?

Yes

Kotlin / Java version

No response

Gradle / AGP version

No response

Other dependencies versions

No response

Device Information

No response

Other relevant information

No response

mariusc83 commented 3 months ago

Hi @fchristysen, thank you for taking the time to report this issue. I am afraid we will need more information from you to better understand what is going on. Could you please provide us with the following information:

Thanks!

fchristysen commented 3 months ago

Hi @mariusc83, thanks for the quick reply.

a complete stack trace of the ANR error

com.traveloka.android.datadog.DatadogUtil.initializeRUM
Input dispatching timed out (No focused window)
at com.datadog.android.rum.internal.RumFeature.createDataWriter (RumFeature.kt:241)
at com.datadog.android.rum.internal.RumFeature.onInitialize (RumFeature.kt:142)
at com.datadog.android.core.internal.SdkFeature.initialize (SdkFeature.java:89)
at com.datadog.android.core.DatadogCore.registerFeature (DatadogCore.kt:134)
at com.datadog.android.rum.Rum.enable (Rum.java:61)
at com.datadog.android.rum.Rum.enable$default (Rum.java:35)
at com.traveloka.android.datadog.DatadogUtil.initializeRUM (DatadogUtil.kt:107)
at com.traveloka.android.AppTravelokaApplicationListener.lambda$setupDatadogSDK$7 (AppTravelokaApplicationListener.java:272)
at rx.internal.util.ActionSubscriber.onNext (ActionSubscriber.java:39)
at rx.observers.SafeSubscriber.onNext (SafeSubscriber.java:134)
at rx.internal.operators.OperatorObserveOn$ObserveOnSubscriber.call (OperatorObserveOn.java:224)
at rx.android.schedulers.LooperScheduler$ScheduledAction.run (LooperScheduler.java:107)
at android.os.Handler.handleCallback (Handler.java:938)
at android.os.Handler.dispatchMessage (Handler.java:99)
at android.os.Looper.loop (Looper.java:264)
at android.app.ActivityThread.main (ActivityThread.java:8315)
at java.lang.reflect.Method.invoke (Native method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run (RuntimeInit.java:632)
at com.android.internal.os.ZygoteInit.main (ZygoteInit.java:1049)

how often this happens, which device models are affected, and is there a scenario through which you are able to reproduce the issue?

Based on our Play Store metrics, this affects 0.9% of our user sessions. We call `Rum.enable()` at app startup, in the `onCreate` of our Application class.

mariusc83 commented 3 months ago

Thank you for the stack trace and the related information. We will add a ticket on our end to investigate this and keep you updated on this thread.

0xnm commented 2 months ago

Hello @fchristysen! We have some additional questions to help us continue our investigation:

fchristysen commented 2 months ago

Hi @0xnm, sorry for the late reply.

mariusc83 commented 3 weeks ago

Hi @fchristysen, sorry for the delay. So far we have not been able to reproduce this issue on our end, and we do not see it in our metrics. Looking at the provided stack trace, I see that you are using RxAndroid and that you are initializing the Datadog SDK from an Rx scheduler. As far as I remember, stack traces in Rx can be scattered because the chain of operator calls spans multiple threads. I am wondering whether this is really caused by our SDK or by something else. Is the initialize method called on a `rx.android.schedulers.LooperScheduler` running on the UI thread?
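
One way to answer the threading question is to record the current thread from inside the initialization call. The following is a minimal plain-Java sketch of that idea, not code from the Datadog SDK or from RxJava. The names `InitThreadCheck`, `initializeRum`, and `runInitOn` are hypothetical, and a single-thread executor named `fake-main` stands in for a scheduler bound to the main looper. It only illustrates that work dispatched to an executor runs on that executor's thread, whatever thread submitted it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class InitThreadCheck {

    // Hypothetical stand-in for the app's initializeRUM(). It does not call
    // the Datadog SDK; it only reports the thread it is executed on.
    static String initializeRum() {
        return Thread.currentThread().getName();
    }

    // Dispatches the init call to the given executor and returns the name of
    // the thread that actually ran it.
    static String runInitOn(ExecutorService executor) throws Exception {
        Callable<String> task = InitThreadCheck::initializeRum;
        return executor.submit(task).get(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        // Assumption: "fake-main" plays the role of the UI thread here.
        ExecutorService fakeMain =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "fake-main"));
        ExecutorService background =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "background-init"));
        try {
            System.out.println("dispatched to fake main: " + runInitOn(fakeMain));
            System.out.println("dispatched to background: " + runInitOn(background));
        } finally {
            fakeMain.shutdown();
            background.shutdown();
        }
    }
}
```

In an Android app, the analogous check would be to compare `Looper.myLooper()` with `Looper.getMainLooper()` inside `initializeRUM` and log the result, which would show whether the call really runs on the UI thread.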