akaita / RxJava2Debug

RxJava 2.x extension to provide meaningful Stack Traces
Apache License 2.0
657 stars 30 forks source link

NullPointerException in ExceptionUtils #2

Closed HappyDr0id closed 6 years ago

HappyDr0id commented 7 years ago

Hello,

I just got a NullPointerException crash with this stacktrace :

Fatal Exception: java.lang.NullPointerException: Attempt to invoke virtual method 'java.lang.Object java.lang.StackTraceElement[].clone()' on a null object reference
       at com.akaita.java.rxjava2debug.ExceptionUtils.collapseCauses(SourceFile:60)
       at com.akaita.java.rxjava2debug.ExceptionUtils.setRootCause(SourceFile:37)
       at com.akaita.java.rxjava2debug.RxJava2Debug.getEnhancedStackTrace(SourceFile:74)

It seems that there is an issue in the collapseCauses method of the ExceptionUtils class.

Thanks in advance for your answer !

akaita commented 6 years ago

Hi @HappyDr0id,

I see... The error actually makes sense. I'll get a fix ready sometime this weekend.

Thanks so much for the report!

akaita commented 6 years ago

Java specifies that getStackTrace()[ should always return an array](https://docs.oracle.com/javase/7/docs/api/java/lang/Throwable.html#getStackTrace()), even if the JVM provided no stacktrace. You have apparently stumbled upon a system that doesn't follow the spec. I'll add some protection around this.

Out of curiosity and to help others that might find the same issue, what system where you running on?

akaita commented 6 years ago

Release 1.2.1 should be able to handle the issue 👍

HappyDr0id commented 6 years ago

This crash was reproduced on a LG G4 on Android 6.0, thanks for your fix !

akaita commented 6 years ago

Thank you for the report!

ghost commented 6 years ago

Hello @akaita,

I am using the 1.2.1 of your library, which is supposed to have fixed this issue, but the NPE still occurs. Here is the stacktrace of the crash :

Fatal Exception: java.lang.NullPointerException: Attempt to invoke virtual method 'java.lang.Object java.lang.Object.clone()' on a null object reference
       at com.akaita.java.rxjava2debug.ExceptionUtils.collapseCauses(SourceFile:60)
       at com.akaita.java.rxjava2debug.ExceptionUtils.setRootCause(SourceFile:37)
       at com.akaita.java.rxjava2debug.RxJava2Debug.getEnhancedStackTrace(SourceFile:74)

It happens on a Moto X (2nd Gen) running Android 6 and LG G4 running Android 6 too.

It seems that the original issue wasn't that the array returned by .getStackTrace() was null, but rather that calling .getStackTrace() can result in a NPE depending on the JVM implementation.

akaita commented 6 years ago

Hi @mesnault ,

Mmm. I see some comments in StackOverflow about it, but no mention about it in any official doc... I'll have a better look at it.

Do you have any reference to validate the theory? Is the issue replicable in your device?

ghost commented 6 years ago

Unfortunately, I can't test the behavior of collapseCauses directly since the crash doesn't occur on devices I own, and I only saw the reports coming from Crashlytics.

Based on the Throwable.java source from Android, nothing wrong should ever happen as they return an empty array defined here if the stacktrace is null...

I also see crash reports coming from more devices:

So it's unlikely to be linked to a particular OEM modification. However, I see some stacktraces that differ from the two already described in this issue:

Fatal Exception: java.lang.NullPointerException
       at java.lang.Throwable.getStackTrace(Throwable.java:200)
       at com.akaita.java.rxjava2debug.ExceptionUtils.collapseCauses(SourceFile:60)
       at com.akaita.java.rxjava2debug.ExceptionUtils.setRootCause(SourceFile:37)
       at com.akaita.java.rxjava2debug.RxJava2Debug.getEnhancedStackTrace(SourceFile:74)

For further informations, the project is in Kotlin, and the initialization of your library is done in a application delegate with this order:

akaita commented 6 years ago

It is quite odd. I've been running the lib in over 1 million devices for months and I don't get such an error. In my setup there is no RxJavaPlugins handler initialisation, but I highly doubt it has anything to do with the crash.

I have a theory, but I need a couple of bits of info from you @mesnault :

  1. could you check whether those devices run Android 6?
  2. does your app include any native code?
  3. do any of the libraries you use include native code?
ghost commented 6 years ago

Thank you for taking the time to look into this.

If you don't have any trouble elsewhere, it must be linked to the configuration of the project, then. I double checked the configuration process but it follows your Readme.

The devices affected by the crash are running either 5.1.1 (5%), 6.0.1 (25%) or 4.4.4 (70%) Note that we don't have many data points behind those crash reports, as it is a small app, and we removed the library temporarily to avoid the crash.

Our app does not include native code directly. I think it might from one of those libraries, but I'm not sure:

I don't think other libraries used in the app have native, but I could be wrong. What are the implications that you are thinking of?

I might get my hands on one device that reproduces the issue at the start of next week, I'll keep you updated if I find anything.

akaita commented 6 years ago

I was thinking that the only chance for the NPE to happen would be for native code to return a null stack in Android 6 or earlier: https://android.googlesource.com/platform/libcore.git/+/android-6.0.1_r81/luni/src/main/java/java/lang/Throwable.java#264

The only difference I se between your setup and mine is that I don't use RxJavaPlugins, but I do use them in other projects with fewer users...

If you could send me the name of your app (through email, if you will), I'll be happy to try it in a Motorola Moto G with 4G LTE (2nd Gen) which might be close enough to replicate.

ghost commented 6 years ago

Nice catch, I totally missed the fact that the crash only occurs on versions prior to Android 6.

I was able to reproduce the issue on a device running 4.4.4. I can confirm that the crash occurs when .getStackTrace() is called on one of the causes in collapseCauses, nativeGetStackTrace(stackState) returns null, thus the clone fails with a NPE.

The collapseCauses method is called with this data:

[io.reactivex.rxkotlin.OnErrorNotImplementedException: MonitoringError(code=8, message=null), MonitoringError(code=8, message=null), hu.akarnokd.rxjava2.debug.RxJavaAssemblyException, hu.akarnokd.rxjava2.debug.RxJavaAssemblyException, java.lang.Throwable]

MonitoringError is a custom Exception we have, which only subclasses Exception.

getStackTrace() crashes while processing the last RxJavaAssemblyException of the array (the first RxJavaAssemblyException that gets processed).

This exception overrides the fillInStackTrace() method that is needed for nativeGetStackTrace() to not return null based on the implementation I found for Android 4.3 https://android.googlesource.com/platform/dalvik.git/+/android-4.3_r3/vm/native/java_lang_Throwable.cpp I did not find the code for nativeGetStackTrace() for the android-6.0.1_r81 platform, so it might not be the same between 4.3 and 6.0.1.

Here is a sample repository that reproduces the issue : https://github.com/mesnault/RxJava2Debug_Issue2

akaita commented 6 years ago

Good, we are in the good track. It feels like this is really about Android misbehaving, but RxJava2Debug should still handle it...

Now that is it clearly identified (thanks so much for the repo!), it is time to think about how to best solve the issue 🤔
I have a couple of alternatives in mind. Let me test them a bit.

akaita commented 6 years ago

I just released 1.2.2 with a fix.

@mesnault Thanks so much for your help. You were crucial in order to solve the issue 👏 👏 👏

ghost commented 6 years ago

Nice work, I'm glad that you fixed this potential crash in RxJava2Extensions as well :+1:

Thank you for your time and this library.