npgall / mobility-rpc

Add Code Mobility to any application
Apache License 2.0
32 stars 12 forks source link

Exception stack printing problem #11

Open Adam5Wu opened 8 years ago

Adam5Wu commented 8 years ago

When a remote execution exception happened, the exception object transferred back will trigger a JDK NullPointerException when calling the printStackTrace() function.

This is because the getSuppressed() function had an assumption, that if there is no suppressed exception, the field "suppressedExceptions" should either be null, or identity equal to SUPPRESSED_SENTINEL, which is an statically defined empty array. And this field was initialized with SUPPRESSED_SENTINEL.

But when the exception got serialized, transported and deserialized, the deserialized exception's suppressedExceptions contains a reserialized empty array, which is NOT the same identity as the local SUPPRESSED_SENTINEL instance.

This causes getSuppressed() to think there are suppressed exceptions, and calls suppressedExceptions.toArray(), which triggers NPE on an empty array.

I think some special handling is needed for this particular field. Either: During serialization, detect the SUPPRESSED_SENTINEL, and serialize a null; Or: After deserialization, detect empty array, and replace with null or SUPPRESSED_SENTINEL.

npgall commented 8 years ago

Hi Zhenyu,

Can you provide a standalone test case which shows the problem, so I can take a look?

Thanks, Niall

On 26 Jan 2016, at 01:28, Zhenyu Wu notifications@github.com wrote:

When a remote execution happened, the exception transferred back will trigger a JDK NullPointerException when calling the printStackTrace() function

The problem is because the getSuppressed() function had an assumption, that if there is no suppressed exception, the field "suppressedExceptions" should either be null, or identity equal to SUPPRESSED_SENTINEL, which is an statically defined empty array And this field was initialized with SUPPRESSED_SENTINEL

But when the exception got serialized, transported and deserialized, the deserialized exception's suppressedExceptions contains a reserialized empty array, which is NOT the same identity as the local SUPPRESSED_SENTINEL instance

This causes getSuppressed() is think there is suppressed exception, and calls suppressedExceptionstoArray(), which triggered NPE on an empty array

I think some special handling is needed for this particular field Either: During serialization, detect the SUPPRESSED_SENTINEL, and serialize a null; Or: After deserialization, detect empty array, and replace with null or SUPPRESSED_SENTINEL

— Reply to this email directly or view it on GitHub https://github.com/npgall/mobility-rpc/issues/11.

Adam5Wu commented 8 years ago

Actually, I found another issue while preparing for the test case. Will open another ticket later. Here is the minimum test I can reproduce the error. (BTW using Java 1.8 u66)

public class TestClient {
    static class TestRun implements Runnable {
        @Override
        public void run() {
            throw new Error("Test");
        }
    };
    public static void main(String[] argv) {
        try {
            QuickTask.execute("LocalHost", new TestRun());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

The exception stack trace:

Exception in thread "main" java.lang.NullPointerException
    at java.lang.System.arraycopy(Native Method)
    at java.util.ArrayList.toArray(Unknown Source)
    at java.util.Collections$UnmodifiableCollection.toArray(Unknown Source)
    at java.lang.Throwable.getSuppressed(Unknown Source)
    at java.lang.Throwable.printEnclosedStackTrace(Unknown Source)
    at java.lang.Throwable.printStackTrace(Unknown Source)
    at java.lang.Throwable.printStackTrace(Unknown Source)
    at java.lang.Throwable.printStackTrace(Unknown Source)
    at TestClient.main(TestClient.java:19)
npgall commented 8 years ago

Any reason you throw new Error() here? Does this issue only affect Errors, or Exceptions as well?

Mobility-RPC uses Kryo for serialization, so anything Kryo can serialize, or not serialize will apply to Mobility-RPC too. Fortunately Kryo is very customizable and there are usually ways to work around such things.

If you get a chance (before I get to it) you could try to serialize and deserialize these throwables with Kryo directly (independent of Mobility-RPC). Then make a pull request or suggestions on how I could fix my use of Kryo in my KryoSerializer class. It would help! But I will try to get to this soon.

npgall commented 8 years ago

Discussion on Kryo mailing list, possibly related: https://groups.google.com/forum/#!topic/kryo-users/9xOngIaZdO4/discussion

Adam5Wu commented 8 years ago

The Error() is just a demo, and this problem affects all Exception classes.

I discovered this problem when I was testing on my other project. The original exception I had was ClassNotFound -- and that was my own fault. But I was caught on the NPE exception while trying to print the original CNF exception's stack trace.

I will check out that discussion, thanks for the pointer! :D

npgall commented 8 years ago

Yes I can imagine that getting the NPE would not help with troubleshooting! Thanks for creating this issue and for the test case, it's certainly something I want to fix.

Adam5Wu commented 8 years ago

I think that discussion kind of missed the point. The problem is not because that field didn't get serialized property -- it did serialize and de-serialize perfectly!

The problem is really context dependent. This particular field, suppressedExceptions, contains a special value, SUPPRESSED_SENTINEL, whose instance identity has special meaning to the JRE. And this special meaning didn't (and couldn't be) picked up by any generic serializer.

You can kind of picture the SUPPRESSED_SENTINEL sort of like an Enum element. Each element has one and only one existence in the entire JVM, so that operation such as A == B, and swith statement makes sense. If the deserializer creates a new instance of an enum element each time, then most logic with enum will fail.

However, in the case of SUPPRESSED_SENTINEL, there is no way a generic (de)serializer could figure out it is used like an enum, and therefore any empty list should be canonicalized as SUPPRESSED_SENTINEL. Only an intelligent human being, after reading the source code, could tell.

I guess the fix must resort to a case-by-case domain knowledge injection. Hopefully there isn't too many such cases in the JDK. ... :P