This is a minor, non-user-aware fix so I've not created a GtHub issue. A detailed explanation is provided in the next section.
Rationale for this change
I had this small patch in my local workspace for a while, I think it is useful for other comet developers as well. The changes are as follows:
Fix the library name when not loading from the JAR bundle. The documentation of System.loadLibrary states that "The libname argument must not contain any platform specific prefix, file extension or path", so the library name should be comet instead of libcomet.so or libcomet.dylib. This fix makes it easier to load the newly built comet library in the native/target directory when running Java tests.
Handle exceptions without error messages properly in native code. When there's an NPE in Scala code, the NPE exception raised by JVM may not have an error message. The native code will panic with the following error message:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0) (192.168.200.154 executor driver): org.apache.comet.CometNativeException: General execution error with reason: Null pointer in get_object_class.
at org.apache.comet.Native.executePlan(Native Method)
at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1(CometExecIterator.scala:107)
at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1$adapted(CometExecIterator.scala:106)
at org.apache.comet.vector.NativeUtil.getNextBatch(NativeUtil.scala:157)
at org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:106)
at org.apache.comet.CometExecIterator.hasNext(CometExecIterator.scala:118)
at org.apache.comet.CometExecIterator.next(CometExecIterator.scala:135)
It does not point to the source of the NPE, which makes troubleshooting difficult. This patch checks if the error message retrieved by the .getMessage JNI call is null and handles it specially. The error message becomes:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0) (192.168.200.154 executor driver): java.lang.NullPointerException
at org.apache.comet.CometNativeSuite$$anon$1.next(CometNativeSuite.scala:37)
at org.apache.comet.CometNativeSuite$$anon$1.next(CometNativeSuite.scala:35)
at org.apache.comet.CometBatchIterator.next(CometBatchIterator.java:56)
at org.apache.comet.Native.executePlan(Native Method)
at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1(CometExecIterator.scala:107)
at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1$adapted(CometExecIterator.scala:106)
at org.apache.comet.vector.NativeUtil.getNextBatch(NativeUtil.scala:157)
at org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:106)
at org.apache.comet.CometExecIterator.hasNext(CometExecIterator.scala:118)
at org.apache.comet.CometExecIterator.next(CometExecIterator.scala:135)
What changes are included in this PR?
This PR includes fixes for the minor problems mentioned above.
Which issue does this PR close?
This is a minor, non-user-aware fix so I've not created a GtHub issue. A detailed explanation is provided in the next section.
Rationale for this change
I had this small patch in my local workspace for a while, I think it is useful for other comet developers as well. The changes are as follows:
System.loadLibrary
states that "The libname argument must not contain any platform specific prefix, file extension or path", so the library name should becomet
instead oflibcomet.so
orlibcomet.dylib
. This fix makes it easier to load the newly built comet library in thenative/target
directory when running Java tests.It does not point to the source of the NPE, which makes troubleshooting difficult. This patch checks if the error message retrieved by the
.getMessage
JNI call is null and handles it specially. The error message becomes:What changes are included in this PR?
This PR includes fixes for the minor problems mentioned above.
How are these changes tested?
Add a unit test for problem 2.