Esri / spatial-framework-for-hadoop

The Spatial Framework for Hadoop allows developers and data scientists to use the Hadoop data processing system for spatial data analysis.
Apache License 2.0

GeoJsonInputFormat with Hive-0.14 NullPointerException #99

Closed by randallwhitman 7 years ago

randallwhitman commented 8 years ago
$ hive  --hiveconf hive.root.logger=WARN,console
-- ...
hive> select rowid, st_astext(shape) from randall.test15gj2 limit 4;
OK
Failed with exception java.io.IOException:java.lang.NullPointerException
16/03/09 17:24:48 [main]: ERROR CliDriver: Failed with exception java.io.IOException:java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:663)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:561)
-- ...
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.mapred.TextInputFormat.isSplitable(TextInputFormat.java:49)
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:343)
        at com.esri.json.hadoop.UnenclosedGeoJsonInputFormat.getSplits(UnenclosedGeoJsonInputFormat.java:40)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:442)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:588)
        ... 15 more

Not seen with Hive-0.13. Note: Hive (both 0.13 and 0.14) uses the MRv1 API of the InputFormat. GeoJsonInputFormat had passed integration testing with a custom MapReduce application, but that test exercised the MRv2 API rather than MRv1.
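The trace above shows the NPE raised inside the MRv1 `TextInputFormat.isSplitable`. One mechanism consistent with that frame, offered as an assumption rather than a confirmed diagnosis: the MRv1 `TextInputFormat` builds its compression-codec lookup in `configure(JobConf)`, so `isSplitable` would dereference null on an instance that was never configured. Whether that is what happens under Hive-0.14 is not established in this thread. The sketch below is a plain-Java model of that pattern only; `BaseFormatModel`, `TextFormatModel`, and `ConfigureOrderSketch` are hypothetical stand-ins, not Hadoop or Hive classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a base format whose split check needs no setup.
class BaseFormatModel {
    boolean isSplitable(String file) {
        return true; // default answer, independent of any configuration
    }
}

// Hypothetical stand-in for a format that needs configure() before use.
class TextFormatModel extends BaseFormatModel {
    private Map<String, String> codecsBySuffix; // stays null until configure()

    void configure() {
        codecsBySuffix = new HashMap<>();
        codecsBySuffix.put(".gz", "gzip"); // illustrative entry only
    }

    @Override
    boolean isSplitable(String file) {
        int dot = file.lastIndexOf('.');
        String suffix = dot < 0 ? "" : file.substring(dot);
        // Throws NullPointerException if configure() was never called.
        return !codecsBySuffix.containsKey(suffix);
    }
}

class ConfigureOrderSketch {
    // Reports the split decision, or "NPE" if the format was not configured.
    static String probe(BaseFormatModel format, String file) {
        try {
            return String.valueOf(format.isSplitable(file));
        } catch (NullPointerException e) {
            return "NPE";
        }
    }

    public static void main(String[] args) {
        TextFormatModel unconfigured = new TextFormatModel();
        TextFormatModel configured = new TextFormatModel();
        configured.configure();

        System.out.println(probe(unconfigured, "data.json"));          // NPE
        System.out.println(probe(configured, "data.json"));            // true
        System.out.println(probe(new BaseFormatModel(), "data.json")); // true
    }
}
```

In this model the base class never fails because its split check holds no state, while the text variant fails only when the caller skips `configure()`.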

randallwhitman commented 8 years ago

I have a patch in development; I still need to prepare the commit.

randallwhitman commented 8 years ago

Hive-0.13: TextInputFormat => OK, FileInputFormat => OK.
Hive-0.14: TextInputFormat => NPE, FileInputFormat => OK.

Not sure whether to characterize this as a bug in UnenclosedGeoJsonInputFormat or as a bug in Hive-0.14; fortunately, the anonymous class extending FileInputFormat<K,V> works in both.
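The comment above credits an anonymous class extending `FileInputFormat<K,V>`, but the patch itself is not shown in this thread. As a rough illustration of that shape, the sketch below computes splits through an anonymous subclass of a generic base, so only the base class's split logic runs. It is a plain-Java model under stated assumptions: `FileFormatModel`, `GeoJsonFormatModel`, the `read` method, and the `#n` split labels are all hypothetical and do not reproduce the Hadoop API or the actual fix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a generic base format with stateless split logic.
abstract class FileFormatModel<K, V> {
    protected boolean isSplitable(String file) {
        return true; // no configuration needed to answer
    }

    // Produces `parts` labelled splits per splittable file, else one split.
    List<String> getSplits(List<String> files, int parts) {
        List<String> splits = new ArrayList<>();
        for (String file : files) {
            int n = isSplitable(file) ? parts : 1;
            for (int i = 0; i < n; i++) {
                splits.add(file + "#" + i);
            }
        }
        return splits;
    }

    abstract V read(K key, String split);
}

// Hypothetical format that borrows split computation from the base class.
class GeoJsonFormatModel {
    // Anonymous subclass: used only for getSplits, never for reading records.
    private final FileFormatModel<Long, String> splitter =
        new FileFormatModel<Long, String>() {
            @Override
            String read(Long key, String split) {
                throw new UnsupportedOperationException("used for splitting only");
            }
        };

    List<String> getSplits(List<String> files, int parts) {
        return splitter.getSplits(files, parts);
    }

    public static void main(String[] args) {
        GeoJsonFormatModel format = new GeoJsonFormatModel();
        System.out.println(format.getSplits(Arrays.asList("a.json"), 2)); // [a.json#0, a.json#1]
    }
}
```

Delegating to the base type keeps the split path free of any state that a more specialized subclass might expect the caller to initialize first.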