Open doublebyte1 opened 5 years ago
Thanks for reporting this. I assume "version 2.0.0" refers to Spatial Framework for Hadoop. Please let us know the versions of Hive and Hadoop.
@randallwhitman Hadoop 2.8.5, Hive 2.3.4
Thanks for the details. We do not have Hive-2.3.4 (nor Hadoop-2.8.5) installed, and unfortunately the testing framework has fallen behind - for which I went ahead and filed Esri/spatial-framework-for-hadoop#163. Maybe it will reproduce with another version of Hive or with SparkSql.
Ok. If it does not reproduce, let me know what is the highest version of Hadoop/Hive you have installed!
Under Spark-SQL-2.2.0 SELECT ST_Bin(0.001, ST_Point(0,0));
output a large integer without error.
I wonder if the exception has anything to do with data in a header row.
Note dropoff_latitude
is an argument to ST_Point
, not directly to ST_Bin
.
@doublebyte1 Are you able to try out the query on a copy of trip_data_1.csv
that omits the header row (first line), and see what happens?
@randallwhitman yes, I confirm that the error reproduces on a copy of trip_data_1.csv, without a header.
Under Spark-SQL-2.2.0
SELECT ST_Bin(0.001, ST_Point(0,0));
output a large integer without error.
The output of this query on hive 2.3.4. is:
hive> SELECT ST_Bin(0.001, ST_Point(0,0)); FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments '0': Argument 0 must be a number
Thanks for the additional details. That it reproduces with a minimal query could be very helpful - it might be possible to reproduce without a full installation.
I'm curious what Hive-2.3 will do on the following:
SELECT ST_Bin(1e-3, ST_Point(0,0));
SELECT ST_Bin(1.0-0.999, ST_Point(0,0));
SELECT ST_Bin(0.001, ST_Point("Point(0 0)"));
Hive-0.14/HDP-2.2.6 :
hive> SELECT ST_Bin(1e-3, ST_Point(0,0));
4611686015463124500
hive> SELECT ST_Bin(1.0-0.999, ST_Point(0,0));
4611686015463124500
hive> SELECT ST_Bin(0.001, ST_Point("Point(0 0)"));
4611686015463124500
Spark-SQL-2.2 :
spark-sql> SELECT ST_Bin(1e-3, ST_Point(0,0));
4611686015463124500
Time taken: 0.097 seconds, Fetched 1 row(s)
spark-sql> SELECT ST_Bin(1.0-0.999, ST_Point(0,0));
Time taken: 0.296 seconds, Fetched 1 row(s)
spark-sql> SELECT ST_Bin(0.001, ST_Point("Point(0 0)"));
4611686015463124500
Time taken: 0.085 seconds, Fetched 1 row(s)
Hive 2.3:
hive> SELECT ST_Bin(1e-3, ST_Point(0,0));
4611686015463124500
hive> SELECT ST_Bin(1.0-0.999, ST_Point(0,0));
FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments '0': Argument 0 must be a number
hive> SELECT ST_Bin(0.001, ST_Point("Point(0 0)"));
FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments '"Point(0 0)"': Argument 0 must be a number
hive>
Differing between 0.001
versus 1e-3
is really strange and makes we wonder if there's an issue with Hive's parser.
And/or maybe some new ObjectInspector
types have been introduced in Hive API.
and missing from - https://github.com/Esri/spatial-framework-for-hadoop/blob/master/hive/src/main/java/com/esri/hadoop/hive/ST_Bin.java#L37..L44
PrimitiveCategory.DECIMAL
maybe?
Per skimming https://hive.apache.org/javadocs/r2.3.5/api/org/apache/hadoop/hive/serde2/objectinspector/PrimitiveObjectInspector.PrimitiveCategory.html , https://hive.apache.org/javadocs/r2.3.5/api/org/apache/hadoop/hive/serde2/objectinspector/primitive/package-summary.html
@doublebyte1 Are you able to test the linked branch Esri/spatial-framework-for-hadoop#164 ?
@randallwhitman unfortunately, I am not working with Hadoop anymore. I am not able to build this branch.
Today I got Hive-2.3.5 to run, and reproduced:
hive> SELECT ST_Bin(1.0-0.999, ST_Point(0,0));
FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments '0': Argument 0 must be a number
I am trying to run the tutorial on this page. However, this instruction throws an error:
FROM (SELECT ST_Bin(0.001, ST_Point(dropoff_longitude,dropoff_latitude)) bin_id, *FROM taxi_demo) bins SELECT ST_BinEnvelope(0.001, bin_id) shape, COUNT(*) count GROUP BY bin_id;
FAILED: SemanticException [Error 10014]: Line 1:13 Wrong arguments 'dropoff_latitude': Argument 0 must be a number
It seems that it is throwing an error on the initialization of the object, but I cannot understand why.
Also tried replacin "0.001" by ".001", and got the same error. Using an integer ("1"), or the result of a division ("1/1000") works, but it gives me unexpected results. For instance, when I use:
FROM (SELECT ST_Bin(1/1000, ST_Point(dropoff_longitude,dropoff_latitude)) bin_id, *FROM taxi_demo) bins SELECT ST_BinEnvelope(1/1000, bin_id) shape, COUNT(*) count GROUP BY bin_id;
The coordinates of the resulting shape are invalid lat, long (EPSG:4326):
{"type":"Polygon","coordinates":[[[-0.20249999989755452,3210.3374999998377],[-0.20149999989755452,3210.3374999998377],[-0.20149999989755452,3210.338499999838],[-0.20249999989755452,3210.338499999838],[-0.20249999989755452,3210.3374999998377]]],"crs":null} 1.0 {"type":"Polygon","coordinates":[[[-40.75249999994412,1727.0164999998417],[-40.75149999994412,1727.0164999998417],[-40.75149999994412,1727.0174999998417],[-40.75249999994412,1727.0174999998417],[-40.75249999994412,1727.0164999998417]]],"crs":null} 1.0 {"type":"Polygon","coordinates":[[[-40.719499999890104,1651.5524999999218],[-40.718499999890106,1651.5524999999218],[-40.718499999890106,1651.5534999999218],[-40.719499999890104,1651.5534999999218],[-40.719499999890104,1651.5524999999218]]],"crs":null} 1.0 {"type":"Polygon","coordinates":[[[-73.98149999999441,400.7774999997821],[-73.98049999999441,400.7774999997821],[-73.98049999999441,400.77849999978207],[-73.98149999999441,400.77849999978207],[-73.98149999999441,400.7774999997821]]],"crs":null} 1.0 {"type":"Polygon","coordinates":[[[-0.0004999998491257429,74.0134999998156],[0.0005000001508742571,74.0134999998156],[0.0005000001508742571,74.0144999998156],[-0.0004999998491257429,74.0144999998156],[-0.0004999998491257429,74.0134999998156]]],"crs":null} 1.0 {"type":"Polygon","coordinates":[[[-0.0004999998491257429,73.98749999997764],[0.0005000001508742571,73.98749999997764],[0.0005000001508742571,73.98849999997765],[-0.0004999998491257429,73.98849999997765],[-0.0004999998491257429,73.98749999997764]]],"crs":null} 1.0 {"type":"Polygon","coordinates":[[[-0.10349999996833503,73.98249999985657],[-0.10249999996833503,73.98249999985657],[-0.10249999996833503,73.98349999985658],[-0.10349999996833503,73.98349999985658],[-0.10349999996833503,73.98249999985657]]],"crs":null} 1.0 {"type":"Polygon","coordinates":[[[-0.0004999998491257429,73.97849999989941],[0.0005000001508742571,73.97849999989941],[0.0005000001508742571,73.97949999989942],[-0.0004999998491257429,73.97949999989942],[-0.0004999998491257429,73.97849999989941]]],"crs":null} 3.0 {"type":"Polygon","coordinates":[[[-0.4684999999590218,73.97449999994225],[-0.4674999999590218,73.97449999994225],[-0.4674999999590218,73.97549999994226],[-0.4684999999590218,73.97549999994226],[-0.4684999999590218,73.97449999994225]]],"crs":null} 1.0 {"type":"Polygon","coordinates":[[[-0.0004999998491257429,73.96049999997578],[0.0005000001508742571,73.96049999997578],[0.0005000001508742571,73.96149999997579],[-0.0004999998491257429,73.96149999997579],[-0.0004999998491257429,73.96049999997578]]],"crs":null} 3.0
Version 2.0.0