ddf-project / DDF

Distributed DataFrame: Productivity = Power x Simplicity For Scientists & Engineers, on any Data Engine
http://ddf.io
Apache License 2.0

bin/pyddf examples/basics.py fails #64

Open josephwinston opened 9 years ago

josephwinston commented 9 years ago

Running bin/pyddf examples/basics.py fails in ddf.getFiveNumSummary() with the exception: This UDAF does not support the deprecated getEvaluator() method.

System:

$ uname -a
Darwin jw-macbook-pro 14.3.0 Darwin Kernel Version 14.3.0: Thu Feb 12 18:38:33 PST 2015; root:xnu-2782.20.34~3/RELEASE_X86_64 x86_64
$ python --version
Python 2.7.9

Traceback:

Traceback (most recent call last):
  File "examples/basics.py", line 25, in <module>
    ddf.getFiveNumSummary()
  File "/Users/jbw/work/CWP/DDF/python/package/ddf/DDF.py", line 69, in getFiveNumSummary
    return self._jddf.getFiveNumSummary()
  File "/Users/jbw/work/CWP/DDF/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
  File "/Users/jbw/work/CWP/DDF/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o5.getFiveNumSummary.
: io.ddf.exception.DDFException: Unable to get fivenum summary of the given columns from table SparkDDF_spark_8f97376f_5de4_4d31_bc8a_6a9455418742
    at io.ddf.DDF.sql2txt(DDF.java:324)
    at io.ddf.analytics.AStatisticsSupporter.getFiveNumSummary(AStatisticsSupporter.java:60)
    at io.ddf.DDF.getFiveNumSummary(DDF.java:912)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
    at py4j.Gateway.invoke(Gateway.java:259)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:207)
    at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: This UDAF does not support the deprecated getEvaluator() method.
    at org.apache.hadoop.hive.ql.udf.generic.AbstractGenericUDAFResolver.getEvaluator(AbstractGenericUDAFResolver.java:53)
    at org.apache.spark.sql.hive.HiveGenericUdaf.objectInspector$lzycompute(hiveUdfs.scala:182)
    at org.apache.spark.sql.hive.HiveGenericUdaf.objectInspector(hiveUdfs.scala:181)
    at org.apache.spark.sql.hive.HiveGenericUdaf.dataType(hiveUdfs.scala:189)
    at org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:94)
    at org.apache.spark.sql.catalyst.plans.logical.Aggregate$$anonfun$output$6.apply(basicOperators.scala:141)
    at org.apache.spark.sql.catalyst.plans.logical.Aggregate$$anonfun$output$6.apply(basicOperators.scala:141)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.AbstractTraversable.map(Traversable.scala:105)
    at org.apache.spark.sql.catalyst.plans.logical.Aggregate.output(basicOperators.scala:141)
    at org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$unapply$1.apply(patterns.scala:61)
    at org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$unapply$1.apply(patterns.scala:61)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.sql.catalyst.planning.PhysicalOperation$.unapply(patterns.scala:61)
    at org.apache.spark.sql.execution.SparkStrategies$ParquetOperations$.apply(SparkStrategies.scala:209)
    at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
    at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
    at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
    at org.apache.spark.sql.catalyst.planning.QueryPlanner.apply(QueryPlanner.scala:59)
    at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:383)
    at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:381)
    at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:387)
    at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:387)
    at org.apache.spark.sql.SchemaRDD.collect(SchemaRDD.scala:454)
    at io.spark.ddf.etl.SqlHandler.sql2txt(SqlHandler.java:176)
    at io.ddf.DDFManager.sql2txt(DDFManager.java:392)
    at io.ddf.DDFManager.sql2txt(DDFManager.java:387)
    at io.ddf.DDFManager.sql2txt(DDFManager.java:382)
    at io.ddf.DDF.sql2txt(DDF.java:322)
    ... 13 more

khangich commented 9 years ago

@nhanitvn, can you take a look?

ctn commented 9 years ago

What's happening here, @nhanitvn? :)