apache / sedona

A cluster computing framework for processing large-scale geospatial data
https://sedona.apache.org/
Apache License 2.0

ST_DWithin expects 3 arguments #1567

Closed joaofauvel closed 3 weeks ago

joaofauvel commented 1 month ago

Expected behavior

```
SELECT ST_DWithin(ST_GeomFromWKT("POINT(-122.335167 47.608013)"), ST_GeomFromWKT("POINT(-73.935242 40.730610)"), 4000000, true)
```

should return `true`.

```
ST_DWithin(
    ST_GeomFromWKT("POINT(-122.335167 47.608013)"),
    ST_GeomFromWKT("POINT(-73.935242 40.730610)"),
    lit(4000000),
    use_sphere=True,
)
```

should return a Column such as `Column<'st_dwithin(st_geomfromwkt(POINT(-122.335167 47.608013)), st_geomfromwkt(POINT(-73.935242 40.730610)), 4000000, true)'>`.

Actual behavior

```
SELECT ST_DWithin(ST_GeomFromWKT("POINT(-122.335167 47.608013)"), ST_GeomFromWKT("POINT(-73.935242 40.730610)"), 4000000, true)
```

throws:
IllegalArgumentException ``` IllegalArgumentException: function ST_DWithin takes at most 3 argument(s), 4 argument(s) specified at org.apache.sedona.sql.UDF.Catalog$.$anonfun$function$2(Catalog.scala:310) at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistryBase.lookupFunction(FunctionRegistry.scala:251) at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistryBase.lookupFunction$(FunctionRegistry.scala:245) at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry.lookupFunction(FunctionRegistry.scala:317) at org.apache.spark.sql.catalyst.catalog.SessionCatalogImpl.$anonfun$resolveBuiltinOrTempFunctionInternal$1(SessionCatalog.scala:2835) at org.apache.spark.sql.catalyst.catalog.SessionCatalogImpl.lookupTempFuncWithViewContext(SessionCatalog.scala:2857) at org.apache.spark.sql.catalyst.catalog.SessionCatalogImpl.resolveBuiltinOrTempFunctionInternal(SessionCatalog.scala:2835) at org.apache.spark.sql.catalyst.catalog.SessionCatalogImpl.resolveBuiltinOrTempFunction(SessionCatalog.scala:2812) at org.apache.spark.sql.catalyst.catalog.DelegatingSessionCatalog.resolveBuiltinOrTempFunction(DelegatingSessionCatalog.scala:529) at org.apache.spark.sql.catalyst.catalog.DelegatingSessionCatalog.resolveBuiltinOrTempFunction$(DelegatingSessionCatalog.scala:526) at com.databricks.sql.managedcatalog.ManagedCatalogSessionCatalog.resolveBuiltinOrTempFunction(ManagedCatalogSessionCatalog.scala:87) at com.databricks.sql.analyzer.UnresolvedFunctionLogging.$anonfun$resolveBuiltinOrTempFunction$1(UnresolvedFunctionLogging.scala:80) at com.databricks.sql.analyzer.UnresolvedFunctionLogging.recordFailure(UnresolvedFunctionLogging.scala:97) at com.databricks.sql.analyzer.UnresolvedFunctionLogging.resolveBuiltinOrTempFunction(UnresolvedFunctionLogging.scala:80) at com.databricks.sql.analyzer.UnresolvedFunctionLogging.resolveBuiltinOrTempFunction$(UnresolvedFunctionLogging.scala:78) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$.resolveBuiltinOrTempFunction(Analyzer.scala:2844) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveFunctions$$resolveBuiltinOrTempFunction(Analyzer.scala:3138) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$23$$anonfun$applyOrElse$172.$anonfun$applyOrElse$177(Analyzer.scala:3062) at org.apache.spark.sql.catalyst.analysis.package$.withPosition(package.scala:103) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$23$$anonfun$applyOrElse$172.applyOrElse(Analyzer.scala:3062) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$23$$anonfun$applyOrElse$172.applyOrElse(Analyzer.scala:3031) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$4(TreeNode.scala:573) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:83) at org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:573) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$1(TreeNode.scala:566) at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1319) at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1318) at org.apache.spark.sql.catalyst.expressions.UnaryExpression.mapChildren(Expression.scala:669) at org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:566) at 
org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$transformExpressionsUpWithPruning$1(QueryPlan.scala:209) at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$1(QueryPlan.scala:221) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:83) at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:221) at org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:233) at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$4(QueryPlan.scala:239) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) at scala.collection.immutable.List.foreach(List.scala:431) at scala.collection.TraversableLike.map(TraversableLike.scala:286) at scala.collection.TraversableLike.map$(TraversableLike.scala:279) at scala.collection.immutable.List.map(List.scala:305) at org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:239) at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$5(QueryPlan.scala:244) at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:358) at org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:244) at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUpWithPruning(QueryPlan.scala:209) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$23.applyOrElse(Analyzer.scala:3031) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$23.applyOrElse(Analyzer.scala:2850) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$3(AnalysisHelper.scala:141) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:83) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:141) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:436) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:137) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:133) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:40) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$.apply(Analyzer.scala:2850) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$.apply(Analyzer.scala:2844) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$4(RuleExecutor.scala:327) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$3(RuleExecutor.scala:327) at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126) at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122) at scala.collection.immutable.List.foldLeft(List.scala:91) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:324) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94) at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeBatch$1(RuleExecutor.scala:307) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$9(RuleExecutor.scala:411) at 
org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$9$adapted(RuleExecutor.scala:411) at scala.collection.immutable.List.foreach(List.scala:431) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:411) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94) at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:270) at org.apache.spark.sql.catalyst.analysis.Analyzer.executeSameContext(Analyzer.scala:423) at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$execute$1(Analyzer.scala:416) at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withNewAnalysisContext(Analyzer.scala:329) at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:416) at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:348) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:262) at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:168) at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:262) at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:401) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:443) at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:400) at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:261) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94) at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:427) at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$5(QueryExecution.scala:611) at org.apache.spark.sql.execution.SQLExecution$.withExecutionPhase(SQLExecution.scala:143) at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$4(QueryExecution.scala:611) at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:1164) at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:610) at com.databricks.util.LexicalThreadLocal$Handle.runWith(LexicalThreadLocal.scala:63) at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:606) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:1180) at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:606) at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:255) at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:254) at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:236) at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:130) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:1180) at org.apache.spark.sql.SparkSession.$anonfun$withActiveAndFrameProfiler$1(SparkSession.scala:1187) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94) at org.apache.spark.sql.SparkSession.withActiveAndFrameProfiler(SparkSession.scala:1187) at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:122) at org.apache.spark.sql.SparkSession.$anonfun$sql$4(SparkSession.scala:959) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:1180) at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:947) at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:982) at 
com.databricks.backend.daemon.driver.DriverLocal$DbClassicStrategy.executeSQLQuery(DriverLocal.scala:290) at com.databricks.backend.daemon.driver.DriverLocal.executeSQLSubCommand(DriverLocal.scala:390) at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$executeSql$1(DriverLocal.scala:411) at scala.collection.immutable.List.map(List.scala:293) at com.databricks.backend.daemon.driver.DriverLocal.executeSql(DriverLocal.scala:406) at com.databricks.backend.daemon.driver.JupyterDriverLocal.repl(JupyterDriverLocal.scala:930) at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$30(DriverLocal.scala:1138) at com.databricks.unity.UCSEphemeralState$Handle.runWith(UCSEphemeralState.scala:45) at com.databricks.unity.HandleImpl.runWith(UCSHandle.scala:103) at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$25(DriverLocal.scala:1129) at com.databricks.logging.AttributionContextTracing.$anonfun$withAttributionContext$1(AttributionContextTracing.scala:48) at com.databricks.logging.AttributionContext$.$anonfun$withValue$1(AttributionContext.scala:253) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62) at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:249) at com.databricks.logging.AttributionContextTracing.withAttributionContext(AttributionContextTracing.scala:46) at com.databricks.logging.AttributionContextTracing.withAttributionContext$(AttributionContextTracing.scala:43) at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:87) at com.databricks.logging.AttributionContextTracing.withAttributionTags(AttributionContextTracing.scala:95) at com.databricks.logging.AttributionContextTracing.withAttributionTags$(AttributionContextTracing.scala:76) at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:87) at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$1(DriverLocal.scala:1073) at com.databricks.backend.daemon.driver.DriverLocal$.$anonfun$maybeSynchronizeExecution$4(DriverLocal.scala:1484) at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:764) at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$2(DriverWrapper.scala:826) at scala.util.Try$.apply(Try.scala:213) at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:818) at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$3(DriverWrapper.scala:858) at com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:636) at com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:654) at com.databricks.logging.AttributionContextTracing.$anonfun$withAttributionContext$1(AttributionContextTracing.scala:48) at com.databricks.logging.AttributionContext$.$anonfun$withValue$1(AttributionContext.scala:253) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62) at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:249) at com.databricks.logging.AttributionContextTracing.withAttributionContext(AttributionContextTracing.scala:46) at com.databricks.logging.AttributionContextTracing.withAttributionContext$(AttributionContextTracing.scala:43) at com.databricks.backend.daemon.driver.DriverWrapper.withAttributionContext(DriverWrapper.scala:70) at com.databricks.logging.AttributionContextTracing.withAttributionTags(AttributionContextTracing.scala:95) at 
com.databricks.logging.AttributionContextTracing.withAttributionTags$(AttributionContextTracing.scala:76) at com.databricks.backend.daemon.driver.DriverWrapper.withAttributionTags(DriverWrapper.scala:70) at com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:631) at com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:541) at com.databricks.backend.daemon.driver.DriverWrapper.recordOperationWithResultTags(DriverWrapper.scala:70) at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:858) at com.databricks.backend.daemon.driver.DriverWrapper.executeCommandAndGetError(DriverWrapper.scala:703) at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:770) at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$runInnerLoop$1(DriverWrapper.scala:576) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at com.databricks.logging.AttributionContextTracing.$anonfun$withAttributionContext$1(AttributionContextTracing.scala:48) at com.databricks.logging.AttributionContext$.$anonfun$withValue$1(AttributionContext.scala:253) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62) at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:249) at com.databricks.logging.AttributionContextTracing.withAttributionContext(AttributionContextTracing.scala:46) at com.databricks.logging.AttributionContextTracing.withAttributionContext$(AttributionContextTracing.scala:43) at com.databricks.backend.daemon.driver.DriverWrapper.withAttributionContext(DriverWrapper.scala:70) at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:576) at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:498) at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:292) at java.lang.Thread.run(Thread.java:750) at com.databricks.backend.daemon.driver.DriverLocal.executeSql(DriverLocal.scala:463) at com.databricks.backend.daemon.driver.JupyterDriverLocal.repl(JupyterDriverLocal.scala:930) at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$30(DriverLocal.scala:1138) at com.databricks.unity.UCSEphemeralState$Handle.runWith(UCSEphemeralState.scala:45) at com.databricks.unity.HandleImpl.runWith(UCSHandle.scala:103) at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$25(DriverLocal.scala:1129) at com.databricks.logging.AttributionContextTracing.$anonfun$withAttributionContext$1(AttributionContextTracing.scala:48) at com.databricks.logging.AttributionContext$.$anonfun$withValue$1(AttributionContext.scala:253) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62) at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:249) at com.databricks.logging.AttributionContextTracing.withAttributionContext(AttributionContextTracing.scala:46) at com.databricks.logging.AttributionContextTracing.withAttributionContext$(AttributionContextTracing.scala:43) at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:87) at com.databricks.logging.AttributionContextTracing.withAttributionTags(AttributionContextTracing.scala:95) at com.databricks.logging.AttributionContextTracing.withAttributionTags$(AttributionContextTracing.scala:76) at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:87) at 
com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$1(DriverLocal.scala:1073) at com.databricks.backend.daemon.driver.DriverLocal$.$anonfun$maybeSynchronizeExecution$4(DriverLocal.scala:1484) at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:764) at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$2(DriverWrapper.scala:826) at scala.util.Try$.apply(Try.scala:213) at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:818) at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$3(DriverWrapper.scala:858) at com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:636) at com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:654) at com.databricks.logging.AttributionContextTracing.$anonfun$withAttributionContext$1(AttributionContextTracing.scala:48) at com.databricks.logging.AttributionContext$.$anonfun$withValue$1(AttributionContext.scala:253) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62) at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:249) at com.databricks.logging.AttributionContextTracing.withAttributionContext(AttributionContextTracing.scala:46) at com.databricks.logging.AttributionContextTracing.withAttributionContext$(AttributionContextTracing.scala:43) at com.databricks.backend.daemon.driver.DriverWrapper.withAttributionContext(DriverWrapper.scala:70) at com.databricks.logging.AttributionContextTracing.withAttributionTags(AttributionContextTracing.scala:95) at com.databricks.logging.AttributionContextTracing.withAttributionTags$(AttributionContextTracing.scala:76) at com.databricks.backend.daemon.driver.DriverWrapper.withAttributionTags(DriverWrapper.scala:70) at com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:631) at com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:541) at com.databricks.backend.daemon.driver.DriverWrapper.recordOperationWithResultTags(DriverWrapper.scala:70) at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:858) at com.databricks.backend.daemon.driver.DriverWrapper.executeCommandAndGetError(DriverWrapper.scala:703) at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:770) at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$runInnerLoop$1(DriverWrapper.scala:576) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at com.databricks.logging.AttributionContextTracing.$anonfun$withAttributionContext$1(AttributionContextTracing.scala:48) at com.databricks.logging.AttributionContext$.$anonfun$withValue$1(AttributionContext.scala:253) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62) at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:249) at com.databricks.logging.AttributionContextTracing.withAttributionContext(AttributionContextTracing.scala:46) at com.databricks.logging.AttributionContextTracing.withAttributionContext$(AttributionContextTracing.scala:43) at com.databricks.backend.daemon.driver.DriverWrapper.withAttributionContext(DriverWrapper.scala:70) at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:576) at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:498) at 
com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:292) ```
```
ST_DWithin(
    ST_GeomFromWKT("POINT(-122.335167 47.608013)"),
    ST_GeomFromWKT("POINT(-73.935242 40.730610)"),
    lit(4000000),
    use_sphere=True,
)
```

throws:
Py4JError ``` Py4JError: An error occurred while calling z:org.apache.spark.sql.sedona_sql.expressions.st_predicates.ST_DWithin. Trace: py4j.Py4JException: Method ST_DWithin([class org.apache.spark.sql.Column, class org.apache.spark.sql.Column, class org.apache.spark.sql.Column, class org.apache.spark.sql.Column]) does not exist at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:344) at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:365) at py4j.Gateway.invoke(Gateway.java:300) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:199) at py4j.ClientServerConnection.run(ClientServerConnection.java:119) at java.lang.Thread.run(Thread.java:750) File , line 1 ----> 1 sp.ST_DWithin( 2 sc.ST_GeomFromWKT("POINT(-122.335167 47.608013)", 4326), 3 sc.ST_GeomFromWKT("POINT(-73.935242 40.730610)", 4326), 4 f.lit(4000000), 5 True 6 ) File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/sedona/sql/dataframe_api.py:156, in validate_argument_types..validated_function(*args, **kwargs) 153 type_annotations = typing.get_type_hints(f) 154 _check_bound_arguments(bound_args, type_annotations, f.__name__) --> 156 return f(*args, **kwargs) File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/sedona/sql/st_predicates.py:224, in ST_DWithin(a, b, distance, use_sphere) 215 """ 216 Check if geometry a is within 'distance' units of geometry b 217 :param a: Geometry column to check (...) 221 :return: True if a is within distance units of Geometry b 222 """ 223 args = (a, b, distance, use_sphere) if use_sphere is not None else (a, b, distance,) --> 224 return _call_predicate_function("ST_DWithin", args) File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/sedona/sql/dataframe_api.py:65, in call_sedona_function(object_name, function_name, args) 62 jobject = getattr(spark._jvm, object_name) 63 jfunc = getattr(jobject, function_name) ---> 65 jc = jfunc(*args) 66 return Column(jc) File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py:1355, in JavaMember.__call__(self, *args) 1349 command = proto.CALL_COMMAND_NAME +\ 1350 self.command_header +\ 1351 args_command +\ 1352 proto.END_COMMAND_PART 1354 answer = self.gateway_client.send_command(command) -> 1355 return_value = get_return_value( 1356 answer, self.gateway_client, self.target_id, self.name) 1358 for temp_arg in temp_args: 1359 if hasattr(temp_arg, "_detach"): File /databricks/spark/python/pyspark/errors/exceptions/captured.py:248, in capture_sql_exception..deco(*a, **kw) 245 from py4j.protocol import Py4JJavaError 247 try: --> 248 return f(*a, **kw) 249 except Py4JJavaError as e: 250 converted = convert_exception(e.java_exception) File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py:330, in get_return_value(answer, gateway_client, target_id, name) 326 raise Py4JJavaError( 327 "An error occurred while calling {0}{1}{2}.\n". 328 format(target_id, ".", name), value) 329 else: --> 330 raise Py4JError( 331 "An error occurred while calling {0}{1}{2}. Trace:\n{3}\n". 332 format(target_id, ".", name, value)) 333 else: 334 raise Py4JError( 335 "An error occurred while calling {0}{1}{2}". 336 format(target_id, ".", name)) ```

Steps to reproduce the problem

Try to use ST_DWithin with the optional useSpheroid/use_sphere argument in either Spark SQL or PySpark (see the reproduction sketch below).
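
For reference, a self-contained reproduction sketch (assuming a cluster where the apache-sedona package and a matching sedona-spark-shaded JAR are already installed; coordinates are the ones from this report, and the `stc`/`stp` aliases are just import shorthands):

```
from pyspark.sql import SparkSession
from pyspark.sql.functions import lit
from sedona.spark import SedonaContext
from sedona.sql import st_constructors as stc, st_predicates as stp

# Register Sedona's SQL functions on the current session.
spark = SedonaContext.create(SparkSession.builder.getOrCreate())

# SQL form: the 4th argument is the optional useSpheroid flag.
spark.sql(
    """SELECT ST_DWithin(
           ST_GeomFromWKT('POINT(-122.335167 47.608013)'),
           ST_GeomFromWKT('POINT(-73.935242 40.730610)'),
           4000000, true) AS within"""
).show()

# DataFrame API form: use_sphere is the optional keyword argument.
spark.range(1).select(
    stp.ST_DWithin(
        stc.ST_GeomFromWKT(lit("POINT(-122.335167 47.608013)")),
        stc.ST_GeomFromWKT(lit("POINT(-73.935242 40.730610)")),
        lit(4000000),
        use_sphere=True,
    ).alias("within")
).show()
```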

Settings

Sedona version = 1.5.3
Sedona Python package = 1.6.0

Apache Spark version = 3.5.0

Apache Flink version = ?

API type = Python, SQL

Scala version = 2.12

JRE version = ?

Python version = 3.11

Environment = Azure Databricks (DBR 15.3 ML)

github-actions[bot] commented 1 month ago

Thank you for your interest in Apache Sedona! We appreciate you opening your first issue. Contributions like yours help make Apache Sedona better.

Kontinuation commented 1 month ago

I could not reproduce this problem on DBR 15.3, so this could be a configuration problem.

The fourth parameter of ST_DWithin was introduced in 1.6.0. The error message saying that ST_DWithin takes at most 3 arguments indicates that an old version of Sedona is in use. Please check that the 1.6.1 Sedona JAR is deployed to your DBR cluster and that it is the only Sedona JAR deployed. Mixing multiple versions of Sedona JARs will lead to all sorts of strange behaviors.

If you are using the init script described here, please make sure that the workspace directory /Workspace/Shared/sedona/1.6.1/ contains only one sedona-spark-shaded jar.
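
As a quick check, something like the following sketch (run from a notebook on the cluster; the path is the one from the init-script guide above and may differ in your workspace) should list exactly one shaded JAR:

```
# Rough check that only one sedona-spark-shaded JAR is present in the
# init-script directory mentioned above (path may differ per workspace).
import glob

jars = glob.glob("/Workspace/Shared/sedona/1.6.1/sedona-spark-shaded-*.jar")
print(jars)
assert len(jars) == 1, f"expected exactly one sedona-spark-shaded JAR, found {len(jars)}"
```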

joaofauvel commented 3 weeks ago

It was a configuration issue. The cluster had a different version of the apache-sedona Python package (the latest, because it wasn't pinned to the same version as the JAR), which is why the function accepted all 4 arguments on the Python side.
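
For anyone hitting the same mismatch, a minimal guard at the top of a notebook (a sketch assuming the package was installed from PyPI and that 1.6.1 is the JAR version deployed to the cluster) can catch it early:

```
# Fail fast if the apache-sedona Python package does not match the deployed JAR.
# "1.6.1" is an example; pin it to whatever sedona-spark-shaded version the cluster uses.
from importlib.metadata import version

expected = "1.6.1"
installed = version("apache-sedona")
if installed != expected:
    raise RuntimeError(
        f"apache-sedona {installed} is installed, but the cluster JARs are {expected}; "
        f"pin the package (apache-sedona=={expected}) in the cluster libraries"
    )
```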