teragrep / pth_10

Data Processing Language (DPL) translator for Apache Spark
GNU Affero General Public License v3.0

Exceptions are printed in (some) tests when they should not be #334

Open StrongestNumber9 opened 2 months ago

StrongestNumber9 commented 2 months ago

Describe the bug

Example test iplocationTest_InvalidMmdbPath_8:

[Executor task launch worker for task 0.0 in stage 0.0 (TID 0)] INFO org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator - Code generated in 4.400833 ms
[Executor task launch worker for task 0.0 in stage 0.0 (TID 0)] INFO com.teragrep.pth10.ast.commands.transformstatement.iplocation.IplocationGeoIPDataMapper - Attempting to open database file: <[/tmp/this-path-is-invalid/fake.mmdb]>
[Executor task launch worker for task 0.0 in stage 0.0 (TID 0)] ERROR org.apache.spark.executor.Executor - Exception in task 0.0 in stage 0.0 (TID 0)
org.apache.spark.SparkException: [FAILED_EXECUTE_UDF] Failed to execute user defined function (functions$$$Lambda$1751/1612554042: (string, string, boolean) => map<string,string>).
    at org.apache.spark.sql.errors.QueryExecutionErrors$.failedExecuteUserDefinedFunctionError(QueryExecutionErrors.scala:217)
    at org.apache.spark.sql.errors.QueryExecutionErrors.failedExecuteUserDefinedFunctionError(QueryExecutionErrors.scala)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
    at scala.collection.Iterator.isEmpty(Iterator.scala:387)
    at scala.collection.Iterator.isEmpty$(Iterator.scala:387)
    at scala.collection.AbstractIterator.isEmpty(Iterator.scala:1431)
    at scala.collection.TraversableOnce.nonEmpty(TraversableOnce.scala:143)
    at scala.collection.TraversableOnce.nonEmpty$(TraversableOnce.scala:143)
    at scala.collection.AbstractIterator.nonEmpty(Iterator.scala:1431)
    at org.apache.spark.rdd.RDD.$anonfun$takeOrdered$2(RDD.scala:1529)
    at org.apache.spark.rdd.RDD.$anonfun$takeOrdered$2$adapted(RDD.scala:1528)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:905)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:905)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
    at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
    at org.apache.spark.scheduler.Task.run(Task.scala:139)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.RuntimeException: Invalid database file path given for iplocation command.
    at com.teragrep.pth10.ast.commands.transformstatement.iplocation.IplocationGeoIPDataMapper.initInputStream(IplocationGeoIPDataMapper.java:306)
    at com.teragrep.pth10.ast.commands.transformstatement.iplocation.IplocationGeoIPDataMapper.call(IplocationGeoIPDataMapper.java:92)
    at com.teragrep.pth10.ast.commands.transformstatement.iplocation.IplocationGeoIPDataMapper.call(IplocationGeoIPDataMapper.java:73)
    at org.apache.spark.sql.functions$.$anonfun$udf$95(functions.scala:5448)
    ... 29 more

Expected behavior

The exception is expected by the test, so it should not be printed to the test output.

How to reproduce

Run the test iplocationTest_InvalidMmdbPath_8.

Software version

7.1.0

Additional context

Found while doing a patch for #305

Similar output may be printed in other tests as well.

eemhu commented 3 days ago

It looks like the error log is produced by Spark's internal logging. Perhaps it could be turned off for tests?
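
A minimal sketch of one way to do that, under assumptions: the `[thread] LEVEL logger - message` layout in the log above resembles the default output of slf4j-simple, so this assumes the tests log through that backend. If they do, the level of a single logger can be set with a `org.slf4j.simpleLogger.log.<logger name>` system property. The class and method names below are hypothetical and not part of pth_10; only the logger name `org.apache.spark.executor.Executor` comes from the log in this issue.

```java
// Hypothetical helper for test setup; not part of pth_10.
final class QuietSparkLogging {
    // Per-logger level prefix used by slf4j-simple (assumed logging backend).
    static final String LEVEL_PREFIX = "org.slf4j.simpleLogger.log.";

    private QuietSparkLogging() {
    }

    // Sets the level of the named logger to "off" and returns the property key.
    // This only has an effect if it runs before that logger is first created.
    static String silence(String loggerName) {
        String key = LEVEL_PREFIX + loggerName;
        System.setProperty(key, "off");
        return key;
    }

    public static void main(String[] args) {
        // Logger that printed the ERROR line and stack trace in this issue.
        String key = silence("org.apache.spark.executor.Executor");
        System.out.println(key + "=" + System.getProperty(key));
    }
}
```

In a test class this would have to be called before the SparkSession is created, for example from a static initializer, or the same property could be passed to the test JVM as a `-D` argument. If the tests use a different backend (for example Log4j 2), the equivalent would be a logger-level entry in that backend's test configuration. Note that this hides all messages from that logger, including unexpected failures, so it may be preferable to limit it to tests that expect the exception.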