apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
https://gluten.apache.org/
Apache License 2.0

[CORE] Protocol message had too many levels of nesting #1727

Closed · rui-mo closed this 1 year ago

rui-mo commented 1 year ago

Describe the bug

Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message had too many levels of nesting.  May be malicious.  Use CodedInputStream.setRecursionLimit() to increase the depth limit.
        at com.google.protobuf.InvalidProtocolBufferException.recursionLimitExceeded(InvalidProtocolBufferException.java:148)
        at com.google.protobuf.CodedInputStream$ArrayDecoder.readMessage(CodedInputStream.java:869)
        at io.substrait.proto.Expression$FieldReference$Builder.mergeFrom(Expression.java:54368)
        at io.substrait.proto.Expression$FieldReference$Builder.mergeFrom(Expression.java:54151)
        at com.google.protobuf.CodedInputStream$ArrayDecoder.readMessage(CodedInputStream.java:873)
        at io.substrait.proto.Expression$Builder.mergeFrom(Expression.java:61881)
        at io.substrait.proto.Expression$Builder.mergeFrom(Expression.java:61567)
        at com.google.protobuf.CodedInputStream$ArrayDecoder.readMessage(CodedInputStream.java:873)
        at io.substrait.proto.FunctionArgument$Builder.mergeFrom(FunctionArgument.java:600)
        at io.substrait.proto.FunctionArgument$1.parsePartialFrom(FunctionArgument.java:1050)
        at io.substrait.proto.FunctionArgument$1.parsePartialFrom(FunctionArgument.java:1042)
        at com.google.protobuf.CodedInputStream$ArrayDecoder.readMessage(CodedInputStream.java:889)
        at io.substrait.proto.Expression$ScalarFunction$Builder.mergeFrom(Expression.java:19676)
        at io.substrait.proto.Expression$ScalarFunction$Builder.mergeFrom(Expression.java:19372)

To Reproduce

SELECT x2,x49,x18,x47,x29,x56,x60,x9,x31,x34,x53,x11
  FROM t11,t9,t47,t56,t53,t60,t34,t2,t29,t18,t31,t49
 WHERE a9=b11
   AND b53=a11
   AND a49=b9
   AND a31=3
   AND a60=b56
   AND a56=b34
   AND a29=b49
   AND a2=b18
   AND b2=a47
   AND a34=b47
   AND b29=a31
   AND b60=a53
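
The chained join conditions appear to serialize as deeply nested Expression and FunctionArgument messages in the Substrait plan (see the frames above), which is presumably what overruns protobuf's default recursion limit of 100 when the plan is deserialized.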

zhouyuan commented 1 year ago

The issue should be fixed by setting a new recursion limit; by default it is only 100: https://github.com/protocolbuffers/protobuf/blob/main/java/core/src/main/java/com/google/protobuf/CodedInputStream.java#L390

rui-mo commented 1 year ago

> The issue should be fixed by setting a new recursion limit; by default it is only 100: https://github.com/protocolbuffers/protobuf/blob/main/java/core/src/main/java/com/google/protobuf/CodedInputStream.java#L390

@zhouyuan Yes, that's right. But since the Java code is generated from the Substrait proto files, we do not have direct access to this function. We still need to find a way to do that.
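
For what it's worth, the recursion limit is a per-stream setting, so one possible workaround is to construct the CodedInputStream explicitly and pass it to the generated parser instead of calling parseFrom(byte[]) directly. A minimal sketch, assuming io.substrait.proto.Plan is the message being parsed (the helper name and the limit of 100000 are made up for illustration, not Gluten's actual call site):

    import com.google.protobuf.CodedInputStream;
    import io.substrait.proto.Plan;

    public final class SubstraitPlanParser {
      // Plan.parseFrom(byte[]) builds its own CodedInputStream with the
      // default recursion limit of 100, so construct the stream ourselves
      // and raise the limit before handing it to the generated parser.
      public static Plan parse(byte[] planBytes) throws java.io.IOException {
        CodedInputStream input = CodedInputStream.newInstance(planBytes);
        input.setRecursionLimit(100_000); // hypothetical value; default is 100
        return Plan.parseFrom(input);
      }
    }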

zhztheplayer commented 1 year ago

Bumped into this too. This one may only apply to debug builds of Gluten:

23/07/07 18:28:17 ERROR TaskResources: Task 299 failed by error: 
java.lang.RuntimeException: BinaryToJsonStream returned INVALID_ARGUMENT:Message too deep. Max recursion depth reached for type 'substrait.RelCommon', field 'common'
    at io.glutenproject.vectorized.PlanEvaluatorJniWrapper.nativeCreateKernelWithIterator(Native Method)
    at io.glutenproject.vectorized.NativePlanEvaluator.createKernelWithBatchIterator(NativePlanEvaluator.java:84)
    at io.glutenproject.backendsapi.velox.IteratorHandler.genFinalStageIterator(IteratorHandler.scala:178)
    at io.glutenproject.execution.WholeStageZippedPartitionsRDD.$anonfun$genFinalStageIterator$1(WholeStageZippedPartitionsRDD.scala:61)
    at io.glutenproject.execution.WholeStageZippedPartitionsRDD.compute(WholeStageZippedPartitionsRDD.scala:70)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
    at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
    at org.apache.spark.scheduler.Task.run(Task.scala:131)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1491)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
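
(This variant appears to come from the native side rather than the Java parser: BinaryToJsonStream is protobuf's C++ binary-to-JSON converter, which enforces its own recursion depth limit, so raising the limit on the Java CodedInputStream alone would not cover this path.)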
lgbo-ustc commented 1 year ago

Is there any way to use a provided protobuf jar?

zhztheplayer commented 1 year ago

Another one:

Job aborted due to stage failure: Task 49 in stage 29.0 failed 1 times, most recent failure: Lost task 49.0 in stage 29.0 (TID 220) (55f8ad28c275 executor driver): java.lang.RuntimeException: BinaryToJsonStream returned INVALID_ARGUMENT:Message too deep. Max recursion depth reached for type 'substrait.Expression.ReferenceSegment.StructField', field 'structField'
    at io.glutenproject.vectorized.PlanEvaluatorJniWrapper.nativeCreateKernelWithIterator(Native Method)
    at io.glutenproject.vectorized.NativePlanEvaluator.createKernelWithBatchIterator(NativePlanEvaluator.java:91)
    at io.glutenproject.backendsapi.velox.IteratorHandler.genFirstStageIterator(IteratorHandler.scala:157)
    at io.glutenproject.execution.GlutenWholeStageColumnarRDD.compute(GlutenWholeStageColumnarRDD.scala:134)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
    at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
    at org.apache.spark.scheduler.Task.run(Task.scala:131)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1491)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)