apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
https://gluten.apache.org/
Apache License 2.0

[VL] Diff on parquet agg #7494

Open zml1206 opened 3 weeks ago

zml1206 commented 3 weeks ago

Backend

VL (Velox)

Bug description

// Write the test data with Gluten disabled (vanilla Spark)
spark.sql("set spark.gluten.enabled=false")
spark.range(100).selectExpr("id % 2 as c1", "id % 5 as c2", "id as c3").write.mode("overwrite").parquet("tmp/t1")
// Read the data back and aggregate with Gluten enabled
spark.sql("set spark.gluten.enabled=true")
spark.read.parquet("tmp/t1").createOrReplaceTempView("t1")
spark.sql("select c2, sum(c3) from t1 where c1 = 1 group by c2").show

Result (wrong: no sum can exceed 4950, the total of all ids from 0 to 99):

+---+---------------+
| c2|        sum(c3)|
+---+---------------+
|  0|559882429285360|
|  1|559885503421750|
|  3|839826576815406|
|  2|839827141809990|
|  4|559885785918562|
+---+---------------+
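
For reference, the sums this query should return can be recomputed by hand without Spark or Gluten. The following is a minimal sketch in plain Scala, assuming the same data as the reproduction (ids 0 to 99); the names `rows` and `expected` are illustrative and not part of the original report.

```scala
// Plain Scala, no Spark needed: recompute the query by hand.
// Mirrors: select c2, sum(c3) from t1 where c1 = 1 group by c2
val rows = (0L until 100L).map(id => (id % 2, id % 5, id)) // (c1, c2, c3)

val expected: Map[Long, Long] = rows
  .filter { case (c1, _, _) => c1 == 1L }   // where c1 = 1
  .groupBy { case (_, c2, _) => c2 }        // group by c2
  .map { case (c2, group) => c2 -> group.map(_._3).sum } // sum(c3)

// Print the groups in c2 order for easy comparison with the table above.
expected.toSeq.sortBy(_._1).foreach { case (c2, sum) => println(s"$c2 -> $sum") }
```

This can be pasted into `spark-shell` or a Scala REPL. It gives 500, 460, 520, 480 and 540 for c2 = 0 through 4, which is many orders of magnitude smaller than the values in the table.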

I tested three Velox versions: velox-08-27 returns correct results, while velox-10-04 and velox-10-11 return the wrong sums shown above.

Spark version

Spark-3.4.x

Spark configurations

No response

System information

No response

Relevant logs

No response

zml1206 commented 3 weeks ago

cc @rui-mo @FelixYBW @zhztheplayer Have you encountered similar problems?

FelixYBW commented 3 weeks ago

No. Looks like a new Velox bug. Would you debug it?

zml1206 commented 3 weeks ago

> No. Looks like a new Velox bug. Would you debug it?

Sorry, I'm having problems compiling the new version locally on macOS, so I can't debug it for the time being.

zml1206 commented 3 weeks ago

One more thing: I cannot reproduce this on macOS.

zml1206 commented 3 weeks ago

Through testing, I found that https://github.com/facebookincubator/velox/pull/11010 caused this; the query works correctly after reverting it.

FelixYBW commented 3 weeks ago

Thank you for the update. Did you submit an issue to Velox?

zml1206 commented 3 weeks ago

> Thank you for the update. Did you submit an issue to Velox?

https://github.com/facebookincubator/velox/issues/11257