apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
https://gluten.apache.org/
Apache License 2.0

[CH] Not supported operator ShuffleQueryStage for BroadcastRelation #3571

Open shuai-xu opened 10 months ago

shuai-xu commented 10 months ago

Backend

CH (ClickHouse)

Bug description

When running Spark SQL queries, we hit an exception: (image). The Spark plan is: (image). Bigo SQL 5224_1.

Spark version

None

Spark configurations

No response

System information

No response

Relevant logs

No response

shuai-xu commented 10 months ago

`insert overwrite table tmp.test_gluten_d_5224_1_6 partition (day)
select bid, t4.vgift_typeid, vgift_count, valuable, valuable_range, '${d1}'
from (
  select to_uid bid, vgift_typeid, sum(vgift_count) vgift_count
  from XXXX1
  where from_unixtime(ctime, 'yyyy-MM-dd') >= date_sub('${d1}', 30 - 1)
    and from_unixtime(ctime, 'yyyy-MM-dd') <= '${d1}'
  group by to_uid, vgift_typeid
) t4
join (
  select t1.typeid, valuable,
    (case when valuable>=1 and valuable<=99 then 1
          when valuable>=100 and valuable<=999 then 2
          when valuable>=1000 then 3
          else 0 end) valuable_range
  from (
    select typeid, valuable
    from xxxx3
    where status=0
      and (gift_type=1 or gift_type=9 or gift_type=10)
      and (vm_typeid=2)
      and ((groupid&1>0 or groupid&2>0) and (groupid&8>0))
  ) t1
  join (
    select distinct gift_id
    from XXXXX2
    where tab_id=1000 or tab_id=1001
  ) t3 on t1.typeid=t3.gift_id
) t5 on t4.vgift_typeid=t5.typeid;`

PHILO-HE commented 10 months ago

cc @zhztheplayer, maybe this is also an issue for the Velox backend.

zhztheplayer commented 10 months ago

> cc @zhztheplayer, maybe this is also an issue for the Velox backend.

Probably, but I just went through the `createBroadcastRelation` code in the Velox BE, and it doesn't seem to have such a restriction on its input. That means the same plan might be valid for the Velox BE, but I can't be sure without actually running some tests.

Indeed, we did get something wrong in the joint validation of BHJ + exchange in the core module code (I drafted a fix: https://github.com/oap-project/gluten/pull/3595), but I guess that bug might not be related to this one.
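
To make the "restriction on its input" point concrete, here is a minimal toy model in Python. It is not Gluten code: `PlanNode`, `unsupported_broadcast_inputs`, and both allowed-operator sets are hypothetical, and the plan shape is only a guess based on the issue title, since the actual plan was posted as an image. It shows how one backend could reject a plan where a shuffle stage feeds the broadcast side while another backend accepts it.

```python
# Toy model only: every class and name here is hypothetical, not Gluten's API.
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class PlanNode:
    name: str
    children: List["PlanNode"] = field(default_factory=list)


def unsupported_broadcast_inputs(plan: PlanNode, allowed: Set[str]) -> List[str]:
    """Return names of operators that directly feed a BroadcastExchange
    but are not in the backend's allowed set."""
    problems = []
    if plan.name == "BroadcastExchange":
        problems += [c.name for c in plan.children if c.name not in allowed]
    for child in plan.children:
        problems += unsupported_broadcast_inputs(child, allowed)
    return problems


# Assumed plan shape: the broadcast side of the join is fed by a shuffle stage.
plan = PlanNode("BroadcastHashJoin", [
    PlanNode("Scan"),
    PlanNode("BroadcastExchange", [
        PlanNode("ShuffleQueryStage", [PlanNode("Scan")]),
    ]),
])

strict_backend = {"Scan", "Project", "Filter"}            # assumed: rejects shuffle stages
lenient_backend = strict_backend | {"ShuffleQueryStage"}  # assumed: accepts them

print(unsupported_broadcast_inputs(plan, strict_backend))   # ['ShuffleQueryStage']
print(unsupported_broadcast_inputs(plan, lenient_backend))  # []
```

In this sketch the check depends only on the backend's allowed set, which is why the same plan could fail on one backend and still be valid on another.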