apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
https://gluten.apache.org/
Apache License 2.0

[VL] Result mismatch issues Tracker #4652

Open zhouyuan opened 9 months ago

zhouyuan commented 9 months ago

Backend

VL (Velox)

Bug description

There are several data mismatch issues related to either operators or functions. Some of the fixes have landed in Gluten, and some are in the Velox repo. We will use this issue to track their status, as these are critical for production environments.

FelixYBW commented 9 months ago

#4678, an issue in hashagg

FelixYBW commented 9 months ago

https://github.com/oap-project/gluten/issues/4587

Currently, we have disabled all complex data reads.

zhouyuan commented 8 months ago

https://github.com/oap-project/gluten/pull/4818

zhouyuan commented 8 months ago

https://github.com/oap-project/gluten/pull/4872

kecookier commented 8 months ago

https://github.com/apache/incubator-gluten/issues/4891

kecookier commented 8 months ago

https://github.com/apache/incubator-gluten/issues/4928

kecookier commented 8 months ago

https://github.com/apache/incubator-gluten/issues/4930

kecookier commented 8 months ago

https://github.com/apache/incubator-gluten/issues/4947

FelixYBW commented 8 months ago

Three issues we ran into:

  1. parquet scan + filter pushdown wrongly returns "" where it should return null. Fixed by https://github.com/facebookincubator/velox/pull/9129
  2. distinct hash agg + spill returned duplicated keys. https://github.com/facebookincubator/velox/issues/9219
  3. max_by function returns a wrong result.

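
The first two mismatches are easy to miss when results are compared loosely. A minimal sketch of a stricter comparison follows, assuming both engines' results are available as lists of row tuples; `diff_results` and the sample rows are hypothetical and not part of Gluten or Velox. Rows are compared as multisets, so a NULL versus "" difference and a duplicated key both show up.

```python
from collections import Counter


def diff_results(expected, actual):
    """Compare two result sets as multisets of rows.

    None (SQL NULL) and "" are different values, and repeated rows are
    counted, so both kinds of mismatch appear in the diff.
    """
    exp = Counter(map(tuple, expected))
    act = Counter(map(tuple, actual))
    missing = exp - act  # rows expected but absent or under-counted
    extra = act - exp    # rows not expected or over-counted
    return missing, extra


# Hypothetical rows: one engine returns NULL, the other returns "".
vanilla = [(1, None), (2, "a")]
native = [(1, ""), (2, "a")]
missing, extra = diff_results(vanilla, native)
print(missing, extra)
```

Applying the same helper to the output of a distinct aggregation reports any key that appears more than once as an extra row.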
ulysses-you commented 8 months ago

> 2. distinct hash agg + spill returned duplicated keys.

@FelixYBW Has this issue not been fixed by https://github.com/apache/incubator-gluten/pull/4443?

FelixYBW commented 8 months ago

> @FelixYBW Has this issue not been fixed by #4443?

No, it was tested on the main branch. It's a new issue.

FelixYBW commented 8 months ago

> No, it was tested on the main branch. It's a new issue.

https://github.com/facebookincubator/velox/issues/9219

FelixYBW commented 8 months ago

> 3. max_by function returns a wrong result.

@yma11 Did you submit a fix for the issue?

yma11 commented 8 months ago

> 3. max_by function returns a wrong result.
>
> @yma11 Did you submit a fix for the issue?

Not yet. I have only pushed it to the golden branch and will submit one to Velox upstream.

NEUpanning commented 7 months ago

#5253

FelixYBW commented 7 months ago

> #5253

Looks like an issue with get_json_object. @PHILO-HE, maybe we need full tests of the JSON functions, like the ones for re2.
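
One shape such a test could take is a differential check: run the same inputs through a reference and a candidate implementation and report every disagreement. The sketch below assumes both are exposed as plain callables; `toy_reference` and `toy_candidate` are hypothetical stand-ins and do not reproduce the real get_json_object semantics of Spark or Velox.

```python
import json


def find_mismatches(reference, candidate, inputs):
    """Return (args, reference_result, candidate_result) for each disagreement."""
    mismatches = []
    for args in inputs:
        want, got = reference(*args), candidate(*args)
        if want != got:
            mismatches.append((args, want, got))
    return mismatches


def toy_reference(doc, key):
    # Toy stand-in: top-level key lookup, None on any failure.
    try:
        value = json.loads(doc).get(key)
    except (ValueError, AttributeError, TypeError):
        return None
    return None if value is None else str(value)


def toy_candidate(doc, key):
    # Deliberately diverges on malformed JSON so the harness has a hit.
    try:
        json.loads(doc)
    except ValueError:
        return ""
    return toy_reference(doc, key)


cases = [('{"a": 1}', "a"), ('{"a": 1}', "b"), ('{"a": ', "a")]
print(find_mismatches(toy_reference, toy_candidate, cases))
```

In a real setup the two callables would wrap the vanilla and native execution paths, and the case list would cover malformed documents, nested paths, and escaped characters.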

PHILO-HE commented 7 months ago

> #5253
>
> Looks like an issue with get_json_object. @PHILO-HE, maybe we need full tests of the JSON functions, like the ones for re2.

@FelixYBW, I will do that. Thanks!

FelixYBW commented 7 months ago

https://github.com/apache/incubator-gluten/issues/5248

kecookier commented 7 months ago

https://github.com/apache/incubator-gluten/issues/5366

FelixYBW commented 7 months ago

> #5366

Updated the description, thank you. Do you know which function (cast, avg, round) caused the issue?

FelixYBW commented 7 months ago

#5372

yma11 commented 7 months ago

> 3. max_by function returns a wrong result.
>
> @yma11 Did you submit a fix for the issue?
>
> Not yet. I have only pushed it to the golden branch and will submit one to Velox upstream.

@FelixYBW This fix should be done on the C++ side. The formal fix is in a PR. Can you help review it?

FelixYBW commented 7 months ago

> @FelixYBW This fix should be done on the C++ side. The formal fix is in a PR. Can you help review it?

Is it a Gluten issue? I'd think Velox has a bug here.

yma11 commented 7 months ago

> @FelixYBW This fix should be done on the C++ side. The formal fix is in a PR. Can you help review it?
>
> Is it a Gluten issue? I'd think Velox has a bug here.

Yes, I think so. It's caused by the additional projections we added before and after the shuffle. The partial/final handling logic in Velox upstream has no problem. The ideal fix is to add struct support to shuffle in Gluten so that we can remove the hack.
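
To illustrate why the intermediate state matters here, below is a minimal sketch of a two-phase max_by, assuming the partial state is a (value, ordering key) pair, i.e. a struct. The function names are hypothetical and this is not Velox or Gluten code; skipping rows with a NULL ordering key is also an assumption of the sketch.

```python
def partial_max_by(rows):
    """Partial step: fold (x, y) rows into one (x, y) state, or None if empty."""
    state = None
    for x, y in rows:
        if y is None:
            continue  # assumption: rows with a NULL ordering key are skipped
        if state is None or y > state[1]:
            state = (x, y)
    return state


def final_max_by(states):
    """Final step: merge the partial states and emit only the x component."""
    merged = partial_max_by(s for s in states if s is not None)
    return None if merged is None else merged[0]


# Hypothetical partitions, each holding (x, y) rows.
partitions = [[("a", 1), ("b", 5)], [("c", 3)], []]
states = [partial_max_by(p) for p in partitions]  # what crosses the shuffle
print(final_max_by(states))
```

Each partial state must reach the final step with both components intact and paired, which is presumably why native struct support in shuffle would remove the need for the extra projections.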

FelixYBW commented 6 months ago

@PHILO-HE Any update on the issues here?

zjuwangg commented 6 months ago

https://github.com/apache/incubator-gluten/issues/5682

NEUpanning commented 6 months ago

#5701

PHILO-HE commented 6 months ago

> @PHILO-HE Any update on the issues here?

@FelixYBW, some have actually been fixed; I just updated the list. I will fix the other issues or seek help to fix them.

kecookier commented 5 months ago

#6224

NEUpanning commented 5 months ago

#6227

kecookier commented 3 months ago

https://github.com/apache/incubator-gluten/issues/6630

NEUpanning commented 3 months ago

#6673

zml1206 commented 3 months ago

#6784

jiangjiangtian commented 3 months ago

https://github.com/apache/incubator-gluten/issues/6827

zml1206 commented 3 months ago

#6828

NEUpanning commented 3 months ago

#6845

FelixYBW commented 3 months ago

> #6845

Closed. Not a bug.

jiangjiangtian commented 2 months ago

https://github.com/apache/incubator-gluten/issues/7082

kecookier commented 2 months ago

https://github.com/apache/incubator-gluten/issues/7069

ccat3z commented 2 months ago

https://github.com/apache/incubator-gluten/issues/7109

kecookier commented 2 months ago

https://github.com/apache/incubator-gluten/issues/7194

FelixYBW commented 1 month ago

#7494

FelixYBW commented 3 weeks ago

#7730

wForget commented 3 weeks ago

#7749

wForget commented 3 weeks ago

#7777