apache / datafusion-comet

Apache DataFusion Comet Spark Accelerator
https://datafusion.apache.org/comet
Apache License 2.0

Support Spark bloom filter expression BloomFilterMightContain #145

Closed viirya closed 3 months ago

viirya commented 3 months ago

What is the problem the feature request solves?

While checking the query plans of other work, I found that some TPC-H queries (e.g., q2, q5, q7, q8) contain the bloom filter expression BloomFilterMightContain, which blocks Comet from transforming the Spark query plan. We should support it to unblock these queries.

Describe the potential solution

No response

Additional context

No response

singhpk234 commented 3 months ago

Can I pick this up?

viirya commented 3 months ago

Sure. Feel free to take any issue that no one has claimed. Thank you.

viirya commented 3 months ago

@advancedxy

Hmm, thanks for the contribution. However, could you make clear what you plan to do by claiming it in the related ticket? Also, please claim/pick one ticket at a time. I'm asking because you said you will/want to do some other tickets (more than one), but you never said you would work on, or were working on, this one. You can see that another person picked up this available ticket yesterday.

Clearly stating your willingness to work on a ticket helps community collaboration. For example, others may be interested in the tickets you said you want to work on, but cannot take them because of that. Also, this ticket was available, and the person who picked it up yesterday might already be working on it. We should avoid stepping on others' toes, I think.

So my suggestion is: please claim/pick only one ticket at a time, and let others know that you are working on it or will start soon. If you find you cannot finish it because of a tight schedule or other reasons, please also let others know in the ticket. I think this will build a smoother development community for us.

Thank you.

advancedxy commented 3 months ago

> However, could you make clear what you plan to do by claiming it in the related ticket? Also, please claim/pick one ticket at a time. I'm asking because you said you will/want to do some other tickets (more than one), but you never said you would work on, or were working on, this one. You can see that another person picked up this available ticket yesterday.

Good suggestion indeed. I should have claimed earlier that I was planning to work on it. To be frank, I think I only claimed that I’m planning to work on the InSubquery support, and I already said that I would postpone that work until join support is added. After that, I was planning to pick up other issues, and this issue caught my eye. I thought about solutions to this problem but hadn’t started coding on it during the weekdays, so I never made the claim.

@singhpk234 Sorry, I didn’t notice that you had claimed this issue when I actually started the work. If your work is almost done, I can close my PR in favor of yours since you claimed it first. If you have barely started, I would appreciate your input on my current PR.

advancedxy commented 3 months ago

> So my suggestion is: please claim/pick only one ticket at a time, and let others know that you are working on it or will start soon. If you find you cannot finish it because of a tight schedule or other reasons, please also let others know in the ticket. I think this will build a smoother development community for us.

Suggestion well received. It will work much better if we all claim issues clearly and avoid redundant work.

viirya commented 3 months ago

Thank you @advancedxy. I appreciate your understanding and your contribution so far!

singhpk234 commented 3 months ago

> @singhpk234 Sorry, I didn’t notice that you had claimed this issue when I actually started the work. If your work is almost done, I can close my PR in favor of yours since you claimed it first. If you have barely started, I would appreciate your input on my current PR.

@advancedxy I was in the middle of it: I was done with the Spark changes and was about to start the Rust changes. I see you have a complete implementation out, so I think it would be better to go ahead with your PR rather than closing it now. It's just that my effort got wasted. It would have been better if you had commented on the ticket rather than directly submitting the PR. But no issues :), all good!

Looking forward to contributing more with you!

advancedxy commented 3 months ago

> It's just that my effort got wasted. It would have been better if you had commented on the ticket rather than directly submitting the PR.

Sorry again for your wasted effort and time. It's on me that I didn't check the issue status/comments when I actually started the work. I will claim issues or clearly state what I'm planning to work on in the future, and hopefully things like this will never happen again.

> Looking forward to contributing more with you!

Same here. I think it would be great to collaborate with you on this.

viirya commented 3 months ago

Thanks @singhpk234 for your effort. Feel free to take any other available tickets you are interested in.