getindata / flink-http-connector

Http Connector for Apache Flink. Provides sources and sinks for Datastream , Table and SQL APIs.
Apache License 2.0
136 stars 39 forks source link

HTTP-83 Support SQL queries with project pushdown after a lookup join #84

Closed OlivierZembri closed 3 months ago

OlivierZembri commented 3 months ago

Description

Ability to support SQL queries that add or remove columns on the result of a lookup join query.

Resolves #83

PR Checklist
OlivierZembri commented 3 months ago

@kristoffSC Could you please have look ?

davidradl commented 3 months ago

@OlivierZembri Looks good to me. A couple of comments: When you say "Added support for using the result of a lookup join operation in a subsequent select query that adds or removes columns (project pushdown operation)." I was a bit confused by this wording, are we saying " Adding support for projection pushdown, so for lookup joins the projection columns will be honoured. Prior to this fix the projection columns were not honoured, so the output columns (the projections) could not be changed."

In terms of migration considerations for existing users, would the select be failed or ignored if a projection was attempted. If the projection was ignored then there maybe more or less columns returned to the user after the fix; which would be worth documenting as a side effect / migration consideration.

OlivierZembri commented 3 months ago

@davidradl

In terms of migration considerations for existing users, would the select be failed or ignored if a projection was attempted.

This is an enhancement proposal without which the Flink job fails with an exception java.lang.UnsupportedOperationException: No implementation provided for SupportsProjectionPushDown. Please implement SupportsProjectionPushDown#applyProjection(int[][], DataType)

There might be some other usage this code change allow to support, but ours is defined in the description for this PR (ie. https://github.com/getindata/flink-http-connector/issues/83)

davidradl commented 3 months ago

@davidradl

In terms of migration considerations for existing users, would the select be failed or ignored if a projection was attempted.

This is an enhancement proposal without which the Flink job fails with an exception java.lang.UnsupportedOperationException: No implementation provided for SupportsProjectionPushDown. Please implement SupportsProjectionPushDown#applyProjection(int[][], DataType)

There might be some other usage this code change allow to support, but ours is defined in the description for this PR (ie. #83thanks for the confirmation @OlivierZembri . @kristoffSC unless you have any objections LGTM

AdrianVasiliu commented 3 months ago

@kristoffSC It would make our life much easier if this PR could be reviewed by maintainers and merged relatively quickly.

There's also @davidradl 's https://github.com/getindata/flink-http-connector/pull/87 that we'll also need, but if reviewing/merging it will take more time, would it be possible to release the connector after the merge of the present PR, as this is the first blocker for our current work.

grzegorz8 commented 3 months ago

@kristoffSC It would make our life much easier if this PR could be reviewed by maintainers and merged relatively quickly.

There's also @davidradl 's #87 that we'll also need, but if reviewing/merging it will take more time, would it be possible to release the connector after the merge of the present PR, as this is the first blocker for our current work.

Both PRs approved with a few minor comments. I'll merge them once they are addressed.

OlivierZembri commented 3 months ago

@grzegorz8 Thanks for merging. It would be great to have a release 0.13.0 so that we can use these changes.

grzegorz8 commented 3 months ago

@grzegorz8 Thanks for merging. It would be great to have a release 0.13.0 so that we can use these changes.

Release 0.13.0 build started: https://github.com/getindata/flink-http-connector/actions/runs/8544953258/job/23412174122

But I forgot to ask if we need to update any section in README.md, especially when it comes to https://github.com/getindata/flink-http-connector/pull/87

AdrianVasiliu commented 3 months ago

@grzegorz8 For some reason https://github.com/getindata/flink-http-connector/actions/runs/8544953258/job/23412174122 failed. Could you please look into it?

Regarding the README.md: well, before the last PRs, it didn't cover the details of the extension points, as such I don't see a particular reason to need it covered in the readme now more than before ;-) But yes, of course, in the long run, it would be nice to have a more complete doc. Moreover, maybe one day a default out-of-the-box implementation will provide enough capabilities for handling common use-cases without the need to implement extension points. But this is a matter for future work.

kristoffSC commented 3 months ago

@AdrianVasiliu It seems the publish build went through eventually https://github.com/getindata/flink-http-connector/actions/runs/8544953258/job/23412947418

AdrianVasiliu commented 3 months ago

@kristoffSC Yes, it did, thanks!