airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com
Other
15.59k stars 4.01k forks

Add pre and post SQL query in jdbc connectors #5541

Open marcosmarxm opened 3 years ago

marcosmarxm commented 3 years ago

Tell us about the problem you're trying to solve

From a Slack conversation: some users are interested in running pre and post queries.

Describe the solution you’d like

A clear and concise description of what you want to see happen, or the change you would like to see

Describe the alternative you’ve considered or used

A clear and concise description of any alternative solutions or features you've considered or are using today.

Additional context

Add any other context or screenshots about the feature request here.

Are you willing to submit a PR?

Remove this with your answer :-)

shmf commented 3 years ago

It would be very useful to be able to fire custom SQL statements before and after the extraction. The idea would be to use these pre and post queries to set and unset session variables that can affect the extraction itself.

When defining the connection, if the source or the target is a relational database (mssql, postgres, mysql,..) there should be a field where users can type the queries to be fired before the execution starts and after it finishes.
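
A minimal sketch of the idea, not Airbyte's implementation: hypothetical `pre_queries` and `post_queries` lists are run on the same connection as the extraction query, so any session-level state they set applies to the read. SQLite from the Python standard library stands in for a real source, and a connection-scoped temp table stands in for a session variable. The function name `extract_with_hooks` and the `users` table are assumptions for illustration.

```python
import sqlite3
from typing import Iterable, List, Tuple


def extract_with_hooks(
    conn: sqlite3.Connection,
    extract_sql: str,
    pre_queries: Iterable[str] = (),
    post_queries: Iterable[str] = (),
) -> List[Tuple]:
    """Run pre queries, the extraction, then post queries on one connection."""
    cur = conn.cursor()
    try:
        for stmt in pre_queries:
            cur.execute(stmt)
        return cur.execute(extract_sql).fetchall()
    finally:
        # Post queries run even if the extraction fails, so session state is cleaned up.
        for stmt in post_queries:
            cur.execute(stmt)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Ada"), (2, "Bob")])

rows = extract_with_hooks(
    conn,
    # The extraction reads the session-scoped value set by the pre queries.
    "SELECT id FROM users WHERE id >= (SELECT min_id FROM session_vars) ORDER BY id",
    pre_queries=[
        "CREATE TEMP TABLE session_vars (min_id INTEGER)",
        "INSERT INTO session_vars VALUES (2)",
    ],
    post_queries=["DROP TABLE session_vars"],
)
```

On a real source the pre and post statements would more likely be things like `SET` and `RESET` of a session variable; the key design point is that all three steps share one connection.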

peder1001 commented 3 years ago

+1

I have some very large tables and would love to sync them with a WHERE clause, so that only the rows matching the clause are moved/synced to the destination.

SudhenduP commented 2 years ago

+1 for the feature req. My requirement is to bring a subset of data out of the source (10%). Any timeline on when this feature will be taken up :) Specifically looking for MySQL. Thanks!

marcosmarxm commented 2 years ago

There is no estimated time for this feature, @SudhenduP, but you could create a view and sync the view instead.
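
The view workaround could look roughly like this. It is a sketch under assumptions: the `orders` table, the `orders_eu` view, and the filter are made up for illustration, and SQLite stands in for the real source database. The connector would then be pointed at the view instead of the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("EU", 10.0), ("US", 25.0), ("EU", 40.0)],
)

# The view carries the WHERE clause, so only the wanted subset is exposed.
conn.execute(
    "CREATE VIEW orders_eu AS "
    "SELECT id, region, amount FROM orders WHERE region = 'EU'"
)

# Reading from the view returns just the filtered rows.
subset = conn.execute("SELECT id, amount FROM orders_eu ORDER BY id").fetchall()
```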

peder1001 commented 2 years ago

This is not true. If you have full refresh then your view gets deleted.

prudhvi85 commented 2 years ago

In many applications, developers don't maintain proper timestamps that can be used as a cursor. We might need to write a complicated WHERE query to get the delta. We need this kind of feature as well.

ChristopheDuong commented 2 years ago

This other issue seems to also be related to this enhancement

FabSN commented 2 years ago

In my case, I need a lookback window with an incremental synchronisation to get the last X days of data. For now, I use a view in my database, but it's not optimal because I'm not an administrator of all the databases in my company. I saw that this feature exists for the Stripe connector, for example.
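
One way to picture the lookback window, as a sketch rather than how any Airbyte connector does it: the lower bound of the next incremental read is the saved cursor minus a configurable number of days, so rows updated inside that window are read again. The function name `lookback_start` and the parameter `lookback_days` are hypothetical.

```python
from datetime import datetime, timedelta


def lookback_start(last_cursor: datetime, lookback_days: int) -> datetime:
    """Return the lower bound for the next incremental read."""
    if lookback_days < 0:
        raise ValueError("lookback_days must be non-negative")
    return last_cursor - timedelta(days=lookback_days)


last_cursor = datetime(2022, 3, 10, 12, 0, 0)
start = lookback_start(last_cursor, lookback_days=3)
# The incremental query would then filter on something like: updated_at >= start
```

Because the window is re-read on every sync, the destination needs deduplication on a primary key for this to be safe.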

grishick commented 1 year ago

Tagging for DB Sources, because the conversation in Slack that prompted this issue was about running SET statement_timeout