airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

Support request for JDBC Interface #2068

Closed MiyamotoKota closed 3 years ago

MiyamotoKota commented 3 years ago

Tell us about the new integration you’d like to have

I would like to be able to select the JDBC interface as the data source.

Describe the context around this new integration

I would like to check whether I can use a third-party JDBC driver to connect to SaaS products that Airbyte does not yet support. Many JDBC drivers exist for SaaS products. If Airbyte offered a JDBC interface where I could select a JDBC driver I have installed, it would expand the data sources I can use. There are also JDBC drivers for SaaS products provided by Japanese companies, so I think those could be used with Airbyte as well!

Describe the alternative you are considering or using

I have already confirmed that we can load SaaS data into BigQuery through Airbyte using CData Connect, so that is a workable alternative. However, JDBC would be more efficient because it accesses the data source directly.

Thanks,

sherifnada commented 3 years ago

Hey @MiyamotoKota,

Could you give some more details on how exactly you would want to use this connector? What would be the inputs for configuring the connector, and how would they control its behavior? I just want to make sure I understand the request here.

MiyamotoKota commented 3 years ago

Hi @sherifnada,

Thanks for the reply. I'm imagining being able to use the CData JDBC Driver. I assume that specifying the driver's class name and connection string in Airbyte would make it available as a source.

Please see the following CData help document: http://cdn.cdata.com/help/RFF/jdbc/pg_connectionj.htm

That page shows sample class names and connection strings for the CData JDBC Driver. If you can implement this, I am sure you will be able to handle the other data sources listed here: https://www.cdata.com/jdbc/
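As a rough illustration of the request, here is a minimal sketch of connecting with only a driver class name and a connection string, using the standard `java.sql` API. The class `GenericJdbcSketch` and its methods are hypothetical names made up for this example; nothing here is Airbyte code, and the driver JAR is assumed to already be on the classpath.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: not part of Airbyte.
class GenericJdbcSketch {

    // Returns true if the given class can be loaded from the classpath.
    static boolean isDriverAvailable(String driverClassName) {
        try {
            Class.forName(driverClassName);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // Opens a connection and lists table names through standard JDBC metadata.
    static List<String> listTables(String driverClassName, String jdbcUrl) throws SQLException {
        if (!isDriverAvailable(driverClassName)) {
            throw new SQLException("Driver class not found on classpath: " + driverClassName);
        }
        List<String> tables = new ArrayList<>();
        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            DatabaseMetaData meta = conn.getMetaData();
            try (ResultSet rs = meta.getTables(null, null, "%", new String[] {"TABLE"})) {
                while (rs.next()) {
                    tables.add(rs.getString("TABLE_NAME"));
                }
            }
        }
        return tables;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 2) {
            System.out.println("usage: GenericJdbcSketch <driverClassName> <jdbcUrl>");
            return;
        }
        for (String table : listTables(args[0], args[1])) {
            System.out.println(table);
        }
    }
}
```

The class name and connection string would come from the driver's own documentation, such as the help page linked above.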

Thanks,

sugimomoto commented 3 years ago

Hi All

Me too. I want to use a JDBC interface in the UI. Various databases, data warehouses, and cloud services currently provide JDBC drivers. Supporting JDBC as a standard interface would expand the Airbyte ecosystem and help many people connect to various services.

For example:

Amazon Athena JDBC https://docs.aws.amazon.com/athena/latest/ug/connect-with-jdbc.html

Teradata JDBC https://downloads.teradata.com/download/connectivity/jdbc-driver

DynamoDB JDBC https://aws.amazon.com/jp/dynamodb/community/

Each of these help documents provides the JDBC URL and class name needed to connect.

One thing to worry about is how to specify the file path of the driver's JAR file.

Thanks.

cgardens commented 3 years ago

@MiyamotoKota @sugimomoto thanks for your feedback!

Unfortunately, implementing a JDBC source that takes as input only the JDBC connection string and the driver class name isn't realistic. We would be able to connect to the database with no problem. The issue is that each database uses a different SQL syntax and often has different tables where it stores metadata (as well as other differences, e.g. MySQL doesn't really support schemas). That means we need to make small adjustments for each database we support to make sure we are correctly supporting its unique attributes.
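To make the syntax point concrete, here is a small sketch of two well-known dialect differences: identifier quoting and row limiting. `DialectSketch` and its methods are hypothetical names for illustration only; this is not how `AbstractJdbcSource` is implemented, and real connectors need more adjustments than these two.

```java
// Hypothetical sketch: not part of Airbyte.
class DialectSketch {

    enum Dialect { POSTGRES, MYSQL, MSSQL }

    // Quotes an identifier using each database's own quoting characters.
    static String quoteIdentifier(Dialect dialect, String name) {
        switch (dialect) {
            case MYSQL:
                return "`" + name.replace("`", "``") + "`";
            case MSSQL:
                return "[" + name.replace("]", "]]") + "]";
            default:
                return "\"" + name.replace("\"", "\"\"") + "\"";
        }
    }

    // Builds a "first n rows" query; SQL Server uses TOP, the others use LIMIT.
    static String firstRowsQuery(Dialect dialect, String table, int n) {
        String quoted = quoteIdentifier(dialect, table);
        if (dialect == Dialect.MSSQL) {
            return "SELECT TOP " + n + " * FROM " + quoted;
        }
        return "SELECT * FROM " + quoted + " LIMIT " + n;
    }

    public static void main(String[] args) {
        System.out.println(firstRowsQuery(Dialect.POSTGRES, "users", 10)); // SELECT * FROM "users" LIMIT 10
        System.out.println(firstRowsQuery(Dialect.MYSQL, "users", 10));    // SELECT * FROM `users` LIMIT 10
        System.out.println(firstRowsQuery(Dialect.MSSQL, "users", 10));    // SELECT TOP 10 * FROM [users]
    }
}
```

The same logical query produces three different strings, which is why a connection string and driver class name alone are not enough for a generic source.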

We have an abstraction, AbstractJdbcSource, that generally makes handling these little tweaks pretty trivial. The cost of adding new JDBC databases is pretty low for us, which is great! But unfortunately, in each case, it does require at least a little dev work. You can check out the implementations of Postgres, Redshift, MSSQL and MySQL to get an idea of how little effort it is.

Our goal is to support all of the databases that you have mentioned. We can definitely add them to the list of the connectors we are prioritizing. Also, as an open source project, we are very happy to accept contributions from folks like you and can help you get started. Please let me know if you think I am missing anything or have any other thoughts!

MiyamotoKota commented 3 years ago

Hi @cgardens

I will file the issue again once I can describe it more concretely. Thanks!

dvirginz commented 1 year ago

Hi, a year and a half later, is there a way to connect using the JDBC driver directly? We would like to use our existing Airbyte setup for a client that runs on-prem Teradata. The ETL is already written in Teradata SQL syntax; what is missing is the integration with Airbyte, if that makes sense.

Thanks

grishick commented 1 year ago

@dvirginz what are the source and destination databases?