opensearch-project / data-prepper

Data Prepper is a component of the OpenSearch project that accepts, filters, transforms, enriches, and routes data at scale.
https://opensearch.org/docs/latest/clients/data-prepper/index/
Apache License 2.0

Ingest data from ODBC/JDBC datasources as Source #1995

Open ashoktelukuntla opened 1 year ago

ashoktelukuntla commented 1 year ago

Is your feature request related to a problem? Please describe.

A pipeline author wants to read data from a database of their choice. The plugin needs to provide an interface to read/ingest data over JDBC.

Describe the solution you'd like

The interface should also provide the ability to run queries periodically and read the results. Every row read will be converted to a Data Prepper event, with columns mapped to fields in the event. I would envision the JDBC driver libraries being supplied by the pipeline author in the YAML configuration, under a "jdbc_driver_lib" setting. Additionally, to schedule periodic runs, a cron-like expression should be accepted in the YAML.
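The row-to-event mapping described above can be sketched as follows. This is a minimal illustration, not Data Prepper code: `rows_to_events` is a hypothetical helper, and Python's built-in sqlite3 stands in for a JDBC connection.

```python
import sqlite3

def rows_to_events(connection, sql_query, params=()):
    """Run a query and turn each row into an event-like dict whose keys
    are the column names, mirroring the proposed columns-to-fields mapping.
    Hypothetical helper; sqlite3 stands in for JDBC here."""
    cursor = connection.execute(sql_query, params)
    columns = [desc[0] for desc in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]

# Demo with an in-memory table standing in for the EMPLOYEES example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER, last_name TEXT)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [(1, "Smith"), (2, "Jones")])
events = rows_to_events(
    conn,
    "SELECT employee_id FROM employees WHERE last_name = ?",
    ("Smith",),
)
print(events)  # [{'employee_id': 1}]
```

A periodic run would simply re-invoke this conversion on the schedule given by the cron-like expression.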

The plugin should be able to support SigV4 and accept an AWS credentials provider, region, and security parameters in the YAML configuration, including trust store and key store settings: trustStoreLocation, trustStoreType, trustStorePassword, keyStoreLocation, keyStoreType, keyStorePassword.
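Hypothetically, those security settings might appear in the pipeline YAML like this (key names follow the list above; the paths and placeholder values are illustrative, not a finalized schema):

```yaml
source:
    - jdbc:
          awsCredentialsProvider: "com.amazonaws.auth.DefaultAWSCredentialsProviderChain"
          region: "us-east-1"
          trustStoreLocation: "/etc/pki/truststore.jks"
          trustStoreType: "JKS"
          trustStorePassword: "${TRUSTSTORE_PASSWORD}"
          keyStoreLocation: "/etc/pki/keystore.jks"
          keyStoreType: "JKS"
          keyStorePassword: "${KEYSTORE_PASSWORD}"
```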

The plugin should include support for multi-node worker partitioning.
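One way to think about multi-node worker partitioning is deterministic hash-based ownership: every node hashes each partition key to exactly one worker, so all nodes agree on the assignment without coordination. The sketch below is illustrative only and not part of any Data Prepper API.

```python
import zlib

def assign_partitions(partitions, worker_count):
    """Map each partition key to one worker via a stable hash (CRC32),
    so every node running the same pipeline computes the same plan.
    Illustrative sketch; names are not a Data Prepper API."""
    assignment = {w: [] for w in range(worker_count)}
    for key in partitions:
        owner = zlib.crc32(key.encode("utf-8")) % worker_count
        assignment[owner].append(key)
    return assignment

tables = ["employees", "departments", "salaries", "locations"]
plan = assign_partitions(tables, worker_count=2)
# Every partition is owned by exactly one worker.
assert sum(len(keys) for keys in plan.values()) == len(tables)
```

A real implementation would likely also handle worker membership changes, which a plain modulo scheme does not.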

source:
    - jdbc:
          jdbc_driver_lib: "jdbc-oracle.jar"
          jdbc_driver: "oracle.jdbc.driver.OracleDriver"
          jdbc_connection_string: "jdbc:oracle://127.0.0.1:8080"
          jdbc_user: "user"
          jdbc_schedule: "* * * 3 *"
          sql_query: "SELECT EMPLOYEE_ID FROM EMPLOYEES WHERE LAST_NAME = :LAST_NAME"
          fetchSize: " "
          awsCredentialsProvider: "com.amazonaws.opensearch.sql.jdbc.shadow.com.amazonaws.auth.AWSCredentialsProvider"

Additional context

https://github.com/opensearch-project/sql-jdbc

sharraj commented 1 year ago

We should also add support for an elaborate partitioning strategy for multi-node parallel worker deployment.
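One concrete candidate for such a strategy is range-based partitioning: split a numeric primary-key range into fixed-size chunks, where each chunk becomes one bounded SQL query that a worker can claim independently. A minimal sketch, with illustrative names not tied to Data Prepper:

```python
def split_key_range(min_id, max_id, chunk_size):
    """Split an inclusive numeric key range into fixed-size chunks.
    Each (lo, hi) pair would back one bounded query, e.g.
    WHERE id BETWEEN lo AND hi. Illustrative sketch only."""
    chunks = []
    lo = min_id
    while lo <= max_id:
        hi = min(lo + chunk_size - 1, max_id)
        chunks.append((lo, hi))
        lo = hi + 1
    return chunks

print(split_key_range(1, 10, 4))  # [(1, 4), (5, 8), (9, 10)]
```

Chunks like these could then be distributed across the parallel workers, with smaller chunks giving finer load balancing at the cost of more queries.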