starlake-ai / starlake

Declarative text based tool for data analysts and engineers to extract, load, transform and orchestrate their data pipelines.
http://starlake.ai/
Apache License 2.0

[BUG] - extract from duckdb (tutorial) not working with error java.sql.SQLException: Connection Error: Can't open a connection to same database #1087

Open zedach opened 1 month ago

zedach commented 1 month ago

Description

Extracting from DuckDB (following the tutorial) fails with: Exception in thread "main" java.sql.SQLException: Connection Error: Can't open a connection to same database file with a different configuration than existing connections. It appears that connections are not being closed correctly.

Expected behavior

Extraction works properly and produces the extracted files for DuckDB.

Steps to reproduce

Steps to reproduce the behavior:

  1. Download the duckdb.db sample.
  2. Configure an extract file.
  3. Launch starlake extract-data --config metadata/extract/my_extract_config.sl.yml --outputDir datasets/incoming/starbake
  4. See the error:

     java.sql.SQLException: Connection Error: Can't open a connection to same database file with a different configuration than existing connections
       at org.duckdb.DuckDBNative.duckdb_jdbc_startup(Native Method)
       at org.duckdb.DuckDBConnection.newConnection(DuckDBConnection.java:51)
       at org.duckdb.DuckDBDriver.connect(DuckDBDriver.java:41)
       at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:681)
       at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:190)
       at ai.starlake.extract.JdbcDbUtils$StarlakeConnectionPool$.getConnection(JdbcDbUtils.scala:66)
       at ai.starlake.extract.JdbcDbUtils$.$anonfun$withJDBCConnection$1(JdbcDbUtils.scala:105)
       at scala.util.Try$.apply(Try.scala:217)
       at ai.starlake.extract.JdbcDbUtils$.withJDBCConnection(JdbcDbUtils.scala:105)
       at ai.starlake.extract.ExtractDataJob.$anonfun$extractTableData$4(ExtractDataJob.scala:522)
       at scala.util.Using$.resource(Using.scala:296)
       at ai.starlake.extract.ExtractDataJob.$anonfun$extractTableData$3(ExtractDataJob.scala:469)
       at ai.starlake.extract.JdbcDbUtils$.$anonfun$withJDBCConnection$2(JdbcDbUtils.scala:111)

Life cycle

Possible implementation

We can enforce read-only connections for extraction and permit a read-write (RW) connection only for audit insertions, making sure every connection is properly closed.
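
The idea above could look roughly like the following Java sketch. It is only an illustration under assumptions, not Starlake's actual code: the class DuckDbAccess, the interface SqlWork, and the methods propsFor and withConnection are hypothetical names. It relies on the DuckDB JDBC driver's "duckdb.read_only" connection property and on try-with-resources to guarantee that each connection is closed.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

// Hypothetical helper for illustration; not part of Starlake.
final class DuckDbAccess {

    /** Work to run against an open connection. */
    interface SqlWork<T> {
        T apply(Connection connection) throws SQLException;
    }

    /** Builds JDBC properties; "duckdb.read_only" is the DuckDB driver's read-only switch. */
    static Properties propsFor(boolean readOnly) {
        Properties props = new Properties();
        if (readOnly) {
            props.setProperty("duckdb.read_only", "true");
        }
        return props;
    }

    /** Opens a connection, runs the work, and always closes the connection afterwards. */
    static <T> T withConnection(String url, boolean readOnly, SqlWork<T> work) throws SQLException {
        try (Connection connection = DriverManager.getConnection(url, propsFor(readOnly))) {
            return work.apply(connection);
        }
    }
}
```

With such a helper, extraction would call DuckDbAccess.withConnection("jdbc:duckdb:duckdb.db", true, ...) for each table, and the audit insertion would call it with false only after the read-only connections have been closed. This assumes the error comes from a read-only and a read-write connection to the same file being open at the same time in one process, which matches the message but has not been confirmed here.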
