astronomer / airflow-provider-duckdb

A provider package for DuckDB
Apache License 2.0

Add a method in hook to return cursor for multiple parallel database operations #6

Open sunank200 opened 1 year ago

sunank200 commented 1 year ago

Currently, running multiple database operations on DuckDB in parallel does not work with the provider by default. The reason is that a single DuckDB connection is thread-safe but is locked for the duration of each query, which effectively serializes database access.

As per the DuckDB documentation, if you want to create a second connection to an existing database, you can use the cursor() method. The hook should therefore also provide a method that returns a cursor, not only the connection.

pgzmnk commented 1 year ago

Agreed. It might make sense to return the cursor in the except clause of a try/except block that catches the specific exception, such as:

sqlalchemy.exc.OperationalError: (duckdb.IOException) IO Error: Could not set lock on file "/tmp/db.duckdb": Resource temporarily unavailable