dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0
11.04k stars 1.38k forks

[duckdb] Decrease likelihood of failures due to connection locks on the db #18746

Open jamiedemaria opened 8 months ago

jamiedemaria commented 8 months ago

DuckDB allows only one write connection at a time. We get around this a bit by retrying when a connection is made, but that doesn't account for all cases. One thing we could do is only make write connections when we are actually writing to the DB. This would require an update to the connect function of DBClient so that it accepts a parameter specifying whether the connection is read or write. We can make this a kwarg with a default so that any custom DBClients out there won't see breaking changes. Then, in the DuckDB IO manager, we can use this to determine what kind of connection to make.
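To make the proposal concrete, here is a minimal sketch of a connect helper that takes a `read_only` kwarg defaulting to `False`, so existing callers keep their current behavior. This is not the actual DBClient code: the names `connect`, `connect_fn`, `open_for_load`, and `open_for_store` are hypothetical, and the only real API assumed is `duckdb.connect(database=..., read_only=...)`. Whether read-only connections fully avoid the lock errors reported in the discussion is not something this sketch verifies.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator


def _default_connect(database: str, read_only: bool) -> Any:
    # duckdb is imported lazily so the helper can be exercised with a
    # stand-in connect function when duckdb is not installed.
    import duckdb

    return duckdb.connect(database=database, read_only=read_only)


@contextmanager
def connect(
    database: str,
    read_only: bool = False,  # default keeps today's behavior (write connection)
    connect_fn: Callable[[str, bool], Any] = _default_connect,  # hypothetical hook
) -> Iterator[Any]:
    """Open a connection, yield it, and always close it afterwards."""
    conn = connect_fn(database, read_only)
    try:
        yield conn
    finally:
        conn.close()


def open_for_load(database: str, **kwargs: Any):
    # Hypothetical: loading an input only reads, so ask for a read-only connection.
    return connect(database, read_only=True, **kwargs)


def open_for_store(database: str, **kwargs: Any):
    # Hypothetical: storing an output writes, so keep the default write connection.
    return connect(database, **kwargs)
```

With something like this, an IO manager would wrap its load path in `with open_for_load(path) as conn:` and its store path in `with open_for_store(path) as conn:`, while custom clients that never pass `read_only` would be unaffected.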

Discussed in https://github.com/dagster-io/dagster/discussions/18737

Originally posted by **EtienneT** December 14, 2023

I use duckdb with dagster and it has been working great except for one thing. I sometimes get the following error message:

`duckdb.duckdb.IOException: IO Error: Cannot open file "c:\dagster-home\storage\data.duckdb": The process cannot access the file because it is being used by another process.`

But dagster is the only process using this file. In most software defined assets I use duckdb with the IOManager. But in one asset, I have to use DuckDBResource to be able to change the table schema (add new columns on the fly).

From my Definitions.resources:

```
"duckdb": DuckDBPandasPolarsIOManager(database=str(storage / 'data.duckdb'), schema='main'),
"duckdb_resource": DuckDBResource(database=str(storage / 'data.duckdb'), schema='main'),
```

Could the IOManager and the DuckDBResource compete for the unique write connection to duckdb? Thanks!

EtienneT commented 8 months ago

@jamiedemaria I submitted a quick PR if you ever have time to take a look. Thanks!