pudo / dataset

Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.
https://dataset.readthedocs.org/
MIT License

SQLite: StaticPool leads to never closed file descriptors #283

Closed pohmelie closed 4 years ago

pohmelie commented 5 years ago
import dataset

db = dataset.connect("sqlite:///test.sqlite")
# lsof -p <pid> | grep test
# (no output: the file is not open yet)
print(db.tables)
# lsof now lists the file once:
# ... /path/to/test.sqlite
del db
# lsof still lists it after the reference is dropped:
# ... /path/to/test.sqlite
db = dataset.connect("sqlite:///test.sqlite")
print(db.tables)
# lsof now lists it twice:
# ... /path/to/test.sqlite
# ... /path/to/test.sqlite

Dataset forces the use of StaticPool, which leads to the process reaching its limit of open file descriptors. After some time you can neither accept sockets nor open files. In our case, the code uses hundreds of SQLite db files.
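
The mechanism can be illustrated without dataset or SQLAlchemy. The following is only a rough sketch using the standard-library sqlite3 module: the classes KeepOpenPool and CloseOnReleasePool and the helpers is_open and use_once are hypothetical names made up for this illustration, not part of either library. It assumes the difference comes down to whether a connection is closed when it is handed back.

```python
import os
import sqlite3
import tempfile


class KeepOpenPool:
    """Hypothetical stand-in for a static pool: one connection, never closed."""

    def __init__(self, path):
        self.path = path
        self._conn = None

    def acquire(self):
        # Open the file once and keep reusing the same connection.
        if self._conn is None:
            self._conn = sqlite3.connect(self.path)
        return self._conn

    def release(self, conn):
        # Nothing is closed, so the file descriptor stays open.
        pass


class CloseOnReleasePool:
    """Hypothetical stand-in for a pool that closes connections on release."""

    def __init__(self, path):
        self.path = path

    def acquire(self):
        return sqlite3.connect(self.path)

    def release(self, conn):
        # Closing the connection also releases its file descriptor.
        conn.close()


def is_open(conn):
    """Return True if the sqlite3 connection can still run a statement."""
    try:
        conn.execute("SELECT 1")
        return True
    except sqlite3.ProgrammingError:
        return False


def use_once(pool):
    """Acquire a connection, touch the database, and hand it back."""
    conn = pool.acquire()
    conn.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
    pool.release(conn)
    return conn
```

With hundreds of database files, each KeepOpenPool-style holder pins one descriptor for as long as it is alive, which matches the lsof output above.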

A workaround is to use the default pool:

db = dataset.connect(url, engine_kwargs=dict(poolclass=None))

pudo commented 4 years ago

So this seems to connect with (and contradict) the previous discussion in #163. I'm simply not sure what the correct default here is; both solutions seem defective in their own way. Does anyone have a super-strong opinion on this, given that the workaround documented above is relatively simple?

pohmelie commented 4 years ago

I'm not sure this should be closed, since the problem is still there. I think the «correct» default is the SQLAlchemy default.

pudo commented 4 years ago

Alright, let's give it a shot and see what the critics say ;)