blue-yonder / turbodbc

Turbodbc is a Python module to access relational databases via the Open Database Connectivity (ODBC) interface. The module complies with the Python Database API Specification 2.0.
http://turbodbc.readthedocs.io/en/latest
MIT License
623 stars 85 forks source link

Mention SQLite in docs #246

Open cdeil opened 4 years ago

cdeil commented 4 years ago

Could you please add some information in the docs whether turbodbc support SQLite or not? Is it fast or slow?

There doesn't seem to be a single mention of "sqlite" in the docs currently: https://turbodbc.readthedocs.io/en/latest/search.html?q=sqlite&check_keywords=yes&area=default

My current use case is that I have ~ 10 pandas Dataframes, each ~ 100 MB in size, and would like to write them concatenated to a SQLite file. Using sqlalchemy.create_engine(f'sqlite:///db.sqlite') and then in a loop Dataframe.to_sql with if_exists="append" seems to work, but it's extremely slow. Is turbodbc the right tool for this job, or something else?

xhochy commented 4 years ago

This depends on the performance of the ODBC driver of SQLite itself. I have not yet heard of much usage of it, so my expectations aren't that great. But in the end: just try it and see whether it is faster.

cdeil commented 3 years ago

Actually, for others wondering about this, the Python https://docs.python.org/3/library/sqlite3.html doesn't ship an ODBC driver at all.

So any attempt like this

import turbodbc
turbodbc.connect("db.sqlite")
turbodbc.connect("sqlite:///db.sqlite")

will fail with an error like

turbodbc.exceptions.DatabaseError: ODBC error
state: IM002
native error code: 0
message: [unixODBC][Driver Manager]Data source name not found and no default driver specified

Related idea, but I think not realised yet: https://github.com/pandas-dev/pandas/issues/17790#issuecomment-334322878

pacman82 commented 3 years ago

Just to chip in here a little. I recently tested inserting into an SQLite Database. I inserted a csv file 25k rows (somewhat larger than 3MiB) into an SQLite DB using odbcsv. It took over two minutes. Now odbcsv is an entirely different code base, but also utilizes bound buffers for bulk insertion. So I don't know how slow your current way of inserting is, but ODBC bulk inserting into SQLite is far from speedy.