cdeil opened this issue 4 years ago
This depends on the performance of the ODBC driver for SQLite itself. I have not yet heard of much usage of it, so my expectations aren't high. But in the end: just try it and see whether it is faster.
Actually, for others wondering about this: the Python standard library sqlite3 module (https://docs.python.org/3/library/sqlite3.html) talks to SQLite directly and doesn't ship an ODBC driver at all.
So any attempt like this
import turbodbc

# Neither a bare file path nor a SQLAlchemy-style URL is a valid ODBC data source name:
turbodbc.connect("db.sqlite")
turbodbc.connect("sqlite:///db.sqlite")
will fail with an error like
turbodbc.exceptions.DatabaseError: ODBC error
state: IM002
native error code: 0
message: [unixODBC][Driver Manager]Data source name not found and no default driver specified
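For it to work at all, turbodbc would need a third-party SQLite ODBC driver (for example Christian Werner's sqliteodbc) registered with the driver manager; turbodbc itself doesn't bundle one. A minimal sketch, assuming such a driver is installed and registered under the name "SQLite3" in odbcinst.ini (the driver name is an assumption and may differ on your system):

import turbodbc

# Assumption: the sqliteodbc driver is registered as "SQLite3" in
# /etc/odbcinst.ini; adjust the name to whatever your system uses.
connection = turbodbc.connect(
    connection_string="Driver=SQLite3;Database=db.sqlite"
)
cursor = connection.cursor()
cursor.execute("SELECT sqlite_version()")
print(cursor.fetchall())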
Related idea, but I think not realised yet: https://github.com/pandas-dev/pandas/issues/17790#issuecomment-334322878
Just to chip in here a little: I recently tested inserting into an SQLite database. Inserting a CSV file of 25k rows (somewhat larger than 3 MiB) into an SQLite DB using odbcsv took over two minutes. Now, odbcsv is an entirely different code base, but it also utilizes bound buffers for bulk insertion. So I don't know how slow your current way of inserting is, but ODBC bulk insertion into SQLite is far from speedy.
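For comparison, turbodbc's own bulk path is cursor.executemanycolumns, which binds whole NumPy arrays as column buffers, so the driver sees one bulk call instead of 25k row-wise inserts; whether that is actually fast against SQLite still depends entirely on the ODBC driver underneath. A rough sketch, assuming a working SQLite DSN named "sqlitedb" (hypothetical):

import numpy as np
import turbodbc

# Assumption: "sqlitedb" is a configured DSN backed by a SQLite ODBC driver.
connection = turbodbc.connect(dsn="sqlitedb")
cursor = connection.cursor()
cursor.execute("CREATE TABLE IF NOT EXISTS t (a INTEGER, b REAL)")

# One bulk call with bound column buffers instead of 25k single-row inserts.
a = np.ma.MaskedArray(np.arange(25_000, dtype=np.int64), mask=False)
b = np.ma.MaskedArray(np.random.rand(25_000), mask=False)
cursor.executemanycolumns("INSERT INTO t VALUES (?, ?)", [a, b])
connection.commit()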
Could you please add some information to the docs on whether turbodbc supports SQLite or not? And is it fast or slow?
There doesn't seem to be a single mention of "sqlite" in the docs currently: https://turbodbc.readthedocs.io/en/latest/search.html?q=sqlite&check_keywords=yes&area=default
My current use case: I have ~10 pandas DataFrames, each ~100 MB in size, and would like to write them, concatenated, to a single SQLite file. Using

sqlalchemy.create_engine("sqlite:///db.sqlite")

and then calling DataFrame.to_sql with if_exists="append" in a loop seems to work, but it's extremely slow. Is turbodbc the right tool for this job, or is something else better suited?
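For what it's worth, one ODBC-free path is to skip SQLAlchemy and pass a plain sqlite3 connection to to_sql, combined with a couple of SQLite pragmas that reduce per-transaction fsyncs. A sketch, assuming the frames fit in memory one at a time; "frames" and the table name "data" are placeholders for your actual objects:

import sqlite3
import pandas as pd

# Assumption: "frames" stands in for your ~10 DataFrames of ~100 MB each.
con = sqlite3.connect("db.sqlite")
con.execute("PRAGMA journal_mode = WAL")  # cheaper writes via write-ahead log
con.execute("PRAGMA synchronous = OFF")   # trade crash-durability for speed

for df in frames:
    # to_sql on a raw sqlite3 connection uses executemany under the hood,
    # bypassing the SQLAlchemy layer entirely.
    df.to_sql("data", con, if_exists="append", index=False, chunksize=100_000)

con.commit()
con.close()

With SQLite, the biggest win usually comes from batching the work into as few transactions as possible, so it is worth timing this against the SQLAlchemy version before reaching for ODBC.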