Open jaames-bentley opened 3 months ago
Feeling like a bit of a numpty - after a day of debugging I found the issue is also solved by using a different Python version.
Version originally used: 3.12.4
Version which fixes the issue: 3.11.9
I'll leave this issue up as the docs do say all versions of Python 3.8+ are supported, but feel free to close if no fix / update is planned.
Hi @jaames-bentley! Thank you for reporting this issue (and for partially figuring it out while I wasn't around)!
Why does this delay in engine creation only occur in a FastAPI app (is it something to do with it being asynchronous?)
TBH, I don't know. Many things work differently in FastAPI/uvicorn compared to a WSGI server or CLI apps, and I'm not deeply familiar with all the differences. It may be some kind of cache miss (Python should cache .pyc files after the first import and then reuse them, but if your app spawns background workers, they may conflict at this step). It may be something caused by the async features of Python itself, or by uvloop, or even both.
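
One way to test the cache-miss guess is to check whether a compiled .pyc file exists for the module in question. The sketch below is only an illustration: the helper name bytecode_cache_status is made up, and "json" is a stand-in module, not the one from this issue.

```python
import importlib.util
import os


def bytecode_cache_status(module_name):
    """Return (source_path, pyc_path, pyc_exists) for an importable module."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or not spec.origin or not spec.origin.endswith(".py"):
        raise ValueError(f"{module_name!r} has no Python source file")
    # cache_from_source maps a .py path to where its .pyc would be stored.
    pyc_path = importlib.util.cache_from_source(spec.origin)
    return spec.origin, pyc_path, os.path.exists(pyc_path)


if __name__ == "__main__":
    # "json" is a stand-in; replace it with the module under investigation.
    source, pyc, exists = bytecode_cache_status("json")
    print(f"source:  {source}")
    print(f"cached:  {pyc}")
    print(f"present: {exists}")
```

If the .pyc is present in both environments, a bytecode cache miss is probably not the cause.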
It would be perfect if you could create an example that reliably reproduces the issue (e.g. a Dockerfile). Or, at least, please share more details about your OS and Python version, other Python packages installed, and how exactly you run your app (maybe there are some command line switches passed to the Python interpreter which change its behavior), for both of your environments.
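
A minimal sketch of how those details could be collected in both environments. The function name environment_report is hypothetical, and the package list is a guess based on the names mentioned in this thread (sqlalchemy is assumed only because create_engine is mentioned).

```python
import platform
import sys
from importlib import metadata


def environment_report(packages):
    """Collect interpreter, OS, and package details for a bug report."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return {
        "python": sys.version,
        "implementation": platform.python_implementation(),
        "os": platform.platform(),
        "argv": sys.argv,
        # Interpreter switches that can change import behaviour.
        "dev_mode": sys.flags.dev_mode,
        "optimize": sys.flags.optimize,
        "dont_write_bytecode": sys.flags.dont_write_bytecode,
        # True when a debugger or tracer has installed a trace function.
        "trace_function_set": sys.gettrace() is not None,
        "packages": versions,
    }


if __name__ == "__main__":
    # Package names are assumptions taken from this thread; adjust as needed.
    names = ["databricks-sql-connector", "fastapi", "uvicorn", "uvloop",
             "thrift", "sqlalchemy"]
    for key, value in environment_report(names).items():
        print(f"{key}: {value}")
```

Running it once from inside the FastAPI app and once from the standalone script gives two reports that can be compared.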
Can ttypes.py be safely edited to remove all "None" rows or is this not a viable solution?
No, generally this file shouldn't be edited. Most of those None values are used in some way by the thrift package, and thus should be available. There are probably some structures we don't use in our library, but it's hardly possible to figure that out.
I'm a bit late to the party here, but this is exactly the behaviour reported here: https://github.com/databricks/databricks-sql-python/issues/369#issuecomment-2000352199 and it does appear to be an issue with certain other libraries that have not been updated for Python 3.12.
Hello,
I am currently working on a FastAPI application which calls Databricks using databricks-sql-connector, however the code appears to slow down massively at the engine creation stage.
I've narrowed the issue down to .venv\Lib\site-packages\databricks\sql\thrift_api\TCLIService\ttypes.py, where, when running in debug mode, the script is loaded incredibly slowly (around 5-10 minutes). However, when the exact same code is run outside of a FastAPI function (i.e. in a standalone script, notebook, etc.) it runs almost instantly and the engine is created in under 1 second.
The details of my test application are:
main.py:
test_route.py:
requirements.txt:
The same issue occurs when using:
but given it all goes to create_engine under the hood I was trying to simplify my test case.
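
To compare the two contexts directly, the import of the module can be timed on its own. This is a rough sketch: time_import is a hypothetical helper, and the dotted module name in the comment is inferred from the file path in this post.

```python
import importlib
import sys
import time


def time_import(module_name):
    """Import a module and return (elapsed_seconds, was_already_loaded)."""
    # A module already in sys.modules returns almost instantly, so the
    # timing is only meaningful for a first import in a fresh process.
    already_loaded = module_name in sys.modules
    start = time.perf_counter()
    importlib.import_module(module_name)
    elapsed = time.perf_counter() - start
    return elapsed, already_loaded


if __name__ == "__main__":
    # "json" is a stand-in. The module from this issue would presumably be
    # "databricks.sql.thrift_api.TCLIService.ttypes" (inferred from the path).
    elapsed, cached = time_import("json")
    print(f"import took {elapsed:.3f}s (already loaded: {cached})")
```

Running the interpreter with -X importtime prints a per-module breakdown of import times and may be a quicker way to get the same picture.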
As mentioned, the call stack points to .venv\Lib\site-packages\databricks\sql\thrift_api\TCLIService\ttypes.py taking all of this extra time. This can be fixed by editing ttypes.py directly and removing all rows which begin with None, #. However, this directly violates the warning given at the top of ttypes.py about not editing. Removing these rows reduces the file from ~100,000 lines to ~10,000, and completely solves the time delay issue when using FastAPI.
So my main questions are:
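Why does this delay in engine creation only occur in a FastAPI app (is it something to do with it being asynchronous?)
Can ttypes.py be safely edited to remove all "None" rows or is this not a viable solution?
Thanks!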
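
For anyone wanting to gauge how many such rows a file contains without modifying it, here is a read-only sketch. The helper count_none_rows is hypothetical, and the pattern allows variable whitespace around the comma and the #, which is an assumption about how the generated file is formatted.

```python
import re

# Matches lines whose first non-blank text is "None," followed by a comment.
NONE_ROW = re.compile(r"^\s*None,\s*#")


def count_none_rows(lines):
    """Return (matching_lines, total_lines) for an iterable of text lines."""
    matching = 0
    total = 0
    for line in lines:
        total += 1
        if NONE_ROW.match(line):
            matching += 1
    return matching, total


if __name__ == "__main__":
    # Made-up sample lines, not taken from the real ttypes.py.
    sample = [
        "    None,  # 0\n",
        "    (1, 'x'),  # 1\n",
        "None, # 2\n",
        "x = None\n",
    ]
    print(count_none_rows(sample))
```

To run it against a real file, pass an open file object, e.g. count_none_rows(open(path, encoding="utf-8")), where path points at the local copy of ttypes.py. The sketch only counts lines; it does not edit anything.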