Open JUNNIEJUN opened 4 months ago
Keep in mind that sessions in Databricks are not like sessions in a typical database. With PostgreSQL, MySQL, or similar, sessions are cheap and long-lived. On Databricks, a session actually reserves compute resources and incurs some cost, so as a cost-saving measure sessions expire after some period of time. This timeout is very long for serverless warehouses and is configurable on non-serverless clusters.
If you hit an Invalid SessionHandle error, your best bet is to catch it, open a new session, and re-execute the query. Since creating a new session is generally a quick operation (around 100ms), it should not have a significant effect on your application's performance.
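As a rough illustration of the catch, reconnect, and retry approach, here is a minimal sketch. `ReconnectingClient` and `SESSION_ERROR_MARKER` are hypothetical names made up for this example, not part of any library. It assumes you pass in a zero-argument factory that returns a DB-API style connection, and it recognises the expired session by the "Invalid SessionHandle" text from the error above rather than by a specific exception class.

```python
from typing import Any, Callable, Optional, Sequence

# Text taken from the error message reported in this issue.
SESSION_ERROR_MARKER = "Invalid SessionHandle"


class ReconnectingClient:
    """Hypothetical helper: keeps one DB-API connection and replaces it
    when the server reports that the session has expired."""

    def __init__(self, connect: Callable[[], Any]) -> None:
        self._connect = connect  # zero-argument factory that opens a new session
        self._conn: Optional[Any] = None

    def _connection(self) -> Any:
        # Open the session lazily, on first use or after a discard.
        if self._conn is None:
            self._conn = self._connect()
        return self._conn

    def _discard(self) -> None:
        conn, self._conn = self._conn, None
        if conn is not None:
            try:
                conn.close()
            except Exception:
                pass  # the session is already gone; nothing left to clean up

    def query(self, sql: str) -> Sequence[Any]:
        for attempt in range(2):  # first try, plus one retry on a fresh session
            try:
                cursor = self._connection().cursor()
                cursor.execute(sql)
                rows = cursor.fetchall()
                cursor.close()
                return rows
            except Exception as exc:
                # Re-raise anything that is not an expired session,
                # and give up if the retry on a new session also failed.
                if SESSION_ERROR_MARKER not in str(exc) or attempt == 1:
                    raise
                self._discard()
        raise AssertionError("unreachable")
```

With `databricks-sql-connector`, the factory could be something like `lambda: databricks.sql.connect(server_hostname=..., http_path=..., access_token=...)`, with your own values filled in. Note that in the setup described in this issue the connection is held by SQLAlchemy inside LangChain, so this wrapper would not plug in directly; it only shows the shape of the retry.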
The alternative is to periodically run a simple query with the session to keep it open. But keep in mind that doing this will continue to incur a cost compared to the scenario where the cluster auto-closes sessions.
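A keep-alive along these lines could be sketched as below. `keep_alive` is a hypothetical helper, and `ping` stands for whatever function runs your trivial query (for example `SELECT 1`) on the session you want to keep open; neither comes from the connector itself. The right interval depends on your warehouse's timeout, which this sketch does not know.

```python
import threading
from typing import Callable


def keep_alive(ping: Callable[[], None], interval_seconds: float,
               stop: threading.Event) -> int:
    """Call `ping` every `interval_seconds` until `stop` is set.

    Returns the number of pings that were sent.
    """
    sent = 0
    # Event.wait() returns True as soon as `stop` is set, which ends the loop;
    # otherwise it returns False after the timeout and we send one ping.
    while not stop.wait(interval_seconds):
        ping()
        sent += 1
    return sent
```

In an application this would typically run in a daemon thread, e.g. `threading.Thread(target=keep_alive, args=(ping, 300, stop), daemon=True).start()`, with `stop.set()` called on shutdown. Whether a connection can safely be shared between the keep-alive thread and request handling is something to check for your driver; if in doubt, guard the connection with a lock.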
I am building a SQL chatbot using LangChain's SQLDatabaseChain and GPT-4. I first created the model in a Databricks notebook like this:
With this code, I log the model to MLflow, and then I create a serving endpoint from the Unity Catalog model.
After that, I create a frontend application using Streamlit.
About 15 minutes after the serving endpoint is created, every new question I send to the chatbot returns an Invalid SessionHandle error:
{"error_code": "BAD_REQUEST", "message": "1 tasks failed. Errors: {0: 'error: DatabaseError(\\'(databricks.sql.exc.DatabaseError) Invalid SessionHandle: SessionHandle [ee17e137-26ea-4677-a7c3-69e81be048bc]\\') Traceback (most recent call last):\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/sqlalchemy/engine/base.py\", line 1910, in _execute_context\\n self.dialect.do_execute(\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/sqlalchemy/engine/default.py\", line 736, in do_execute\\n cursor.execute(statement, parameters)\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/databricks/sql/client.py\", line 503, in execute\\n execute_response = self.thrift_backend.execute_command(\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/databricks/sql/thrift_backend.py\", line 843, in execute_command\\n resp = self.make_request(self._client.ExecuteStatement, req)\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/databricks/sql/thrift_backend.py\", line 479, in make_request\\n ThriftBackend._check_response_for_error(response)\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/databricks/sql/thrift_backend.py\", line 259, in _check_response_for_error\\n raise DatabaseError(response.status.errorMessage)\\ndatabricks.sql.exc.DatabaseError: Invalid SessionHandle: SessionHandle [ee17e137-26ea-4677-a7c3-69e81be048bc]\\n\\nThe above exception was the direct cause of the following exception:\\n\\nTraceback (most recent call last):\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/mlflow/langchain/api_request_parallel_processor.py\", line 319, in call_api\\n response = self.single_call_api(callback_handlers)\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/mlflow/langchain/api_request_parallel_processor.py\", line 293, in single_call_api\\n response = self.lc_model(\\n File 
\"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/langchain_core/_api/deprecation.py\", line 148, in warning_emitting_wrapper\\n return wrapped(*args, **kwargs)\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/langchain/chains/base.py\", line 383, in __call__\\n return self.invoke(\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/langchain/chains/base.py\", line 166, in invoke\\n raise e\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/langchain/chains/base.py\", line 156, in invoke\\n self._call(inputs, run_manager=run_manager)\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/langchain_experimental/sql/base.py\", line 119, in _call\\n table_info = self.database.get_table_info(table_names=table_names_to_use)\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/langchain_community/utilities/sql_database.py\", line 352, in get_table_info\\n table_info += f\"\\\\n{self._get_sample_rows(table)}\\\\n\"\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/langchain_community/utilities/sql_database.py\", line 375, in _get_sample_rows\\n sample_rows_result = connection.execute(command) # type: ignore\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/sqlalchemy/engine/base.py\", line 1385, in execute\\n return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/sqlalchemy/sql/elements.py\", line 334, in _execute_on_connection\\n return connection._execute_clauseelement(\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/sqlalchemy/engine/base.py\", line 1577, in _execute_clauseelement\\n ret = self._execute_context(\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/sqlalchemy/engine/base.py\", line 1953, in _execute_context\\n self._handle_dbapi_exception(\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/sqlalchemy/engine/base.py\", line 2134, 
in _handle_dbapi_exception\\n util.raise_(\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/sqlalchemy/util/compat.py\", line 211, in raise_\\n raise exception\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/sqlalchemy/engine/base.py\", line 1910, in _execute_context\\n self.dialect.do_execute(\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/sqlalchemy/engine/default.py\", line 736, in do_execute\\n cursor.execute(statement, parameters)\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/databricks/sql/client.py\", line 503, in execute\\n execute_response = self.thrift_backend.execute_command(\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/databricks/sql/thrift_backend.py\", line 843, in execute_command\\n resp = self.make_request(self._client.ExecuteStatement, req)\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/databricks/sql/thrift_backend.py\", line 479, in make_request\\n ThriftBackend._check_response_for_error(response)\\n File \"/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/databricks/sql/thrift_backend.py\", line 259, in _check_response_for_error\\n raise DatabaseError(response.status.errorMessage)\\nsqlalchemy.exc.DatabaseError: (databricks.sql.exc.DatabaseError) Invalid SessionHandle: SessionHandle [ee17e137-26ea-4677-a7c3-69e81be048bc]\\
Cluster information: 14.2 (includes Apache Spark 3.5.0, Scala 2.12)
I think this happens because the session that was created has expired, but I don't know how to open a new one without re-creating the serving endpoint. The goal is for the chatbot to be usable immediately whenever we launch it, but with the risk of a session timeout I can't make it work reliably. I haven't found any solution that addresses this issue. Can anyone help?