fhsgoncalves / metabase-sparksql-databricks-driver

GNU Affero General Public License v3.0
9 stars 2 forks source link

Metabase Spark SQL connection to Databricks throughs Invalid SessionHandle Error #2

Open chandanmanjunath opened 4 years ago

chandanmanjunath commented 4 years ago

Hi All,

We have established a connection from Metabase to Databricks through (sprak sql databricks connector).Now we could able to see the database tables and query the data from metabase UI.

But what we see is ,in between there is below error that frequently pops up(Query used: *select from table_name limit 100**):

[Simba]SparkJDBCDriver ERROR processing query/statement. Error Code: 0, SQL state: TStatus(statusCode:ERROR_STATUS, infoMessages:[*org.apache.hive.service.cli.HiveSQLException:Invalid SessionHandle: SessionHandle [6baeb84f-c991-4c02-9729-69b1bfb2c613]:37:36, org.apache.hive.service.cli.session.SessionManager:getSession:SessionManager.java:287, org.apache.hive.service.cli.CLIService:executeStatementAsync:CLIService.java:274, org.apache.hive.service.cli.thrift.ThriftCLIService:ExecuteStatement:ThriftCLIService.java:436, org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1313, org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1298, org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39, org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39, org.apache.thrift.server.TServlet:doPost:TServlet.java:83, org.apache.hive.service.cli.thrift.ThriftHttpServlet:doPost:ThriftHttpServlet.java:184, javax.servlet.http.HttpServlet:service:HttpServlet.java:707, javax.servlet.http.HttpServlet:service:HttpServlet.java:790, org.eclipse.jetty.servlet.ServletHolder:handle:ServletHolder.java:848, org.eclipse.jetty.servlet.ServletHandler:doHandle:ServletHandler.java:584, org.eclipse.jetty.server.handler.ScopedHandler:handle:ScopedHandler.java:143, org.eclipse.jetty.security.SecurityHandler:handle:SecurityHandler.java:548, org.eclipse.jetty.server.session.SessionHandler:doHandle:SessionHandler.java:226, org.eclipse.jetty.server.handler.ContextHandler:doHandle:ContextHandler.java:1180, org.eclipse.jetty.servlet.ServletHandler:doScope:ServletHandler.java:512, org.eclipse.jetty.server.session.SessionHandler:doScope:SessionHandler.java:185, org.eclipse.jetty.server.handler.ContextHandler:doScope:ContextHandler.java:1112, org.eclipse.jetty.server.handler.ScopedHandler:handle:ScopedHandler.java:141, org.eclipse.jetty.server.handler.HandlerWrapper:handle:HandlerWrapper.java:134, org.eclipse.jetty.server.Server:handle:Server.java:534, org.eclipse.jetty.server.

Once we restart metabase its resolved but error pops up again in between data queries.Please let us know how we can resolve this.

relferreira commented 4 years ago

I'm with precisely the same problem. Did you find a solution @chandanmanjunath?

fhsgoncalves commented 4 years ago

Hey guys, I'm not using databricks + metabase anymore, so I'm not able to take a look on this. I'd try to update the metabase, and rebuild this driver to be used with the new version.

But this may not be enough, if anyone is able to figure it out, please, share it here, and feel free to submit a PR or fork the project :smile:

relferreira commented 4 years ago

Hello, @fhsgoncalves I've created a PR in the ifood repository:

https://github.com/ifood/metabase-sparksql-databricks-driver/pull/2

Should I open to this repo too?

fhsgoncalves commented 4 years ago

@relferreira thank you for working on that!

And yes, it would be good if you open the PR here, because I'm not working at iFood anymore :)

KJGangarsha commented 3 years ago

Hello @relferreira , @chandanmanjunath I'm facing the same issue. If I leave my application idle for few minutes, and then try to call the connection it is failing, getting the error TStatus(statusCode:ERROR_STATUS, errorMessage:INVALID_STATE: Invalid SessionHandle: SessionHandle [1b113b80-a79b-49c7-99c8-03e300beacfa]. Once it is restarted it is working fine. I don't want to restart it all the time. How do I keep the connection alive and resolve this error?

susodapop commented 2 years ago

I ran into this as well while making a CLI tool that queries Databricks SQL. It uses the databricks-sql-connector (pip install databricks-sql-connector) for connections. I found that after ten minutes of inactivity, the connection would expire at the server. After this, it was impossible to even call the .close() method on the connection. Everything would raise a RequestError (since I'm interacting over HTTP). It was actually the text of the RequestError that mentioned the InvalidState verbiage.

There were two ways to fix this:

  1. Implement some keep-alive logic.
  2. Catch the RequestError error and reconnect when needed

I went for the second choice because keep-alive's make load-management more difficult for administrators. In my querying logic, I reimplemented my close_connection method like this:

    def close_connection(self):
        """Close any open connection and remove the `conn` attribute"""

        if not hasattr(self, "conn"):
            return

        try:
            self.conn.close()
        except AttributeError as e:
            # This happens when self.conn is None. None does not have an attribute named 'close'
            logger.debug("There is no active connection to close.")
        except RequestError as e:
            message = "The connection is no longer active and will be recycled. It was probably was timed-out by SQL gateway"
            # This notifies the user that the connection was timed-out
            click.echo(message)
            logger.debug(f"{message}: {e}")
        finally:
            delattr(self, "conn")