exasol / virtual-schema-common-jdbc

Common module for JDBC-based access from Virtual Schemas
MIT License
0 stars 1 forks source link

Ensure all connections (esp. JDBC) are closed on failure #151

Closed ckunki closed 11 months ago

ckunki commented 1 year ago

Ticket mentioned as "IntRef" reported that when user cancels an SQL query in Exasol database the resulting Hive query in VSHIVE does not get canceled.

The current ticket therefore requests to inspect and if possible to improve the connection handling in VSCJDBC. See also ticket https://github.com/exasol/hive-virtual-schema/issues/39 for improved connection handling in VSHIVE.

If possible then VSCJDBC should ensure that especially in the case of an exception VSCJDBC closes all connections.

Acceptance criteria

  1. Connection handling in VSCJDBC is analysed and result of analysis is attached or linked to the current ticket
  2. If possible connection handling is improved

See for example ImportIntoTemporaryTableQueryRewriter.java#L75

private String createColumnsDescriptionFromQuery(final String query) throws SQLException {
    final ColumnMetadataReader columnMetadataReader = this.remoteMetadataReader.getColumnMetadataReader();
    final ResultSetMetadataReader resultSetMetadataReader = new ResultSetMetadataReader(
            this.connectionFactory.getConnection(), columnMetadataReader);
    final String columnsDescription = resultSetMetadataReader.describeColumns(query);
    LOGGER.finer(() -> "Import columns: " + columnsDescription);
    return columnsDescription;
}
### Tasks
- [ ] https://github.com/exasol/hive-virtual-schema/issues/41
ckunki commented 1 year ago

In my impression the factory is also used for caching and reusing a connection with the intention to improve performance by not creating a new connection every time. From that closing the connection in case of an exception should rather be ensured on a higher level, e.g. in JDBCAdapter.pushdown().

ckunki commented 1 year ago

Alignment with @kaklakariada:

Shmuma commented 12 months ago

Couple of issues in connection management have found:

  1. JDBC connections are AutoClosable interface, so they will be closed on try-with-resource block. But we're not using this pattern anywhere, so connections will be closed only on garbage collection (because we're not closing them explicitly as well)
  2. RemoteConnectionFactory was created inside the function: https://github.com/exasol/virtual-schema-common-jdbc/blob/main/src/main/java/com/exasol/adapter/jdbc/JDBCAdapter.java#L161, which is not how factories should be created. Because of that, connections are automatically closed after every operation (query or schema refresh) and being reestablished again. It also might lead to connection leak in some cases (if reference is kept after exception, for example)
  3. on exception we're not closing connection explicitly
Shmuma commented 12 months ago

But before fixing all this, hive-virtual-schema has to be updated to the latest libraries version

Shmuma commented 12 months ago

Optimizing connection establishment doesn't have much sense, as we're restarting whole java process on every request (at least jar file is being reloaded from bucketfs)