Open findepi opened 2 years ago
I think JDBC requires Connection
to be thread-safe, but couldn't find this in writing.
java.sql.Connection#abort
mentons threads but that's quite specific method
The "JDBC™ 4.1 Specification" says
Each Connection object can create multiple Statement objects that may be used concurrently by the program. This is demonstrated in CODE EXAMPLE 13-2.
however it's followed by a code example that has no threads.
Internet wisdom (https://stackoverflow.com/questions/1531073/is-java-sql-connection-thread-safe#comment1386784_1531103) suggests some other databases have problems with thread-safety in JDBC.
TrinoConnection
makes heavy use of atomic, trying to provide the thread-safety.
cc @aalbu @leniartek
I don't think the word "concurrently" in the specification there means "at the same time by multiple threads", since the specification doesn't mention threads anywhere else. Other drives like Oracle will synchronize access to the connection.
ince the specification doesn't mention threads anywhere else.
this i noticed, hence comment above.
Other drives like Oracle will synchronize access to the connection.
is this in support of what actually?
that the Connection should be thread-safe?
Current implementation of Connection looks thread-safe, and this dates back to the first commit introducing the class. It could be interpreted as if it was OK to share Connection between threads, but that leads to the problem described here. If the Connection is not supposed to be thread-safe, can we remove would-be-unnecessary use of AtomicReferences and ConcurrentHashMaps in the TrinoConnection?
It seems get the same case in python client.
File "/tmp/dolphinscheduler/exec/process/82/1442/6451842/18620689/py_1442_6451842_18620689.command", line 70, in Conca_X
X=cur.fetchall()
File "/opt/applications/conda/lib/python3.9/site-packages/trino/dbapi.py", line 558, in fetchall
return list(self.genall())
File "/opt/applications/conda/lib/python3.9/site-packages/trino/client.py", line 509, in __iter__
rows = self._query.fetch()
File "/opt/applications/conda/lib/python3.9/site-packages/trino/client.py", line 677, in fetch
status = self._request.process(response)
File "/opt/applications/conda/lib/python3.9/site-packages/trino/client.py", line 440, in process
raise self._process_error(response["error"], response.get("id"))
trino.exceptions.TrinoQueryError: TrinoQueryError(type=INTERNAL_ERROR, name=GENERIC_INTERNAL_ERROR, message="Query already begun: Optional[20220712_234817_33615_a6hf9] while starting query 20220712_234819_33617_a6hf9", query_id=20220712_234819_33617_a6hf9)
@ElapsedSoul since this is client-specific problem, please file a separate bug report at https://github.com/trinodb/trino-python-client
I get the same error using Java with the JDBC Driver version 402. It happens fairly consistently when executing the follow code:
import java.sql.*;
public class main {
public static void main(String[] args) throws SQLException {
// loop to make it easier to see the error
for (int i = 0; i < 10; i++) {
System.out.println("attempts #" + i);
connectToTrinoHiveDocker();
}
}
private static void connectToTrinoHiveDocker() throws SQLException {
// Running Trino locally with hive as described here: https://github.com/bitsondatadev/trino-getting-started/tree/main/hive/trino-minio
String url = "jdbc:trino://localhost:8443/minio?SSL=true&SSLVerification=NONE";
Connection conn = DriverManager.getConnection(url, "testuser", "testuser");
// autoCommit=True works, but false (which i need) causes failure
conn.setAutoCommit(false);
// This seems to execute a query that causes the follow preparedstmt to error with "query already begun"
ResultSet result = conn.getMetaData().getSchemas();
PreparedStatement stmt = conn.prepareStatement("select * from minio.tiny.customer limit 2");
ResultSet rs = stmt.executeQuery();
conn.close();
}
}
Are we not expected to use autoCommit=false
and executing several select statements? The same code works when connecting to postgres, mysql and oracle (not using trino, since the Hive connector on trino is the only that allows autoCommit=false)
@toreluntang can you check whether this is still reproducible when you wrap ResultSet result = conn.getMetaData().getSchemas()
with try-with-resources?
Sorry if I misunderstood, feel free to correct me but: If i alter the example above and use this instead:
try (ResultSet result = conn.getMetaData().getSchemas()) {
System.out.println(result);
} catch (Exception e) {
e.printStackTrace();
}
Another error shows up when i create the prepared statement for select * from ...
Exception in thread "main" java.sql.SQLException: Query failed (#20221113_071435_00199_tks2h): Current transaction is aborted, commands ignored until end of transaction block
Current transaction is aborted, commands ignored until end of transaction block
to fix this one, you'd need to consume the result set fully
try (ResultSet result = conn.getMetaData().getSchemas()) {
while (result.next()) { }
}
not sure whether this should be this way though ...
steps to reproduce 1
TrinoConnection
)TrinoStatement
)steps to reproduce 2
https://github.com/trinodb/trino/issues/10788#issuecomment-1311544594 (current version, as of 2022-11-14 15:06 CET)
observed
for example: https://github.com/trinodb/trino/blob/8e9da982f95a531b52a72c9cda39d96b00d126ce/plugin/trino-hive/src/main/java/io/trino/plugin/hive/metastore/SemiTransactionalHiveMetastore.java#L1267-L1271
notes
This is because
io.trino.jdbc.TrinoConnection#transactionId
is a shared mutable state. It's populated for non-SELECT queries here https://github.com/trinodb/trino/blob/a40038879399a6c4aedfb97fe9e5cd2bf1d05e21/client/trino-jdbc/src/main/java/io/trino/jdbc/TrinoStatement.java#L275Thus, the second statement, may see the implicit, autocommit transaction created for the first statement.