ClickHouse / clickhouse-java

ClickHouse Java Clients & JDBC Driver
https://clickhouse.com
Apache License 2.0
1.43k stars 526 forks source link

The driver incorrectly parses array with tuple of two strings #781

Closed n5a5 closed 2 years ago

n5a5 commented 2 years ago

Suppose we execute the following query in DBeaver:

SELECT CAST('[(\'a\',\'b\')]' AS Array(Tuple(String, String)))

Instead of an array with one tuple, we get this:

image

I started investigating, and it turned out that this is not a DBeaver's bug but driver's. I think something goes wrong inside ClickHouseArrayUtil.parseArray().

You can reproduce the issue with this snippet:

DataSource dataSource = new ClickHouseDataSource(url);

try (Connection conn = dataSource.getConnection()) {
    try (Statement stmt = conn.createStatement()) {
        try (ResultSet resultSet = stmt.executeQuery("SELECT CAST('[(\\'a\\',\\'b\\')]' AS Array(Tuple(String, String)))")) {
            while (resultSet.next()) {
                Array array = resultSet.getArray(1);
                Object[] items = (Object[]) array.getArray();
                System.out.println(Arrays.asList(items));
            }
        }
    }
}

I used ClickHouse 20.4.4.18 with clickhouse-jdbc 0.2.6. My colleague reproduced the same issue on the same server with the 0.3.1-patch.

zhicwu commented 2 years ago

Thanks for reporting the issue. Before 0.3.2(up to 0.3.1-patch), the driver uses a text-based format for serialization and deserialization. Starting from 0.3.2, it switched to RowBinary by default.

Just tried to driver 0.3.1-patch on DBeaver Version 21.3.1.202112111946, it ended up with below exception. While using 0.3.2 test build, I got 0 [Object](because the value is array of List, which represent a tuple). DataGrips seems working fine as it uses getString to retrieve value...

org.jkiss.dbeaver.model.exec.DBCException: Can't extract array data from JDBC array
    at org.jkiss.dbeaver.model.impl.jdbc.data.JDBCCollection.makeCollectionFromArray(JDBCCollection.java:267)
    at org.jkiss.dbeaver.model.impl.jdbc.data.handlers.JDBCArrayValueHandler.getValueFromObject(JDBCArrayValueHandler.java:64)
    at org.jkiss.dbeaver.model.impl.jdbc.data.handlers.JDBCArrayValueHandler.getValueFromObject(JDBCArrayValueHandler.java:1)
    at org.jkiss.dbeaver.model.impl.jdbc.data.handlers.JDBCComplexValueHandler.fetchColumnValue(JDBCComplexValueHandler.java:50)
    at org.jkiss.dbeaver.model.impl.jdbc.data.handlers.JDBCAbstractValueHandler.fetchValueObject(JDBCAbstractValueHandler.java:49)
    at org.jkiss.dbeaver.ui.controls.resultset.ResultSetDataReceiver.fetchRow(ResultSetDataReceiver.java:125)
    at org.jkiss.dbeaver.ui.editors.sql.execute.SQLQueryJob.fetchQueryData(SQLQueryJob.java:721)
    at org.jkiss.dbeaver.ui.editors.sql.execute.SQLQueryJob.executeStatement(SQLQueryJob.java:544)
    at org.jkiss.dbeaver.ui.editors.sql.execute.SQLQueryJob.lambda$0(SQLQueryJob.java:444)
    at org.jkiss.dbeaver.model.exec.DBExecUtils.tryExecuteRecover(DBExecUtils.java:171)
    at org.jkiss.dbeaver.ui.editors.sql.execute.SQLQueryJob.executeSingleQuery(SQLQueryJob.java:431)
    at org.jkiss.dbeaver.ui.editors.sql.execute.SQLQueryJob.extractData(SQLQueryJob.java:816)
    at org.jkiss.dbeaver.ui.editors.sql.SQLEditor$QueryResultsContainer.readData(SQLEditor.java:3453)
    at org.jkiss.dbeaver.ui.controls.resultset.ResultSetJobDataRead.lambda$0(ResultSetJobDataRead.java:118)
    at org.jkiss.dbeaver.model.exec.DBExecUtils.tryExecuteRecover(DBExecUtils.java:171)
    at org.jkiss.dbeaver.ui.controls.resultset.ResultSetJobDataRead.run(ResultSetJobDataRead.java:116)
    at org.jkiss.dbeaver.ui.controls.resultset.ResultSetViewer$ResultSetDataPumpJob.run(ResultSetViewer.java:4810)
    at org.jkiss.dbeaver.model.runtime.AbstractJob.run(AbstractJob.java:105)
    at org.eclipse.core.internal.jobs.Worker.run(Worker.java:63)
Caused by: org.jkiss.dbeaver.model.exec.DBCException: SQL Error
    at org.jkiss.dbeaver.model.impl.jdbc.data.JDBCCollection.makeCollectionFromArray(JDBCCollection.java:251)
    ... 18 more
Caused by: java.sql.SQLFeatureNotSupportedException
    at ru.yandex.clickhouse.ClickHouseArray.getResultSet(ClickHouseArray.java:65)
    at org.jkiss.dbeaver.model.impl.jdbc.data.JDBCCollection.makeCollectionFromResultSet(JDBCCollection.java:367)
    at org.jkiss.dbeaver.model.impl.jdbc.data.JDBCCollection.makeCollectionFromArray(JDBCCollection.java:249)
    ... 18 more
n5a5 commented 2 years ago

@ShadelessFox Ping

ShadelessFox commented 2 years ago

DataGrips seems working fine as it uses getString to retrieve value...

We don't use getString (without further processing) because we want to display its elements.

In DBeaver, Clickhouse uses a generic driver. At this point, it's probably would be better to make a separate extension to support Clickhouse-specific data. For example, our generic struct handler does not expect a value with the type Types.STRUCT to be java.util.List (in the case of Tuple, I guess Map it would be java.util.Map?).

I'm not sure if we should do it right now - the JDBC driver version is not even a release one.

zhicwu commented 2 years ago

Thanks @ShadelessFox and @akilovich, please refer to my comments below.

We don't use getString (without further processing) because we want to display its elements.

Agreed. And it's actually part of the reasons why I like your product better :) The behavior can be changed by setting connection property typeMappings to something like Tuple=java.lang.String.

In DBeaver, Clickhouse uses a generic driver. At this point, it's probably would be better to make a separate extension to support Clickhouse-specific data.

dbeaver/dbeaver#14255 is a similar but more generic topic for discussing nested type support. I hope the new driver can act as a generic one, which is why fake transaction and standard synchronous UPDATE/DELETE support were added. Perhaps you guys can consider a generic implementation in DBeaver for visualizing complex data structure, not only for ClickHouse but also some other databases may or may not exist today? And hopefully the same mechanism will work for the upcoming new data type mentioned in ClickHouse/ClickHouse#23516 (my take is nested maps).

For example, our generic struct handler does not expect a value with the type Types.STRUCT to be java.util.List (in the case of Tuple, I guess Map it would be java.util.Map?).

Are you expecting Object[] for STRUCT? As of now, the mappings are as follows. And I'm happy to make changes as needed.

ClickHouse Data Type JDBC Data Type Java Data Type Remark
Array ARRAY Primitive or object array
Tuple STRUCT Object[] when wrapperObject is set to true, the driver will use java.sql.Struct to wrap the value, which is Object[]
Nested STRUCT Object[][]
Map OTHER Map

I'm not sure if we should do it right now - the JDBC driver version is not even a release one.

The implementation stays the same if no other suggestion/need for further changes. Let me release 0.3.2 tomorrow :)

zhicwu commented 2 years ago

Along with 0.3.2 release, I added integration test to demonstrate values returned from the driver. DBeaver rendered new List[] { Arrays.asList("a", "b") } as ['[[a, b]]'], which is incorrect.

Let me close the issue here and we can continue discussion in dbeaver/dbeaver#14255.