ClickHouse / clickhouse-java

ClickHouse Java Clients & JDBC Driver
https://clickhouse.com
Apache License 2.0

[client-v2] Missing full exception causes for CREATE TABLE failures #1906

Closed alxhill closed 1 week ago

alxhill commented 1 week ago

We create tables from user-provided CSVs, which may sometimes be invalid (e.g. duplicate columns, missing values for some columns), and we catch and parse exceptions from ClickHouse to show a message to the user.

V1 Client Behaviour

Previously we got error messages depending on the issue with the underlying file:

Duplicate Column

The table structure cannot be extracted from a CSVWithNames format file. Error:
Code: 117. DB::Exception: Duplicate column name found while schema inference: "a". (INCORRECT_DATA) (version 24.7.3.42 (official build)).

Empty File

Code: 636. DB::Exception: The table structure cannot be extracted from a CSVWithNames format file. Error:
The table structure cannot be extracted from a CSVWithNames format file: the file is empty.
You can specify the structure manually. (CANNOT_EXTRACT_TABLE_STRUCTURE) (version 24.7.3.42 (official build))

Rows have different numbers of values

Code: 636. DB::Exception: The table structure cannot be extracted from a CSVWithNames format file. Error:
Code: 117. DB::Exception: Rows have different amount of values. (INCORRECT_DATA) (version 24.7.3.42 (official build)).
You can specify the structure manually: (in file/uri test_inputs/test_row_with_missing_columns.csv). (CANNOT_EXTRACT_TABLE_STRUCTURE) (version 24.7.3.42 (official build))

V2 Client Behaviour

It now returns only the first part of the error, so we can't figure out the root cause from the Exception. Here's a logged stack trace for an empty file:

com.clickhouse.client.api.ServerException: Code: 636. DB::Exception: The table structure cannot be extracted from a CSVWithNames format file. Error:
    at com.clickhouse.client.api.internal.HttpAPIClientHelper.readError(HttpAPIClientHelper.java:311) ~[client-v2-0.7.1.jar:client-v2 0.7.1 (revision: 3667f7d)]
    at com.clickhouse.client.api.internal.HttpAPIClientHelper.executeRequest(HttpAPIClientHelper.java:357) ~[client-v2-0.7.1.jar:client-v2 0.7.1 (revision: 3667f7d)]
    at com.clickhouse.client.api.Client.lambda$query$11(Client.java:1568) ~[client-v2-0.7.1.jar:client-v2 0.7.1 (revision: 3667f7d)]
    at com.clickhouse.client.api.Client.runAsyncOperation(Client.java:1942) ~[client-v2-0.7.1.jar:client-v2 0.7.1 (revision: 3667f7d)]
    at com.clickhouse.client.api.Client.query(Client.java:1644) ~[client-v2-0.7.1.jar:client-v2 0.7.1 (revision: 3667f7d)]
...

And here's one for rows with different numbers of values:

com.clickhouse.client.api.ServerException: Code: 636. DB::Exception: The table structure cannot be extracted from a CSVWithNames format file. Error:
    at com.clickhouse.client.api.internal.HttpAPIClientHelper.readError(HttpAPIClientHelper.java:311) ~[client-v2-0.7.1.jar:client-v2 0.7.1 (revision: 3667f7d)]
    at com.clickhouse.client.api.internal.HttpAPIClientHelper.executeRequest(HttpAPIClientHelper.java:357) ~[client-v2-0.7.1.jar:client-v2 0.7.1 (revision: 3667f7d)]
    at com.clickhouse.client.api.Client.lambda$query$11(Client.java:1568) ~[client-v2-0.7.1.jar:client-v2 0.7.1 (revision: 3667f7d)]
    at com.clickhouse.client.api.Client.runAsyncOperation(Client.java:1942) ~[client-v2-0.7.1.jar:client-v2 0.7.1 (revision: 3667f7d)]
    at com.clickhouse.client.api.Client.query(Client.java:1644) ~[client-v2-0.7.1.jar:client-v2 0.7.1 (revision: 3667f7d)]
...

As you can see, we can no longer distinguish between the two errors.
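For context, here is a rough sketch of the kind of message parsing we do on our side. The class and method names (ServerErrorClassifier, errorCodes, classify) and the category labels are hypothetical and not part of clickhouse-java; the matched substrings are taken from the V1 messages quoted above. It shows why the truncated V2 message is a problem: once the nested text is gone, there is nothing left to match on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of clickhouse-java. */
final class ServerErrorClassifier {

    // Matches each "Code: <n>. DB::Exception:" prefix found in a server message.
    private static final Pattern CODE = Pattern.compile("Code: (\\d+)\\. DB::Exception:");

    private ServerErrorClassifier() {}

    /** Returns every error code found in the message, outermost first. */
    static List<Integer> errorCodes(String message) {
        List<Integer> codes = new ArrayList<>();
        Matcher m = CODE.matcher(message);
        while (m.find()) {
            codes.add(Integer.parseInt(m.group(1)));
        }
        return codes;
    }

    /** Maps a server message to a user-facing category (labels are our own). */
    static String classify(String message) {
        if (message.contains("Duplicate column name")) {
            return "DUPLICATE_COLUMN";
        }
        if (message.contains("the file is empty")) {
            return "EMPTY_FILE";
        }
        if (message.contains("Rows have different amount of values")) {
            return "RAGGED_ROWS";
        }
        // Nothing recognisable: this is where the truncated V2 message ends up.
        return "UNKNOWN";
    }
}
```

With the full V1 text, the "rows have different numbers of values" case yields both the outer code 636 and the nested code 117 and is classified correctly. With the V2 text, only 636 is available and both failures fall through to UNKNOWN.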

chernser commented 1 week ago

@alxhill thank you! I probably know where the issue is...

chernser commented 1 week ago

@alxhill the issue is in how the client reads the response from the server in different formats.
As a workaround, I suggest running the query with the format com.clickhouse.data.ClickHouseFormat#JSONEachRow. In this case the error message is parsed completely.

alxhill commented 1 week ago

If I add FORMAT JSONEachRow to the end of the query itself, the stack traces are still missing additional context but do have the message shown above. If I use settings.setFormat(ClickHouseFormat.JSONEachRow), I get the following stack trace:

com.clickhouse.client.api.ServerException: Code: 636. DB::Exception: <Unreadable error message>
    at com.clickhouse.client.api.internal.HttpAPIClientHelper.readError(HttpAPIClientHelper.java:318) ~[client-v2-0.7.1.jar:client-v2 0.7.1 (revision: 3667f7d)]
    at com.clickhouse.client.api.internal.HttpAPIClientHelper.executeRequest(HttpAPIClientHelper.java:357) ~[client-v2-0.7.1.jar:client-v2 0.7.1 (revision: 3667f7d)]
    at com.clickhouse.client.api.Client.lambda$query$11(Client.java:1568) ~[client-v2-0.7.1.jar:client-v2 0.7.1 (revision: 3667f7d)]
    at com.clickhouse.client.api.Client.runAsyncOperation(Client.java:1942) ~[client-v2-0.7.1.jar:client-v2 0.7.1 (revision: 3667f7d)]
    at com.clickhouse.client.api.Client.query(Client.java:1644) ~[client-v2-0.7.1.jar:client-v2 0.7.1 (revision: 3667f7d)]

(that's unedited, the message is literally "<Unreadable error message>")

chernser commented 1 week ago

@alxhill a new version with the fix has been released (0.7.1-patch). It should soon be in Maven Central.