ClickHouse / clickhouse-java

ClickHouse Java Clients & JDBC Driver
https://clickhouse.com
Apache License 2.0
1.44k stars 529 forks source link

java.lang.IllegalArgumentException: boolean(67) should between 0 and 1 inclusive of both values #1143

Open fabiencelier opened 1 year ago

fabiencelier commented 1 year ago

Hello,

We have a issue similar to https://github.com/ClickHouse/clickhouse-jdbc/issues/902 with the latest version of ClickHouse client 0.3.2-patch11:

Caused by: java.lang.IllegalArgumentException: boolean(67) should between 0 and 1 inclusive of both values
        at com.clickhouse.client.ClickHouseChecker.newException(ClickHouseChecker.java:19)
        at com.clickhouse.client.ClickHouseChecker.between(ClickHouseChecker.java:90)
        at com.clickhouse.client.data.BinaryStreamUtils.readBoolean(BinaryStreamUtils.java:280)
        at com.clickhouse.client.data.BinaryStreamUtils.readNull(BinaryStreamUtils.java:562)
        at com.clickhouse.client.data.ClickHouseRowBinaryProcessor$MappedFunctions.deserialize(ClickHouseRowBinaryProcessor.java:455)
        at com.clickhouse.client.data.ClickHouseRowBinaryProcessor.readAndFill(ClickHouseRowBinaryProcessor.java:496)
        at com.clickhouse.client.ClickHouseDataProcessor$1.next(ClickHouseDataProcessor.java:122)
        at com.clickhouse.client.ClickHouseDataProcessor$1.next(ClickHouseDataProcessor.java:108)
...

This happens when reading a large query result, unfortunately we can't reproduce with a small dataset example (when splitting the query in 2 parts, both parts can be read successfully).

By debugging I found that it can happen on different fields, not one in particular. For instance one of the field which causes this issue is of type Nullable String which contains only 5 dictinct values, all non null.

Is this a known issue ? Is there anything we can do to avoid this bug or any information we can give you to help fix it ?

zhicwu commented 1 year ago

Hi @fabiencelier, sorry for the inconvenience at your end. It looks like deserialization error. Would you mind to share the table structure? You may also try nightly build to see if the issue has been addressed or not.

fabiencelier commented 1 year ago

The query looks like this:

SELECT
  T1.field0 AS `field_0`,
  T1.field1 AS `field_0`,
  ...
  T21.field_n AS `field_146`, -- we have 146 fields
  SUM(Tk.field_1) as `SUM_1`, 
  ...
  SUM(Tk.field_5) as `SUM_5`, -- we have 5 sums
  COUNT(*) as `count`,
FROM
  `my_database`.`table-1` AS T1
  LEFT OUTER JOIN `my_database`.`table_2` AS T2 ON (T1.key_field = T2.key_field)
  ... -- we have 21 joined tables
GROUP BY
  T1.field0
  T1.field1
  ... -- all the fields in the select

This is a big query but I don't reproduce the issue on smaller queries so far.

I have tried to split it into 2 queries with something like WHERE field1 = a in the first part and WHERE field1 != a in the other part but then the query does not fail.

I don't think I can easily share the structure of the 20 tables / 140 fields

zhicwu commented 1 year ago

To share the table structure you can use below query and share structure of the temp_table.

drop table if exists temp_table;
create table temp_table engine=Memory as
select * from ( <your query> ) where 0;

Apart from that, could you use nightly build and see if your query works or not? Apart from deserialization issue on client side(e.g. JDBC driver issue?), it may also relate to server(e.g. server killed the query exauhsting resource limit, like #976).

Daesgar commented 1 year ago

I'm finding a similar deserialization issue, that may be related to this. I did not find more info about it, so sorry guys if this has been discussed before.

The issue arises when dealing with Nested / Tuple types in a query. For example, I have the following table definition:

set flatten_nested = 0; -- This is needed for my use case
create table issue (
    a_tuple Tuple(
        field_a Nullable(String),
        field_b Nullable(String)
    ),
    a_string String,
    a_nested Nested(
        field_a Nullable(String),
        field_b Nullable(String)
    )
) engine = MergeTree() order by tuple();

Then, I proceed to insert a single row:

insert into issue
values
(
    ('a_value', 'another_value'),
    'another_one',
    [
        ('a_value', 'another_value')
    ]
);

Now, if I proceed to issue this statement:

select a_tuple, a_string, a_nested from issue;

The query returns appropriately. However, if I change the order of the elements:

select a_nested, a_tuple, a_string from issue; or select a_nested, a_string, a_tuple from issue; or select a_nested, a_tuple from issue;

Then the following exception is thrown:

...
Caused by: java.sql.SQLException: java.io.IOException: Failed to read boolean value, expect 0 (false) or 1 (true) but we got: 13
    at com.clickhouse.jdbc.SqlExceptionUtils.create(SqlExceptionUtils.java:45)
    at com.clickhouse.jdbc.SqlExceptionUtils.handle(SqlExceptionUtils.java:90)
    at com.clickhouse.jdbc.ClickHouseResultSet.next(ClickHouseResultSet.java:716)
    at org.jkiss.dbeaver.model.impl.jdbc.exec.JDBCResultSetImpl.next(JDBCResultSetImpl.java:272)
    at org.jkiss.dbeaver.model.impl.jdbc.exec.JDBCResultSetImpl.nextRow(JDBCResultSetImpl.java:180)
    ... 13 more
Caused by: java.io.IOException: Failed to read boolean value, expect 0 (false) or 1 (true) but we got: 13
    at com.clickhouse.data.ClickHouseInputStream.readBoolean(ClickHouseInputStream.java:724)
    at com.clickhouse.data.format.BinaryDataProcessor$NullableDeserializer.deserialize(BinaryDataProcessor.java:120)
    at com.clickhouse.data.format.ClickHouseRowBinaryProcessor$TupleDeserializer.deserialize(ClickHouseRowBinaryProcessor.java:204)
    at com.clickhouse.data.format.ClickHouseRowBinaryProcessor.readAndFill(ClickHouseRowBinaryProcessor.java:249)
    at com.clickhouse.data.ClickHouseDataProcessor.nextRecord(ClickHouseDataProcessor.java:132)
    at com.clickhouse.data.ClickHouseDataProcessor.access$000(ClickHouseDataProcessor.java:23)
    at com.clickhouse.data.ClickHouseDataProcessor$RecordsIterator.next(ClickHouseDataProcessor.java:49)
    at com.clickhouse.data.ClickHouseDataProcessor$RecordsIterator.next(ClickHouseDataProcessor.java:35)
    at com.clickhouse.jdbc.ClickHouseResultSet.next(ClickHouseResultSet.java:714)
    ... 15 more

It happens when we have a Nested field followed by a Tuple in the query.

zhicwu commented 1 year ago

Hi @Daesgar, sorry for the inconvenience and thanks for the detailed inputs.

Please consider to upgrade to v0.4.1, as it fixed broken serde for Nested in #1242. You may run into similar issue if you're dealing with complex JSON object(with array of objects).

xifeng commented 1 year ago

Hi @fabiencelier seems max_execution_time is too small for huge query results. the buffer string in ClickHouseInputStream is 'Code: 159, e.displayText() = DB::Exception: Timeout exceeded: ...' thus boolean(67) represent ASCII 'C' in 'Code: 159'

@zhicwu so maybe we should check query result is success before deserializing

we can reproduce by settings max_execution_time to a small value eg. 1

fabiencelier commented 1 year ago

It might have been related to the query failing because of timeout, we solve the issue by giving more resources to our ClickHouse

zhicwu commented 1 year ago

so maybe we should check query result is success before deserializing

Unfortunately ClickHouse server does not support HTTP trailer, meaning there's only one response code for the client. The error handling on server side should be improved too, as sometimes it just kill the connection without any error in response.

Understood the de-serialization error is confusing, but in most cases, it's a sign of server exception, and in general, we should look into system.query_log or server log to investigate the root cause. Let me add more information like query_id, session_id and server display name into the error message so that it's easier to troubleshoot.