ClickHouse / clickhouse-java

ClickHouse Java Clients & JDBC Driver
https://clickhouse.com
Apache License 2.0
1.44k stars 533 forks source link

The Roaring64NavigableMap result is incorrect when the groupBitmapState query data exceeds 32 #1157

Closed chenyongyin closed 1 year ago

chenyongyin commented 1 year ago

clickhouse version:22.5.1.2079 clickhouse-jdbc version:0.3.2-patch11 description:The Roaring64NavigableMap result is incorrect when the groupBitmapState query data exceeds 32 test code: ` @Test public void queryAssociatedMemberData() throws SQLException, IOException { ClickHouseConnection conn = dataSource.getConnection(); ClickHouseStatement statement = conn.createStatement(); ClickHouseRowBinaryInputStream clickHouseRowBinaryInputStream = statement.executeQueryClickhouseRowBinaryStream("SELECT groupBitmapState (submit_id) FROM ( select submit_id FROM test_table order by submit_id desc LIMIT 32 )"); ClickHouseBitmap bit = clickHouseRowBinaryInputStream.readBitmap(ClickHouseDataType.UInt64); Roaring64NavigableMap obj = (Roaring64NavigableMap) bit.unwrap(); System.out.println(Arrays.toString(obj.toArray())); }

when limit size is 32, result is:[1606590895823200472, 1606691424725311939, 1606691788451160908, 1606709496383218333, 1606709577899516635, 1606710989052126076, 1606711521183474914, 1606714294935432090, 1606715740808490795, 1606729810769027918, 1606730451344106715, 1606735095688536138, 1606736775238853812, 1606739509690182584, 1606742356490134304, 1606745835187153895, 1606748346459562510, 1606750726815164487, 1606751311530502610, 1606752875242532318, 1606756048267060673, 1606756720161007451, 1606757355598063872, 1606757623022692793, 1606759117084110176, 1606759266841734584, 1606765610860752409, 1606765776758058643, 1606767030079005132, 1606768059373790338, 1606770024845617525, 1606771374245489977]

when limit size is 33, result is:[1631669849934994306, 2856367473477230554, 3001608564060395386, 5378383266931218591, 5522779926334740745, 5955125491367615123, 6387189583554485297, 7323656827329515531, 7611887203640611067, 7685352173560337970, 9125096681171785266, -8960796474668017950, -8513067641551020001, -8168725888186770943, -8096386821307033182, -8096386817863509545, -7639654086493667323, -7591702185690850362, -7236791928276742137, -7086454602369395884, -5862601404155752052, -5501187536429049646, -5357072344574125668, -5286422127437212713, -5285014753182803549, -5214083059085274179, -4349391929157937477, -2404118365865566778, -1971491325658849539, -1755600018379173865, -962403534713185088, -674454633911479420, -602397039772888124]

`

leonirvanel commented 1 year ago

@chenyongyin 你好,我也碰到类似问题,请问下怎么解决

qwemicheal commented 1 year ago

I think before 0.41 fix this. You could manually reverse the highToBitmap using code like the following.

reverseHighInt(Roaring64NavigableMap map) {
        Long cardinality = map.getLongCardinality();
        if (cardinality>32){
        Field highToBitmapField = Roaring64NavigableMap.class.getDeclaredField("highToBitmap");
        AccessibleObject.setAccessible(new AccessibleObject[]{highToBitmapField}, true);
        NavigableMap<Integer, BitmapDataProvider> highToBitmap =
                (NavigableMap<Integer, BitmapDataProvider>) highToBitmapField.get(map);
        Set<Integer> oldKeys = new HashSet<>(highToBitmap.keySet());
        NavigableMap<Integer, BitmapDataProvider> keyThatNeedReverse = new TreeMap<>();
        NavigableSet<Integer> keyThatNeedRemove = new TreeSet<>();
        oldKeys.forEach(key -> {
            BitmapDataProvider provider = highToBitmap.get(key);
            keyThatNeedRemove.add(key);
            int high = Integer.reverseBytes(key);
            keyThatNeedReverse.put(high,provider);
        });
        for(Integer key: keyThatNeedRemove){
            highToBitmap.remove(key);
        }
        for (Integer reversedKey : keyThatNeedReverse.keySet()) {
            highToBitmap.put(reversedKey, keyThatNeedReverse.get(reversedKey));
        }
    }
}
zhicwu commented 1 year ago

This has been fixed in v0.4.1, but you'll have to set Roaring64NavigableMap.SERIALIZATION_MODE to Roaring64NavigableMap.SERIALIZATION_MODE_PORTABLE at least once before serialization happens. It's probably better to drop RoaringBitmap dependency and move tailored code into the Java client for ease of use and better performance.

chenyongyin commented 1 year ago

before this i have tried to set Roaring64NavigableMap.SERIALIZATION_MODE = 1,but it didn’t work。I guess it's because clickhouse serializes the result differently, because the nbHighs value in the deserializePortable method is -7240854706849841152,The correct value should be 33

陈永银 @.***

 

------------------ 原始邮件 ------------------ 发件人: "ClickHouse/clickhouse-java" @.>; 发送时间: 2023年3月16日(星期四) 晚上7:46 @.>; @.**@.>; 主题: Re: [ClickHouse/clickhouse-java] The Roaring64NavigableMap result is incorrect when the groupBitmapState query data exceeds 32 (Issue #1157)

This has been fixed in v0.4.1, but you'll have to set Roaring64NavigableMap.SERIALIZATION_MODE to Roaring64NavigableMap.SERIALIZATION_MODE_PORTABLE at least once before serialization happens. It's probably better to drop RoaringBitmap dependency and move tailored code into the Java client for ease of use and better performance.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.Message ID: @.***>

zhicwu commented 1 year ago

Please feel free to re-open the issue if v0.4.1 does not work for you. If you're looking for an working example, you may check the test at here.

chenyongyin commented 1 year ago

sorry,I don't have set Roaring64NavigableMap.SERIALIZATION_MODE = 1 in v0.4.1,I'll try again later。

陈永银 @.***

 

------------------ 原始邮件 ------------------ 发件人: "ClickHouse/clickhouse-java" @.>; 发送时间: 2023年3月16日(星期四) 晚上7:46 @.>; @.**@.>; 主题: Re: [ClickHouse/clickhouse-java] The Roaring64NavigableMap result is incorrect when the groupBitmapState query data exceeds 32 (Issue #1157)

This has been fixed in v0.4.1, but you'll have to set Roaring64NavigableMap.SERIALIZATION_MODE to Roaring64NavigableMap.SERIALIZATION_MODE_PORTABLE at least once before serialization happens. It's probably better to drop RoaringBitmap dependency and move tailored code into the Java client for ease of use and better performance.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.Message ID: @.***>