Hi @FrankChen021, thanks for sharing your findings. Does increasing request_chunk_size or maybe write_buffer_size help to achieve lower latency? The main reasons for the change are: 1) a smaller memory footprint vs. slightly decreased performance (mainly for small queries); 2) alignment with other buffers, like the one in HttpURLConnection; 3) async inserts.
Actually I started with 4096 as it's the same as in HttpURLConnection, but later changed it to 8192 due to slightly better performance (mainly for select queries). I then rolled it back because, for the reasons above, I believe the smaller default is better for the majority of use cases.
I didn't spend much time on fine-tuning writes, but I'll see what I can do. Please let me know if you have more clues.
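For reference, a minimal sketch of bumping those two settings from the Java side. It assumes the option keys mentioned above (write_buffer_size, request_chunk_size) can be passed straight through as JDBC connection properties, and that both values are in bytes; the exact keys and defaults may differ across 0.4.x releases.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

public class TunedConnection {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Assumption: the client options discussed above are accepted as plain
        // property keys by the JDBC driver; values are in bytes.
        props.setProperty("write_buffer_size", "8192");     // back to the 0.3.2-era buffer size
        props.setProperty("request_chunk_size", "131072");  // larger chunks for bulk INSERTs

        try (Connection conn = DriverManager.getConnection(
                "jdbc:clickhouse://localhost:8123/default", props)) {
            // run INSERT statements as usual
        }
    }
}
```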
Hello @zhicwu, I did some further investigation to find the root cause of this problem. It turned out to be caused by some client-side tooling that slowed down flushing data to the socket.
That said, the smaller chunk size does have an impact on performance. According to our test (INSERT of 1M rows per request, 32 bytes per row) using the ab tool, an 8Ki request size improves the median response time by about 10% compared with the default, which I consider a trivial improvement.
default (4Ki)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 1 1 0.5 1 12
Processing: 1894 2885 760.2 2709 7730
Waiting: 1894 2885 760.2 2709 7730
Total: 1895 2886 760.2 2710 7731
Percentage of the requests served within a certain time (ms)
50% 2710
66% 3056
75% 3251
80% 3389
90% 3863
95% 4535
98% 5009
99% 5194
100% 7731 (longest request)
8Ki
Connection Times (ms)
min mean[+/-sd] median max
Connect: 1 1 13.6 1 427
Processing: 1799 2699 691.1 2467 5567
Waiting: 1799 2699 691.2 2467 5567
Total: 1800 2701 690.8 2470 5568
Percentage of the requests served within a certain time (ms)
50% 2470
66% 2816
75% 3100
80% 3255
90% 3696
95% 4217
98% 4582
99% 4900
100% 5568 (longest request)
16Ki
Connection Times (ms)
min mean[+/-sd] median max
Connect: 1 1 0.5 1 18
Processing: 1747 2611 677.0 2443 5615
Waiting: 1747 2611 677.0 2443 5615
Total: 1748 2612 677.0 2444 5616
Percentage of the requests served within a certain time (ms)
50% 2444
66% 2691
75% 2928
80% 3088
90% 3514
95% 4147
98% 4582
99% 4805
100% 5616 (longest request)
Thanks for the clarification. Have you tried request_chunk_size? Actually we're using triple buffering in the HTTP client: one buffer in ClickHouseOutputStream and two in HttpURLConnection. request_chunk_size controls the size of the data frame sent to ClickHouse, which you may want to tweak when inserting massive amounts of data. Apart from that, if the source is in a compressed format, it's better to send it directly to ClickHouse so that Java won't slow things down. In that case, you may use ClickHousePassThruStream instead of a client-side insert from outfile.
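A rough sketch of the pass-thru idea for an already gzip-compressed CSV file. Package locations and the ClickHousePassThruStream.of(...) / Mutation.data(...) signatures shifted between 0.3.x and 0.4.x, so treat the exact calls below as assumptions to verify against the release in use.

```java
import java.io.FileInputStream;
import java.io.InputStream;

import com.clickhouse.client.ClickHouseClient;
import com.clickhouse.client.ClickHouseNode;
import com.clickhouse.client.ClickHouseRequest;
import com.clickhouse.client.ClickHouseResponse;
import com.clickhouse.data.ClickHouseCompression;
import com.clickhouse.data.ClickHouseFormat;
import com.clickhouse.data.ClickHousePassThruStream;

public class PassThruInsert {
    public static void main(String[] args) throws Exception {
        ClickHouseNode server = ClickHouseNode.of("http://localhost:8123/default");

        try (ClickHouseClient client = ClickHouseClient.newInstance(server.getProtocol())) {
            // Already-compressed CSV; hand it to the server as-is so the client
            // does not decompress or re-serialize it on the way.
            InputStream compressed = new FileInputStream("data.csv.gz");

            ClickHouseRequest.Mutation request = client.write(server)
                    .table("my_table")
                    // Assumed factory: of(input, compression, compressionLevel, format);
                    // the level is informational here since the data is pre-compressed.
                    .data(ClickHousePassThruStream.of(compressed, ClickHouseCompression.GZIP,
                            -1, ClickHouseFormat.CSV));

            try (ClickHouseResponse response = request.executeAndWait()) {
                System.out.println(response.getSummary().getWrittenRows());
            }
        }
    }
}
```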
@FrankChen021, a few questions
After upgrading from 0.3.2 to 0.4, we found that INSERT latency increased on the client side.
According to our investigation, we believe the change of the chunk size might be one of the reasons.
On 0.3.2 the default buffer size was 8Ki; on 0.4 it has been changed to 4Ki.
Since this parameter is not documented, most users don't know about it, so decreasing this value has a negative impact on most of them. Why was the default value decreased in 0.4?
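For anyone who wants the old behavior back without waiting for a new default, a sketch using the Java client API rather than JDBC properties. It assumes ClickHouseClientOption.WRITE_BUFFER_SIZE is the option behind the buffer discussed here and that it takes a value in bytes; neither is documented, so verify against the 0.4.x source.

```java
import com.clickhouse.client.ClickHouseClient;
import com.clickhouse.client.config.ClickHouseClientOption;

public class RestoreOldBuffer {
    public static void main(String[] args) {
        // Assumption: WRITE_BUFFER_SIZE (bytes) governs the write buffer whose
        // default dropped from 8Ki to 4Ki in 0.4.
        try (ClickHouseClient client = ClickHouseClient.builder()
                .option(ClickHouseClientOption.WRITE_BUFFER_SIZE, 8192)
                .build()) {
            // inserts issued through this client use an 8Ki write buffer
        }
    }
}
```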