Open VladislavePolyakov opened 3 months ago
On latest version same story
ClickHouse client version 24.2.2.71 (official build). Connecting to localhost:9000 as user default. Code: 210. DB::NetException: Connection refused (localhost:9000). (NETWORK_ERROR)
information from Keeper befor start DDL
strange looking that Node count: 13, actualy we have only 3 nodes for Keeper and 4 for Database
echo mntr | nc localhost 9181
zk_version v24.2.2.71-stable-9293d361e72be9f6ccfd444d504e2137b2e837cf
zk_avg_latency 2
zk_max_latency 16
zk_min_latency 0
zk_packets_received 28
zk_packets_sent 28
zk_num_alive_connections 1
zk_outstanding_requests 0
zk_server_state leader
zk_znode_count 13
zk_watch_count 1
zk_ephemerals_count 0
zk_approximate_data_size 2179
zk_key_arena_size 0
zk_latest_snapshot_size 0
zk_open_file_descriptor_count 75
zk_max_file_descriptor_count 500000
zk_followers 2
zk_synced_followers 2
echo ruok | nc localhost 9181
imok
echo srvr | nc localhost 9181
ClickHouse Keeper version: v24.2.2.71-stable-9293d361e72be9f6ccfd444d504e2137b2e837cf
Latency min/avg/max: 0/2/16
Received: 31
Sent: 31
Connections: 1
Outstanding: 0
Zxid: 0x68
Mode: leader
Node count: 13
echo stat | nc localhost 9181
ClickHouse Keeper version: v24.2.2.71-stable-9293d361e72be9f6ccfd444d504e2137b2e837cf
Clients:
127.0.0.1:45444(recved=0,sent=0)
10.189.218.11:45828(recved=33,sent=33)
Latency min/avg/max: 0/2/16
Received: 33
Sent: 33
Connections: 1
Outstanding: 0
Zxid: 0x70
Mode: leader
Node count: 13
echo conf | nc localhost 9181
server_id=1
enable_ipv6=true
tcp_port=9181
four_letter_word_allow_list=jmst,jmfp,jmep,jmdp,conf,cons,crst,envi,ruok,srst,srvr,stat,wchs,dirs,mntr,isro,rcvr,apiv,csnp,lgif,rqld,rclc,clrs,ftfl,ydld,pfev
max_requests_batch_size=100
min_session_timeout_ms=10000
session_timeout_ms=30000
operation_timeout_ms=10000
dead_session_check_period_ms=500
heart_beat_interval_ms=500
election_timeout_lower_bound_ms=1000
election_timeout_upper_bound_ms=2000
reserved_log_items=100000
snapshot_distance=100000
auto_forwarding=true
shutdown_timeout=5000
startup_timeout=180000
raft_logs_level=warning
snapshots_to_keep=3
rotate_log_storage_interval=100000
stale_log_gap=10000
fresh_log_gap=200
max_requests_batch_size=100
max_requests_batch_bytes_size=102400
max_flush_batch_size=1000
max_request_queue_size=100000
max_requests_quick_batch_size=100
quorum_reads=false
force_sync=true
compress_logs=false
compress_snapshots_with_zstd_format=true
configuration_change_tries_count=20
raft_limits_reconnect_limit=50
async_replication=false
log_storage_path=/var/lib/clickhouse/coordination/log
log_storage_disk=LocalLogDisk
snapshot_storage_path=/var/lib/clickhouse/coordination/snapshots
snapshot_storage_disk=LocalSnapshotDisk
echo cons | nc localhost 9181
127.0.0.1:45452(recved=0,sent=0,sid=0xffffffffffffffff,lop=NA,est=1711032588305,to=0,lzxid=0xffffffffffffffff,lresp=0,llat=0,minlat=0,avglat=0,maxlat=0)
10.189.218.11:45828(recved=37,sent=37,sid=0x0000000000000002,lop=List,est=1711032319539,to=30000,lcxid=0x000000000000000b,lzxid=0x0000000000000010,lresp=1711032319581,llat=0,minlat=0,avglat=2,maxlat=16)
echo envi | nc localhost 9181
Environment:
clickhouse.keeper.version=v24.2.2.71-stable-9293d361e72be9f6ccfd444d504e2137b2e837cf
host.name=ocs-clickhouse01.devel.pro
os.name=Linux
os.arch=x86_64
os.version=3.10.0-1160.105.1.el7.x86_64
cpu.count=6
user.name=
user.home=/home/clickhouse/
user.dir=/
user.tmp=/tmp/
echo isro | nc localhost 9181
rw
echo wchs | nc localhost 9181
1 connections watching 1 paths
Total watches:1
Sounds like the other server crashed when trying to create the table. There's a known problem that would cause that, fixed in https://github.com/ClickHouse/ClickHouse/pull/58934 (the change in ASTCreateQuery::clone()
).
strange looking that Node count: 13, actualy we have only 3 nodes for Keeper and 4 for Database
The first output says zk_znode_count
- that's number of znodes ("files"/"directories" in zookeeper), not number of servers. I'm guessing "Node count" in the other two outputs means znodes as well.
@al13n321 Michael thank you for answer. Do you have builded version? I would like to test this fix on out environmemt.
Build: https://s3.amazonaws.com/clickhouse-test-reports/58934/acccccbaf41e02d7e2d42662cb83d9799924b936/clickhouse_build_check/report.html (from "ClickHouse build check" in the list of checks in the PR).
@al13n321, is it available as docker container? When this is planned to be released?
Build: https://s3.amazonaws.com/clickhouse-test-reports/58934/acccccbaf41e02d7e2d42662cb83d9799924b936/clickhouse_build_check/report.html (from "ClickHouse build check" in the list of checks in the PR).
tested and it working thanks
More detailed tested MV with REFRESH and all migrations applied sucsessful, but on refresh got error.
SELECT *
FROM system.view_refreshes
Query id: 53bca73b-3f93-4c9c-8916-e13c3f2e5998
Row 1:
──────
database: default
view: applications_mv
status: Scheduled
last_refresh_result: Error
last_refresh_time: 2024-04-01 16:40:08
last_success_time: ᴺᵁᴸᴸ
duration_ms: 8105
next_refresh_time: 2024-04-01 16:50:00
remaining_dependencies: []
exception: Code: 48. DB::Exception: RENAME EXCHANGE is not supported. (NOT_IMPLEMENTED), Stack trace (when copying this message, always include the lines below):
0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000cc4f3db
1. DB::Exception::Exception<>(int, FormatStringHelperImpl<>) @ 0x000000000762ac63
2. DB::DatabaseAtomic::renameTable(std::shared_ptr<DB::Context const>, String const&, DB::IDatabase&, String const&, bool, bool) @ 0x00000000101af8c8
3. DB::InterpreterRenameQuery::execute() @ 0x0000000010dea198
4. DB::StorageMaterializedView::exchangeTargetTable(DB::StorageID, std::shared_ptr<DB::Context const>) @ 0x000000001177497d
5. DB::RefreshTask::refreshTask() @ 0x00000000121707ce
6. DB::BackgroundSchedulePool::threadFunction() @ 0x000000000fdbfc2a
7. void std::__function::__policy_invoker<void ()>::__call_impl<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<false>::ThreadFromGlobalPoolImpl<DB::BackgroundSchedulePool::BackgroundSchedulePool(unsigned long, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, char const*)::$_0>(DB::BackgroundSchedulePool::BackgroundSchedulePool(unsigned long, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, char const*)::$_0&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x000000000fdc0c33
8. void* std::__thread_proxy[abi:v15000]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void ThreadPoolImpl<std::thread>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>>(void*) @ 0x000000000ccf782d
9. start_thread @ 0x0000000000007ea5
10. clone @ 0x00000000000feb0d
(version 24.3.1.2153)
retry: 0
refresh_count: 0
progress: inf
elapsed: 0
read_rows: 469472
read_bytes: 37158348
total_rows: 0
total_bytes: 0
written_rows: 0
written_bytes: 0
result_rows: 0
result_bytes: 0
@al13n321 Could you recommend something about this error?
"RENAME EXCHANGE is not supported" means the Linux kernel is older than 3.15 (or Mac OS is older than 10.12), required for refreshable MV.
When creating MV got error and database down and can't start anymore.
Expectation is created MV and no down database.
How to reproduce
Stack trace: