I agree, and ENGINE = ReplicatedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}') could even be simplified to ENGINE = ReplicatedMergeTree if these defaults are there by default:

<default_replica_path>/clickhouse/tables/{shard}/{database}/{table}</default_replica_path>
<default_replica_name>{replica}</default_replica_name>

(but I don't see them in the default ClickHouse server configuration file)
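For illustration, with those defaults set in the server config, a table definition could be as short as this (a sketch; the table and column names are made up):

CREATE TABLE t
(
    n Int32
)
ENGINE = ReplicatedMergeTree
ORDER BY n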
Side note: I think it should have {cluster}:

<default_replica_path>/clickhouse/{cluster}/tables/{shard}/{database}/{table}</default_replica_path>
Without {uuid}, @den-crane?
The default comes from here, @tom-clickhouse:
find . -type f | xargs grep default_replica_path|grep -v "^\.\/docs"
./src/Storages/StorageReplicatedMergeTree.cpp: return config.getString("default_replica_path", "/clickhouse/tables/{uuid}/{shard}");
It's more complicated, and I don't know how to solve it on the CREATE TABLE page. {uuid} can be used only with ON CLUSTER, or if you use the special CREATE TABLE ... UUID syntax. It's a huge amount of work to make the CREATE TABLE page reflect the current state.
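For illustration, the explicit-UUID form looks roughly like this (a sketch; the UUID value is made up):

CREATE TABLE rmt UUID 'e9f0dbb8-3e00-4f23-8b42-000000000001'
(
    n Int32
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}')
ORDER BY n

Here {uuid} expands to the table's UUID, so specifying the same UUID on every replica gives them the same path in ZooKeeper.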
First of all, there's no magic default path that will always work for everyone (I tried, but it's just impossible). And yes, the logic around paths in ZooKeeper has become way too complicated.
ReplicatedReplacingMergeTree('/clickhouse/tables/{layer}-{shard}/table_name', '{replica}', ver)

{layer} is something outdated from Yandex.Metrica; maybe we can update the example.
ReplicatedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}')

Yes, it's the current default and it should work perfectly with the Replicated database engine. It also works with ON CLUSTER queries, but it's not convenient when you need to add new replicas or restore replicas after a disk failure.
The {uuid} macro is tricky. It takes the UUID from the UUID '...' clause of the CREATE TABLE statement. This clause is optional, so if no UUID is specified in the CREATE TABLE query (as recommended), a random UUID is generated and the clause is added automatically. The problem is that the {uuid} macro makes sense only if the table has the same UUID on all hosts; otherwise replicas will get different paths in ZooKeeper and will work as independent tables, not as replicas.
That's why this macro is allowed only in distributed DDLs, where we can generate the same UUID on the initiator. And that's why users have to specify the UUID manually when adding or recovering a replica (unless they use the Replicated database engine, which manages that by itself). You can get the UUID of an existing table from system.tables, or from a SHOW CREATE query with the show_table_uuid_in_table_create_query_if_not_nil setting.
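For example, both lookups could be done like this (a sketch, using the rmt table from the example below):

SELECT uuid FROM system.tables WHERE database = 'default' AND name = 'rmt';

SET show_table_uuid_in_table_create_query_if_not_nil = 1;
SHOW CREATE TABLE rmt;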
See also https://github.com/ClickHouse/ClickHouse/issues/12135 and linked issues.
/clickhouse/tables/{shard}/{database}/{table}

The {database} and {table} macros are tricky too. Old versions of ClickHouse did not expand these macros on table creation, so RENAME TABLE might break it (see https://github.com/ClickHouse/ClickHouse/issues/6917). Now these macros are expanded and materialized on table creation:
dell9510 :) create table rmt on cluster test_cluster (n int) engine=ReplicatedMergeTree('/test/{shard}/{uuid}/{database}/{table}', '{replica}') order by n
Ok.
dell9510 :) show create table rmt
┌─statement───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ CREATE TABLE default.rmt
(
`n` Int32
)
ENGINE = ReplicatedMergeTree('/test/{shard}/{uuid}/default/rmt', '{replica}')
ORDER BY n
SETTINGS index_granularity = 8192 │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
And currently it's not possible to rename old replicated tables that have an unexpanded {database} or {table} macro in metadata:
https://github.com/ClickHouse/ClickHouse/blob/4d146b05a959e52c004df3ef5da986408d19adb4/src/Storages/StorageReplicatedMergeTree.cpp#L5188-L5192
However, materialization of these macros may lead to unexpected behavior too; see https://github.com/ClickHouse/ClickHouse/issues/20243
And finally, about the {shard} and {replica} macros: they are replaced with values from the config file, but if no values are defined in the config and the database is Replicated, then the values are taken from the database engine arguments (https://github.com/ClickHouse/ClickHouse/issues/31471).
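For illustration, a Replicated database supplies the shard and replica names as engine arguments, roughly like this (a sketch; the path and names are made up):

CREATE DATABASE db ENGINE = Replicated('/clickhouse/databases/db', 'shard_01', 'replica_1');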
Hope I explained everything...
Thank you so much for the explanation @tavplubix! That has cleared a few things up for me. Using ClickHouse Keeper has been a very good experience so far in my simple testing, but dropping tables and then changing the configuration gave a confusing message (I thought that when I dropped a table on the cluster, the ZooKeeper info would also be dropped).
I spent a few hours today learning how replication works in ClickHouse, and as a lot has changed (it's so great 😍 :) some of the pages are slightly out of date (for example, I got stuck because I have IPv6 disabled and had to add enable_ipv6 to <keeper_server>).
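For reference, that setting goes inside the Keeper section of the config, roughly like this (a sketch; the rest of <keeper_server> is omitted):

<keeper_server>
    <enable_ipv6>false</enable_ipv6>
    <!-- tcp_port, server_id, raft_configuration, etc. -->
</keeper_server>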
As mentioned, a lot of the documentation is a little outdated or spread across multiple pages, as there is obviously no "one size fits all" solution and every use case is different :-). Especially around scale, and whether you're in a testing environment or a production environment.
Perhaps it would be worthwhile expanding the pages that introduce replication to include a simple use case of maybe 2-3 nodes, as some pages already do (Data Replication). I can't find more example pages in the documentation now, but there are some very good examples there.
Maybe a guide in SRE (or maybe I'll just write a blog post covering my experience) would be useful that covers the following (see the sketch after this list):
- create database foo and create database foo on cluster 'something'
- create table foo and create table foo on cluster 'something'
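A minimal sketch of those statements, assuming a cluster named 'something' is defined in remote_servers (all names are made up):

-- local to one host:
CREATE DATABASE foo;
CREATE TABLE foo.events (n Int32) ENGINE = MergeTree ORDER BY n;

-- on every host of the cluster:
CREATE DATABASE foo ON CLUSTER 'something';
CREATE TABLE foo.events ON CLUSTER 'something' (n Int32)
ENGINE = ReplicatedMergeTree ORDER BY n;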
It seems to me that ReplicatedMergeTree tables now automatically get a good data path when created. The docs go into detail about making the paths unique, but doesn't {uuid} already do this?
https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication/#creating-replicated-tables
This is from the most recent ReplicatedMergeTree table I created; is the ENGINE line sufficient?
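Presumably the line in question matched the current default, something like this (hypothetical reconstruction):

ENGINE = ReplicatedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}')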
Here are the macros I used:
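Presumably a <macros> section along these lines (hypothetical values):

<macros>
    <shard>01</shard>
    <replica>replica_1</replica>
</macros>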
Should the docs at https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication/#creating-replicated-tables be updated?
@antonio2368 @alesapin @gingerwizard @e-mars @tom-clickhouse