I agree, and ENGINE = ReplicatedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}') could even be simplified to ENGINE = ReplicatedMergeTree if these defaults are there by default:

<default_replica_path>/clickhouse/tables/{shard}/{database}/{table}</default_replica_path>
<default_replica_name>{replica}</default_replica_name>

(but I don't see them in the default ClickHouse server configuration file)
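For illustration, with those defaults set in the server config, a table definition could be as short as this (a sketch; the table and column names are made up):

CREATE TABLE t
(
    n Int32
)
ENGINE = ReplicatedMergeTree
ORDER BY n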
Side note: I think it should have {cluster}:

<default_replica_path>/clickhouse/{cluster}/tables/{shard}/{database}/{table}</default_replica_path>
Without {uuid}, @den-crane?
The default comes from here, @tom-clickhouse:
find . -type f | xargs grep default_replica_path|grep -v "^\.\/docs"
./src/Storages/StorageReplicatedMergeTree.cpp: return config.getString("default_replica_path", "/clickhouse/tables/{uuid}/{shard}");
It's more complicated, and I don't know how to solve it on the CREATE TABLE page. {uuid} can be used only with ON CLUSTER, or if you use the special CREATE TABLE ... UUID syntax. It's a huge amount of work to make the CREATE TABLE page reflect the current state.
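For illustration, the explicit-UUID form looks roughly like this (a sketch; the UUID value is made up):

CREATE TABLE rmt UUID 'e9f0dbb8-3e00-4f23-8b42-000000000001'
(
    n Int32
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}')
ORDER BY n

Here {uuid} expands to the table's UUID, so specifying the same UUID on every replica gives them the same path in ZooKeeper.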
First of all, there's no magic default path that will always work for everyone (I tried, but it's just impossible). And yes, the logic around paths in ZooKeeper has become way too complicated.
ReplicatedReplacingMergeTree('/clickhouse/tables/{layer}-{shard}/table_name', '{replica}', ver)

{layer} is something outdated from Yandex.Metrica; maybe we can update the example.
ReplicatedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}')

Yes, it's the current default and it should work perfectly with the Replicated database engine. It also works with ON CLUSTER queries, but it's not convenient when you need to add new replicas or restore replicas after a disk failure.
The {uuid} macro is tricky. It takes the UUID from the UUID '...' clause of the CREATE TABLE statement. This clause is optional, so if no UUID is specified in the CREATE TABLE query (as recommended), a random UUID is generated and the clause is added automatically. The problem is that the {uuid} macro makes sense only if the table has the same UUID on all hosts; otherwise replicas will get different paths in ZooKeeper and will work as independent tables, not as replicas.
That's why this macro is allowed only in distributed DDLs, where we can generate the same UUID on the initiator. And that's why users have to specify the UUID manually when adding or recovering a replica (unless they use the Replicated database engine, which manages that by itself). You can get the UUID of an existing table from system.tables, or from a SHOW CREATE query with the show_table_uuid_in_table_create_query_if_not_nil setting.
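For example, both lookups could be done like this (a sketch, using the rmt table from the example below):

SELECT uuid FROM system.tables WHERE database = 'default' AND name = 'rmt';

SET show_table_uuid_in_table_create_query_if_not_nil = 1;
SHOW CREATE TABLE rmt;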
See also https://github.com/ClickHouse/ClickHouse/issues/12135 and linked issues.
/clickhouse/tables/{shard}/{database}/{table}

The {database} and {table} macros are tricky too. Old versions of ClickHouse did not expand these macros on table creation, so RENAME TABLE might break it (see https://github.com/ClickHouse/ClickHouse/issues/6917). Now these macros are expanded and materialized on table creation:
dell9510 :) create table rmt on cluster test_cluster (n int) engine=ReplicatedMergeTree('/test/{shard}/{uuid}/{database}/{table}', '{replica}') order by n
Ok.
dell9510 :) show create table rmt
┌─statement───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ CREATE TABLE default.rmt
(
`n` Int32
)
ENGINE = ReplicatedMergeTree('/test/{shard}/{uuid}/default/rmt', '{replica}')
ORDER BY n
SETTINGS index_granularity = 8192 │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
And currently it's not possible to rename old replicated tables that have an unexpanded {database} or {table} macro in metadata:
https://github.com/ClickHouse/ClickHouse/blob/4d146b05a959e52c004df3ef5da986408d19adb4/src/Storages/StorageReplicatedMergeTree.cpp#L5188-L5192
However, materialization of these macros may lead to unexpected behavior too; see https://github.com/ClickHouse/ClickHouse/issues/20243
And finally, about the {shard} and {replica} macros: they are replaced with values from the config file, but if no values are defined in the config and the database is Replicated, then the values are taken from the database engine arguments (https://github.com/ClickHouse/ClickHouse/issues/31471).
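For illustration, a Replicated database supplies the shard and replica names as engine arguments, roughly like this (a sketch; the path and names are made up):

CREATE DATABASE db ENGINE = Replicated('/clickhouse/databases/db', 'shard_01', 'replica_1');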
Hope I explained everything...
Thank you so much for the explanation @tavplubix! That has cleared a few things up for me. Using ClickHouse Keeper has been a very good experience so far in my simple testing, but dropping tables and then changing the configuration gave a confusing message (I thought that when I dropped a table on the cluster, the ZooKeeper info would also be dropped).
I spent a few hours today learning how replication works in ClickHouse, and as a lot has changed (it's so great 😍 :) some of the pages are slightly out of date (for example, I got stuck because I have IPv6 disabled and had to add enable_ipv6 to <keeper_server>).
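For reference, that setting goes inside the Keeper section of the config, roughly like this (a sketch; the rest of <keeper_server> is omitted):

<keeper_server>
    <enable_ipv6>false</enable_ipv6>
    <!-- tcp_port, server_id, raft_configuration, etc. -->
</keeper_server>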
As mentioned, a lot of the documentation is a little outdated or spread across multiple pages, as there is obviously no "one size fits all" solution and every use case is different :-). Especially around scale, and whether you're in a testing environment or a production environment.
Perhaps it would be worthwhile expanding the pages that introduce replication to include a simple use case of maybe 2-3 nodes, as some pages already do (Data Replication). I can't find more example pages in the documentation now, but there are some very good examples there.
Maybe a guide in SRE (or maybe I'll just write a blog post covering my experience) would be useful that covers the following (see the sketch after this list):
- create database foo and create database foo on cluster 'something'
- create table foo and create table foo on cluster 'something'
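A minimal sketch of those statements, assuming a cluster named 'something' is defined in remote_servers (all names are made up):

-- local to one host:
CREATE DATABASE foo;
CREATE TABLE foo.events (n Int32) ENGINE = MergeTree ORDER BY n;

-- on every host of the cluster:
CREATE DATABASE foo ON CLUSTER 'something';
CREATE TABLE foo.events ON CLUSTER 'something' (n Int32)
ENGINE = ReplicatedMergeTree ORDER BY n;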
It seems to me that ReplicatedMergeTree tables now automatically get a good data path when created. The docs go into detail about making the paths unique, but doesn't {uuid} already do this?
https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication/#creating-replicated-tables
This is from the most recent ReplicatedMergeTree table I created; is the ENGINE line sufficient?
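Presumably the line in question matched the current default, something like this (hypothetical reconstruction):

ENGINE = ReplicatedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}')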
Here are the macros I used:
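Presumably a <macros> section along these lines (hypothetical values):

<macros>
    <shard>01</shard>
    <replica>replica_1</replica>
</macros>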
Should the docs at https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication/#creating-replicated-tables be updated?
@antonio2368 @alesapin @gingerwizard @e-mars @tom-clickhouse