ClickHouse / ClickHouse

ClickHouse® is a real-time analytics DBMS
https://clickhouse.com
Apache License 2.0

MaterializedPostgreSQL connection breaks on column changes in postgres after restart #66273

Open rgmvisser opened 2 months ago

rgmvisser commented 2 months ago

Describe what's wrong

Running ClickHouse with MaterializedPostgreSQL worked fine until we had to restart our Postgres instance. Since then, the MaterializedPostgreSQL tables no longer seem to be updated.

Does it reproduce on the most recent release?

Yes

Enable crash reporting

LOGICAL_ERROR 49 492 2024-07-09 12:39:56 Columns number mismatch. Attributes: 17, buffer: 20
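
For anyone scanning server logs for this error, here is a minimal sketch that pulls the two counts out of the message. The regular expression is based only on the wording quoted in this issue and may not match other ClickHouse versions; the function name `parse_column_mismatch` is made up for illustration. What exactly "Attributes" and "buffer" count is not stated in the message, so the sketch only extracts the numbers.

```python
import re

# Pattern built from the message text quoted in this issue (an assumption:
# other ClickHouse versions may word the message differently).
MISMATCH_RE = re.compile(
    r"Columns number mismatch\. Attributes: (\d+), buffer: (\d+)"
)


def parse_column_mismatch(log_line):
    """Return (attributes, buffer) as ints, or None if the line does not match."""
    match = MISMATCH_RE.search(log_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


line = (
    "LOGICAL_ERROR 49 492 2024-07-09 12:39:56 "
    "Columns number mismatch. Attributes: 17, buffer: 20"
)
print(parse_column_mismatch(line))  # (17, 20)
```

The same function also matches the longer `DatabaseMaterializedPostgreSQL` log line format, since it searches anywhere in the line.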

How to reproduce

  1. Set up a MaterializedPostgreSQL database:
    SET allow_experimental_database_materialized_postgresql=1;
    CREATE DATABASE postgres_tables ENGINE = MaterializedPostgreSQL('host:port', 'database', 'user', 'password') SETTINGS materialized_postgresql_tables_list = 'Table1,Table2,Table3...' FORMAT JSON;
  2. See that the tables are being populated
  3. Remove a column from one of the tracked tables
  4. Restart Postgres instance (not sure if this is required, but it seems like it)
  5. BINGO: No more updates in the postgres_tables database

Expected behavior

I expect ClickHouse to handle column changes. As it stands, I have to set everything up again from scratch.

Related: https://github.com/ClickHouse/ClickHouse/issues/49045

ArturFormella commented 1 month ago

I have a similar problem. It would also be great if the error message said which table is invalid.

Full stack:

2024.08.09 06:49:59.828571 [ 121 ] {} <Information> PostgreSQLReplicationHandler: Using replication slot cluster_draft and publication cluster_draft_ch_publication
2024.08.09 06:50:01.512998 [ 121 ] {} <Error> DatabaseMaterializedPostgreSQL (draft_cluster_local): Failed to start replication from PostgreSQL, will retry. Error: Code: 49. DB::Exception: Columns number mismatch. Attributes: 7, buffer: 10. (LOGICAL_ERROR), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000d01e6db
1. DB::Exception::Exception(PreformattedMessage&&, int) @ 0x00000000078dd20c
2. DB::Exception::Exception<unsigned long, unsigned long&>(int, FormatStringHelperImpl<std::type_identity<unsigned long>::type, std::type_identity<unsigned long&>::type>, unsigned long&&, unsigned long&) @ 0x000000000d6885ab
3. DB::MaterializedPostgreSQLConsumer::StorageData::StorageData(DB::StorageInfo const&, std::shared_ptr<Poco::Logger>) @ 0x0000000010b401a7
4. DB::MaterializedPostgreSQLConsumer::MaterializedPostgreSQLConsumer(std::shared_ptr<DB::Context const>, std::shared_ptr<postgres::Connection>, String const&, String const&, String const&, unsigned long, bool, std::unordered_map<String, DB::StorageInfo, std::hash<String>, std::equal_to<String>, std::allocator<std::pair<String const, DB::StorageInfo>>>, String const&) @ 0x0000000010b3dd7d
5. DB::PostgreSQLReplicationHandler::startSynchronization(bool) @ 0x0000000010b16989
6. void std::__function::__policy_invoker<void ()>::__call_impl<std::__function::__default_alloc_func<DB::DatabaseMaterializedPostgreSQL::DatabaseMaterializedPostgreSQL(std::shared_ptr<DB::Context const>, String const&, StrongTypedef<wide::integer<128ul, unsigned int>, DB::UUIDTag>, bool, String const&, String const&, postgres::ConnectionInfo const&, std::unique_ptr<DB::MaterializedPostgreSQLSettings, std::default_delete<DB::MaterializedPostgreSQLSettings>>)::$_0, void ()>>(std::__function::__policy_storage const*) @ 0x0000000010b0d6bd
7. DB::BackgroundSchedulePool::threadFunction() @ 0x00000000106537a0
8. void std::__function::__policy_invoker<void ()>::__call_impl<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<false, true>::ThreadFromGlobalPoolImpl<DB::BackgroundSchedulePool::BackgroundSchedulePool(unsigned long, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, char const*)::$_0>(DB::BackgroundSchedulePool::BackgroundSchedulePool(unsigned long, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, char const*)::$_0&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x0000000010654847
9. void* std::__thread_proxy[abi:v15000]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void ThreadPoolImpl<std::thread>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>>(void*) @ 0x000000000d0d71a3
10. ? @ 0x00007f2fe32c9609
11. ? @ 0x00007f2fe31e4353
 (version 24.6.2.17 (official build))
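
Since the error does not name the table, one workaround is to compare column lists from both sides by hand. Below is a minimal sketch, assuming you have already collected the column names per table (for example from `information_schema.columns` in Postgres and `system.columns` in ClickHouse). The function name `find_column_mismatches`, the `ignore` parameter, and the sample table and column names are hypothetical. The `_sign` and `_version` entries are ignored on the assumption that the replicated tables expose them as extra bookkeeping columns; adjust that to whatever your listing actually shows.

```python
def find_column_mismatches(postgres_columns, clickhouse_columns,
                           ignore=("_sign", "_version")):
    """Report tables whose column names differ between the two sides.

    Both arguments map table name -> list of column names.
    Tables missing on the ClickHouse side are skipped.
    """
    ignored = set(ignore)
    mismatches = {}
    for table, pg_cols in postgres_columns.items():
        if table not in clickhouse_columns:
            continue
        pg_set = set(pg_cols) - ignored
        ch_set = set(clickhouse_columns[table]) - ignored
        only_pg = sorted(pg_set - ch_set)
        only_ch = sorted(ch_set - pg_set)
        if only_pg or only_ch:
            mismatches[table] = {
                "only_in_postgres": only_pg,
                "only_in_clickhouse": only_ch,
            }
    return mismatches


# Hypothetical data: a column was dropped from Table1 in Postgres.
pg = {"Table1": ["id", "name"], "Table2": ["id", "value"]}
ch = {
    "Table1": ["id", "name", "legacy_flag", "_sign", "_version"],
    "Table2": ["id", "value", "_sign", "_version"],
}
print(find_column_mismatches(pg, ch))
```

With the sample data, only `Table1` is reported, with `legacy_flag` listed as present in ClickHouse but not in Postgres. This only compares names, not types or column order.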