Altinity / clickhouse-operator

Altinity Kubernetes Operator for ClickHouse creates, configures and manages ClickHouse® clusters running on Kubernetes
https://altinity.com
Apache License 2.0

ClickHouse File Does not Exist Error After Shard Increase #1558

Open karthik-thiyagarajan opened 1 week ago

karthik-thiyagarajan commented 1 week ago

I started seeing the error below after I increased my ClickHouse cluster from 6 shards to 8 shards. The DistributedFilesInsert count has also been increasing every 2 to 3 minutes, and the total delayedInsert count, which used to stay between 10 and 25, has grown to 125k over the last 2 days.

Here is the error. Can you help me debug this?


Defaulted container "clickhouse" out of: clickhouse, clickhouse-backup, clickhouse-log

2024.11.11 16:10:46.450235 [ 604 ] {} myproduct_monitoring.client_stats.DistributedInsertQueue.default: Code: 107. DB::ErrnoException: Cannot open file /var/lib/clickhouse/store/712/71257fd8-10bf-47e5-9d37-b4d988de7dfc/shard8_all_replicas/10030.bin: , errno: 2, strerror: No such file or directory: While sending /var/lib/clickhouse/store/712/71257fd8-10bf-47e5-9d37-b4d988de7dfc/shard8_all_replicas/10030.bin. (FILE_DOESNT_EXIST), Stack trace (when copying this message, always include the lines below):

  1. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000d02cb1b

  2. DB::Exception::Exception(PreformattedMessage&&, int) @ 0x00000000078e1d0c

  3. DB::ErrnoException::ErrnoException<String&>(int, int, FormatStringHelperImpl<std::type_identity<String&>::type>, String&) @ 0x000000000d08783d

  4. void DB::ErrnoException::throwFromPath<String&>(int, String const&, FormatStringHelperImpl<std::type_identity<String&>::type>, String&) @ 0x000000000d0869c8

  5. DB::ReadBufferFromFile::ReadBufferFromFile(String const&, unsigned long, int, char*, unsigned long, std::optional, std::shared_ptr) @ 0x000000000d08a32b

  6. DB::DistributedAsyncInsertDirectoryQueue::processFile(String&, DB::SettingsChanges const&) @ 0x00000000123ddc2a

  7. DB::DistributedAsyncInsertDirectoryQueue::processFiles(DB::SettingsChanges const&) @ 0x00000000123d4113

  8. void std::function::policy_invoker<void ()>::call_impl<std::function::default_alloc_func<DB::DistributedAsyncInsertDirectoryQueue::DistributedAsyncInsertDirectoryQueue(DB::StorageDistributed&, std::shared_ptr const&, String const&, std::shared_ptr, DB::ActionBlocker&, DB::BackgroundSchedulePool&)::$_0, void ()>>(std::function::__policy_storage const*) @ 0x00000000123e1d55

  9. DB::BackgroundSchedulePool::threadFunction() @ 0x0000000010661860

  10. void std::function::policy_invoker<void ()>::__call_impl<std::function::default_alloc_func<ThreadFromGlobalPoolImpl<false, true>::ThreadFromGlobalPoolImpl<DB::BackgroundSchedulePool::BackgroundSchedulePool(unsigned long, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, char const)::$_0>(DB::BackgroundSchedulePool::BackgroundSchedulePool(unsigned long, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, char const)::$_0&&)::'lambda'(), void ()>>(std::function::policy_storage const*) @ 0x0000000010662907

  11. void std::thread_proxy[abi:v15000]<std::tuple<std::unique_ptr<std::thread_struct, std::default_delete>, void ThreadPoolImpl::scheduleImpl(std::function<void ()>, Priority, std::optional, bool)::'lambda0'()>>(void) @ 0x000000000d0e55e3

  12. ? @ 0x00007fd6a056d609

  13. ? @ 0x00007fd6a0488353 (version 24.6.3.95 (official build))

2024.11.11 16:11:16.450513 [ 587 ] {} myproduct_monitoring.client_stats.DistributedInsertQueue.default: Code: 107. DB::ErrnoException: Cannot open file /var/lib/clickhouse/store/712/71257fd8-10bf-47e5-9d37-b4d988de7dfc/shard8_all_replicas/10030.bin: , errno: 2, strerror: No such file or directory: While sending /var/lib/clickhouse/store/712/71257fd8-10bf-47e5-9d37-b4d988de7dfc/shard8_all_replicas/10030.bin. (FILE_DOESNT_EXIST), Stack trace (when copying this message, always include the lines below):

  14. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000d02cb1b

  15. DB::Exception::Exception(PreformattedMessage&&, int) @ 0x00000000078e1d0c

  16. DB::ErrnoException::ErrnoException<String&>(int, int, FormatStringHelperImpl<std::type_identity<String&>::type>, String&) @ 0x000000000d08783d

  17. void DB::ErrnoException::throwFromPath<String&>(int, String const&, FormatStringHelperImpl<std::type_identity<String&>::type>, String&) @ 0x000000000d0869c8

  18. DB::ReadBufferFromFile::ReadBufferFromFile(String const&, unsigned long, int, char*, unsigned long, std::optional, std::shared_ptr) @ 0x000000000d08a32b

  19. DB::DistributedAsyncInsertDirectoryQueue::processFile(String&, DB::SettingsChanges const&) @ 0x00000000123ddc2a

  20. DB::DistributedAsyncInsertDirectoryQueue::processFiles(DB::SettingsChanges const&) @ 0x00000000123d4113

  21. void std::function::policy_invoker<void ()>::call_impl<std::function::default_alloc_func<DB::DistributedAsyncInsertDirectoryQueue::DistributedAsyncInsertDirectoryQueue(DB::StorageDistributed&, std::shared_ptr const&, String const&, std::shared_ptr, DB::ActionBlocker&, DB::BackgroundSchedulePool&)::$_0, void ()>>(std::function::__policy_storage const*) @ 0x00000000123e1d55

  22. DB::BackgroundSchedulePool::threadFunction() @ 0x0000000010661860

  23. void std::function::policy_invoker<void ()>::__call_impl<std::function::default_alloc_func<ThreadFromGlobalPoolImpl<false, true>::ThreadFromGlobalPoolImpl<DB::BackgroundSchedulePool::BackgroundSchedulePool(unsigned long, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, char const)::$_0>(DB::BackgroundSchedulePool::BackgroundSchedulePool(unsigned long, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, char const)::$_0&&)::'lambda'(), void ()>>(std::function::policy_storage const*) @ 0x0000000010662907

  24. void std::thread_proxy[abi:v15000]<std::tuple<std::unique_ptr<std::thread_struct, std::default_delete>, void ThreadPoolImpl::scheduleImpl(std::function<void ()>, Priority, std::optional, bool)::'lambda0'()>>(void) @ 0x000000000d0e55e3

  25. ? @ 0x00007fd6a056d609

  26. ? @ 0x00007fd6a0488353 (version 24.6.3.95 (official build))
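
The two log entries above report the same missing file 30 seconds apart. To check whether a longer log keeps failing on one batch file or on many, a minimal Python sketch like the following could summarize it. The names MISSING_FILE_RE, missing_batch_files and shard_dir are hypothetical helpers, not part of ClickHouse or the operator, and the pattern assumes the message format shown in the log above.

```python
import re
from collections import Counter

# Matches the "Cannot open file <path>.bin:" part of the FILE_DOESNT_EXIST
# message, assuming the format seen in the log above (hypothetical helper).
MISSING_FILE_RE = re.compile(r"Cannot open file (?P<path>\S+?\.bin):")


def missing_batch_files(log_text):
    """Count how many times each missing .bin path is reported in log_text."""
    return Counter(m.group("path") for m in MISSING_FILE_RE.finditer(log_text))


def shard_dir(path):
    """Return the queue directory that holds a batch file, e.g. 'shard8_all_replicas'."""
    return path.rsplit("/", 2)[-2]


def summarize(log_text):
    """Return (queue directory, file path, occurrences) tuples, most frequent first."""
    counts = missing_batch_files(log_text)
    return [(shard_dir(path), path, n) for path, n in counts.most_common()]
```

Feeding the pod's saved log text to summarize() gives one tuple per distinct missing file. A single path repeated many times would suggest the queue is stuck retrying one batch, while many distinct paths would point to a wider problem with that queue directory.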

Slach commented 1 week ago

Could you share the output of

kubectl get chi -n <your-namespace> <your-chi-name> -o yaml

with any sensitive information removed?