quickwit-oss / quickwit

Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.
https://quickwit.io

Metastore GRPC disconnection? #1986

Open fulmicoton opened 2 years ago

fulmicoton commented 2 years ago

Even on a small cluster, a user is experiencing disconnections of the gRPC connection to the metastore service.

guidao commented 2 years ago

I encountered the same problem.

2022-09-20T07:17:53.854Z  INFO quickwit_metastore::metastore::grpc_metastore: 571: Adding gRPC address `ip1:7281` to `MetastoreGrpcClient`.
2022-09-20T07:18:03.854Z  INFO quickwit_metastore::metastore::grpc_metastore: 571: Adding gRPC address `ip2:7281` to `MetastoreGrpcClient`.
2022-09-20T07:18:14.854Z  INFO quickwit_metastore::metastore::grpc_metastore: 571: Adding gRPC address `ip3:7281` to `MetastoreGrpcClient`.
2022-09-20T07:18:25.854Z  INFO quickwit_metastore::metastore::grpc_metastore: 571: Adding gRPC address `ip2:7281` to `MetastoreGrpcClient`.
2022-09-20T07:26:19.913Z  INFO {actor=quickwit_indexing::actors::indexing_service::IndexingService}:{msg_id=1}::{msg_id=1205}:{index=logs gen=0}: quickwit_indexing::actors::indexing_pipeline: 220: Spawning indexing pipeline. index_id=logs source_id=kafka pipeline_ord=0 root_dir=/quickwit/qwdata/indexing/logs/kafka merge_policy=StableMultitenantWithTimestampMergePolicy { min_level_num_docs: 100000, merge_enabled: true, merge_factor: 10, max_merge_factor: 12, split_num_docs_target: 10000000 }
2022-09-20T07:26:19.913Z  INFO quickwit_metastore::metastore::grpc_metastore: 571: Adding gRPC address `ip3:7281` to `MetastoreGrpcClient`.
2022-09-20T07:26:35.465Z  INFO {actor=quickwit_indexing::actors::indexing_service::IndexingService}:{msg_id=1}::{msg_id=1205}:{index=logs gen=0}: quickwit_indexing::actors::indexing_pipeline: 220: Spawning indexing pipeline. index_id=logs source_id=.ingest-api pipeline_ord=0 root_dir=/quickwit/qwdata/indexing/logs/.ingest-api merge_policy=StableMultitenantWithTimestampMergePolicy { min_level_num_docs: 100000, merge_enabled: true, merge_factor: 10, max_merge_factor: 12, split_num_docs_target: 10000000 }
2022-09-20T07:26:49.914Z ERROR {actor=quickwit_indexing::actors::indexing_service::IndexingService}:{msg_id=1}::{msg_id=1205}: quickwit_indexing::actors::indexing_pipeline: 581: Error while spawning indexing pipeline, retrying after some time. error=Connection error: `gRPC request timeout triggered by the channel timeout. This can happens when tonic channel has no registered endpoints.`. retry_count=17 retry_delay=600s
2022-09-20T07:26:55.854Z  INFO quickwit_metastore::metastore::grpc_metastore: 571: Adding gRPC address `ip2:7281` to `MetastoreGrpcClient`.
2022-09-20T07:27:02.854Z  INFO quickwit_metastore::metastore::grpc_metastore: 571: Adding gRPC address `ip1:7281` to `MetastoreGrpcClient`.
2022-09-20T07:27:05.465Z ERROR {actor=quickwit_indexing::actors::indexing_service::IndexingService}:{msg_id=1}::{msg_id=1205}: quickwit_indexing::actors::indexing_pipeline: 581: Error while spawning indexing pipeline, retrying after some time. error=Connection error: `gRPC request timeout triggered by the channel timeout. This can happens when tonic channel has no registered endpoints.`. retry_count=17 retry_delay=600s
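
To see how often the same metastore addresses get re-added to the client, the `Adding gRPC address` lines can be tallied per address. This is a minimal sketch in Rust using only the standard library. The function name `count_added_addresses` is hypothetical and not part of Quickwit, and the parsing assumes the log format shown in the excerpt above. The sample input is the first four lines of that excerpt.

```rust
use std::collections::BTreeMap;

/// Counts how many times each address appears in log lines of the form
/// "Adding gRPC address `<addr>` to `MetastoreGrpcClient`."
/// (hypothetical helper for log analysis, not a Quickwit API).
fn count_added_addresses(log: &str) -> BTreeMap<String, usize> {
    const MARKER: &str = "Adding gRPC address `";
    let mut counts = BTreeMap::new();
    for line in log.lines() {
        // Locate the marker, then take everything up to the closing backtick.
        if let Some(start) = line.find(MARKER) {
            let rest = &line[start + MARKER.len()..];
            if let Some(end) = rest.find('`') {
                *counts.entry(rest[..end].to_string()).or_insert(0) += 1;
            }
        }
    }
    counts
}

fn main() {
    // First four lines of the log excerpt, copied verbatim.
    let log = [
        "2022-09-20T07:17:53.854Z  INFO quickwit_metastore::metastore::grpc_metastore: 571: Adding gRPC address `ip1:7281` to `MetastoreGrpcClient`.",
        "2022-09-20T07:18:03.854Z  INFO quickwit_metastore::metastore::grpc_metastore: 571: Adding gRPC address `ip2:7281` to `MetastoreGrpcClient`.",
        "2022-09-20T07:18:14.854Z  INFO quickwit_metastore::metastore::grpc_metastore: 571: Adding gRPC address `ip3:7281` to `MetastoreGrpcClient`.",
        "2022-09-20T07:18:25.854Z  INFO quickwit_metastore::metastore::grpc_metastore: 571: Adding gRPC address `ip2:7281` to `MetastoreGrpcClient`.",
    ]
    .join("\n");

    // BTreeMap iterates in sorted key order, so the report is stable.
    for (addr, n) in count_added_addresses(&log) {
        println!("{addr}: added {n} time(s)");
    }
}
```

An address that is added repeatedly within a few minutes suggests that it is being dropped from and re-registered in the client's endpoint set, which would be consistent with the "no registered endpoints" timeout reported in the error lines.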

There are a lot of splits here (see the counts below), and the timeout occurs when the searcher issues queries.

metastore=# select split_state, count(1) from splits group by split_state;
 split_state | count
-------------+-------
 Staged      |   158
 Published   | 30909