quickwit-oss / quickwit

Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.
https://quickwit.io

DNS failure using Garage endpoint #3578

Closed (jpds closed this issue 1 year ago)

jpds commented 1 year ago

Describe the bug

I just tried running Quickwit locally on my laptop, pointing it at a remote Garage installation with the following configuration:

metastore_uri: s3://qw-test/indexes
default_index_root_uri: s3://qw-test/indexes
storage:
   s3:
     access_key_id: "..."
     secret_access_key: "..."
     endpoint: https://s3.example.com
     region: "garage"
     force_path_style_access: true
     disable_multi_object_delete_requests: false

With this configuration, startup simply fails with:

2023-06-26T10:12:55.333Z  INFO quickwit_cli: Loaded node config. config_uri=file:///.../config/quickwit.yaml config=QuickwitConfig { cluster_id: "quickwit-default-cluster", node_id: "host", enabled_services: {Searcher, Janitor, Indexer, Metastore, ControlPlane}, rest_listen_addr: 127.0.0.1:7280, gossip_listen_addr: 127.0.0.1:7280, grpc_listen_addr: 127.0.0.1:7281, gossip_advertise_addr: 127.0.0.1:7280, grpc_advertise_addr: 127.0.0.1:7281, peer_seeds: [], data_dir_path: ".../qwdata", metastore_uri: Uri { uri: "s3://qw-test/indexes" }, default_index_root_uri: Uri { uri: "s3://qw-test/indexes" }, rest_cors_allow_origins: [], storage_configs: StorageConfigs([S3(S3StorageConfig { access_key_id: Some("..."), secret_access_key: Some("***redacted***"), region: Some("garage"), endpoint: Some("https://s3.example.com"), force_path_style_access: true, disable_multi_object_delete_requests: false })]), metastore_configs: MetastoreConfigs([]), indexer_config: IndexerConfig { split_store_max_num_bytes: Byte(100000000000), split_store_max_num_splits: 1000, max_concurrent_split_uploads: 12, enable_otlp_endpoint: true, enable_cooperative_indexing: false }, searcher_config: SearcherConfig { aggregation_memory_limit: Byte(500000000), aggregation_bucket_limit: 65000, fast_field_cache_capacity: Byte(1000000000), split_footer_cache_capacity: Byte(500000000), partial_request_cache_capacity: Byte(64000000), max_num_concurrent_split_searches: 100, max_num_concurrent_split_streams: 100 }, ingest_api_config: IngestApiConfig { max_queue_memory_usage: Byte(2147483648), max_queue_disk_usage: Byte(4294967296) }, jaeger_config: JaegerConfig { enable_endpoint: true, lookback_period_hours: 72, max_trace_duration_secs: 3600, max_fetch_spans: 10000 } }
2023-06-26T10:12:55.337Z  INFO quickwit_cluster::cluster: Joining cluster. cluster_id=quickwit-default-cluster node_id=host enabled_services={Searcher, Janitor, Indexer, Metastore, ControlPlane} gossip_listen_addr=127.0.0.1:7280 gossip_advertise_addr=127.0.0.1:7280 grpc_advertise_addr=127.0.0.1:7281 peer_seed_addrs=
✘ Command failed: Failed to connect to metastore: `Internal error: `Failed to get index files.` Cause: `StorageError(kind=Io, source=dispatch failure: io error: error trying to connect: dns error: failed to lookup address information: Name or service not known: dns error: failed to lookup address information: Name or service not known: failed to lookup address information: Name or service not known (DispatchFailure(DispatchFailure { source: ConnectorError { kind: Io, source: hyper::Error(Connect, ConnectError("dns error", Custom { kind: Uncategorized, error: "failed to lookup address information: Name or service not known" })), connection: Unknown } })))`.`

I was only able to make it work after setting these environment variables:

$ export QW_S3_ENDPOINT=https://s3.example.com
$ export QW_S3_FORCE_PATH_STYLE_ACCESS=true
$ quickwit run
2023-06-26T10:13:52.855Z  INFO quickwit_config::quickwit_config::serialize: Using listen address as advertise address. advertise_address=127.0.0.1
2023-06-26T10:13:52.857Z  WARN quickwit_config::quickwit_config::serialize: Cluster ID is not set, falling back to default value: `quickwit-default-cluster`.
2023-06-26T10:13:52.857Z  WARN quickwit_config::quickwit_config::serialize: Peer seed list is empty.
2023-06-26T10:13:52.857Z  INFO quickwit_cli: Loaded node config. config_uri=file:///.../config/quickwit.yaml config=QuickwitConfig { cluster_id: "quickwit-default-cluster", node_id: "host", enabled_services: {ControlPlane, Janitor, Searcher, Indexer, Metastore}, rest_listen_addr: 127.0.0.1:7280, gossip_listen_addr: 127.0.0.1:7280, grpc_listen_addr: 127.0.0.1:7281, gossip_advertise_addr: 127.0.0.1:7280, grpc_advertise_addr: 127.0.0.1:7281, peer_seeds: [], data_dir_path: ".../qwdata", metastore_uri: Uri { uri: "s3://qw-test/indexes" }, default_index_root_uri: Uri { uri: "s3://qw-test/indexes" }, rest_cors_allow_origins: [], storage_configs: StorageConfigs([S3(S3StorageConfig { access_key_id: Some("..."), secret_access_key: Some("***redacted***"), region: Some("garage"), endpoint: Some("https://s3.example.com"), force_path_style_access: true, disable_multi_object_delete_requests: false })]), metastore_configs: MetastoreConfigs([]), indexer_config: IndexerConfig { split_store_max_num_bytes: Byte(100000000000), split_store_max_num_splits: 1000, max_concurrent_split_uploads: 12, enable_otlp_endpoint: true, enable_cooperative_indexing: false }, searcher_config: SearcherConfig { aggregation_memory_limit: Byte(500000000), aggregation_bucket_limit: 65000, fast_field_cache_capacity: Byte(1000000000), split_footer_cache_capacity: Byte(500000000), partial_request_cache_capacity: Byte(64000000), max_num_concurrent_split_searches: 100, max_num_concurrent_split_streams: 100 }, ingest_api_config: IngestApiConfig { max_queue_memory_usage: Byte(2147483648), max_queue_disk_usage: Byte(4294967296) }, jaeger_config: JaegerConfig { enable_endpoint: true, lookback_period_hours: 72, max_trace_duration_secs: 3600, max_fetch_spans: 10000 } }
2023-06-26T10:13:52.857Z  INFO quickwit_telemetry::sender: telemetry to quickwit is disabled.
2023-06-26T10:13:52.859Z  INFO quickwit_cluster::cluster: Joining cluster. cluster_id=quickwit-default-cluster node_id=host enabled_services={ControlPlane, Janitor, Searcher, Indexer, Metastore} gossip_listen_addr=127.0.0.1:7280 gossip_advertise_addr=127.0.0.1:7280 grpc_advertise_addr=127.0.0.1:7281 peer_seed_addrs=
2023-06-26T10:13:52.869Z  INFO quickwit_storage::object_storage::s3_compatible_storage: Using custom S3 endpoint. endpoint=https://s3.example.com
2023-06-26T10:13:53.162Z  WARN quickwit_control_plane::scheduler: No indexer available, cannot schedule an indexing plan.
2023-06-26T10:13:53.164Z  INFO quickwit_ingest::ingest_api_service: Ingest API partition id ingest_partition_id=ingest_partition_01H3VK43WCKVC82E2KBWDM1AE7
2023-06-26T10:13:53.179Z  INFO quickwit_config::index_config::serialize: Index config does not specify `index_uri`, falling back to default value. index_id=otel-logs-v0_6 index_uri=s3://qw-test/indexes/otel-logs-v0_6
2023-06-26T10:13:53.182Z  INFO quickwit_config::index_config::serialize: Index config does not specify `index_uri`, falling back to default value. index_id=otel-traces-v0_6 index_uri=s3://qw-test/indexes/otel-traces-v0_6
2023-06-26T10:13:53.183Z  INFO quickwit_storage::object_storage::s3_compatible_storage: Using custom S3 endpoint. endpoint=https://s3.example.com
...

However, I'd expect it to detect the custom endpoint without setting the environment variables.

fulmicoton commented 1 year ago

That's a bug for @trinity-1686a :D

trinity-1686a commented 1 year ago

Hi @jpds

I'm not able to reproduce this issue with the Deuxfleurs Garage cluster. By default, Quickwit tries to use vhost-style addressing when interacting with S3. Garage supports vhost style too, but it requires a wildcard DNS entry (CNAME *.s3.example.com => s3.example.com). Can you check whether you have such an entry, or at least one for qw-test.s3.example.com? If you can't create such a DNS entry, you should indeed use QW_S3_FORCE_PATH_STYLE_ACCESS=true.

Wait, force_path_style_access: true is already set in the configuration, so there is indeed a bug here: the environment variable shouldn't be required :confused:
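
To illustrate why the addressing style matters for DNS, here is a minimal Python sketch of how the request URL differs between vhost style and path style. The helper s3_object_url and the object key indexes/some-file are hypothetical and chosen only for illustration; this is not Quickwit's or Garage's actual code, just the general S3 addressing convention applied to the endpoint and bucket from this issue.

```python
from urllib.parse import urlsplit, urlunsplit


def s3_object_url(endpoint: str, bucket: str, key: str, path_style: bool) -> str:
    """Build the URL an S3 client would request (hypothetical helper)."""
    parts = urlsplit(endpoint)
    if path_style:
        # Path style: the bucket is the first path segment, so only the
        # endpoint host itself has to resolve in DNS.
        netloc, path = parts.netloc, f"/{bucket}/{key}"
    else:
        # vhost style: the bucket becomes a subdomain, so
        # <bucket>.<endpoint host> needs its own (or a wildcard) DNS record.
        netloc, path = f"{bucket}.{parts.netloc}", f"/{key}"
    return urlunsplit((parts.scheme, netloc, path, "", ""))


# "indexes/some-file" is a made-up object key, used only as an example.
vhost_url = s3_object_url("https://s3.example.com", "qw-test", "indexes/some-file", path_style=False)
path_url = s3_object_url("https://s3.example.com", "qw-test", "indexes/some-file", path_style=True)
print(vhost_url)  # https://qw-test.s3.example.com/indexes/some-file
print(path_url)   # https://s3.example.com/qw-test/indexes/some-file
```

With vhost style, the client has to resolve qw-test.s3.example.com; without a matching DNS record, that lookup would fail with an error like the "Name or service not known" message reported above. With path style, only s3.example.com needs to resolve.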

guilload commented 1 year ago

Closed via #3583.