aws / amazon-ecs-service-connect-agent

Amazon ECS Service Connect Agent
Apache License 2.0

[BUG] #41

Open abespalko opened 1 year ago

abespalko commented 1 year ago

Summary

The ECS Service Connect agent becomes unhealthy.

Description

I have an ECS cluster with several services running on the EC2 launch type (2-3 t2.micro instances), and some of them use the ECS Service Connect feature. From time to time, services that use the bridge or awsvpc network mode are restarted because the ecs-service-connect-agent:interface-v1 sidecar container becomes UNHEALTHY. All subsequent deployments of services to that instance then fail on the ECS Service Connect agent health check. After that I have to restart the EC2 instance (maybe a docker restart would also help?).
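Before restarting the instance, it may help to capture what the health check itself reported. The following is a minimal sketch, assuming the output of `docker inspect <container>` has been saved as JSON; the container name and health check output in `SAMPLE` are hypothetical and only illustrate the shape of the `State.Health` section.

```python
import json

# Hypothetical sample shaped like `docker inspect` output; the name and the
# health check output text are illustrative, not taken from this issue.
SAMPLE = json.dumps([
    {
        "Name": "/ecs-service-connect-example",
        "State": {
            "Health": {
                "Status": "unhealthy",
                "FailingStreak": 3,
                "Log": [
                    {"ExitCode": 0, "Output": "ok\n"},
                    {"ExitCode": 1, "Output": "health check failed\n"},
                ],
            }
        },
    }
])


def health_summary(inspect_json):
    """Summarize the health state of each container in `docker inspect` JSON."""
    summary = []
    for container in json.loads(inspect_json):
        health = container.get("State", {}).get("Health")
        if health is None:
            # The container has no health check configured.
            summary.append({"name": container.get("Name"), "status": "none"})
            continue
        log = health.get("Log") or []
        last = log[-1] if log else {}
        summary.append({
            "name": container.get("Name"),
            "status": health.get("Status"),
            "failing_streak": health.get("FailingStreak", 0),
            "last_exit_code": last.get("ExitCode"),
            "last_output": last.get("Output", "").strip(),
        })
    return summary


if __name__ == "__main__":
    for entry in health_summary(SAMPLE):
        print(entry)
```

On the instance, the input could come from something like `docker inspect $(docker ps -q --filter health=unhealthy)`, which limits the output to containers currently reported as unhealthy.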

Backend:

"networkMode": "bridge",
"portMappings": [
                {
                    "name": "backend-8080-tcp",
                    "containerPort": 8080,
                    "hostPort": 80,
                    "protocol": "tcp"
                }
            ],
"ServiceConnect": enabled, client only

Redis:

"networkMode": "awsvpc", 
"portMappings": [
                {
                    "name": "redis-6379-tcp",
                    "containerPort": 6379,
                    "hostPort": 6379,
                    "protocol": "tcp"
                }
            ],
"ServiceConnect": enabled, server & client

Frontend:

 "portMappings": [
                {
                    "containerPort": 3000,
                    "hostPort": 81,
                    "protocol": "tcp"
                }
            ],
Network: default,
ServiceConnect: disabled.

Expected Behavior

Container Image: ecs-service-connect-agent:interface-v1

Logs from a successful deployment:

time="2023-08-15T08:04:18Z" level=info msg="Envoy Environment Variables: [ENVOY_ADMIN_MODE=UDS ENVOY_CONCURRENCY=2 ENVOY_ENABLE_IAM_AUTH_FOR_XDS=0]"
time="2023-08-15T08:04:18Z" level=info msg="Agent Environment Variables: [APPNET_AGENT_ADMIN_MODE=uds APPNET_AGENT_ADMIN_UDS_PATH=/var/run/ecs/appnet_admin.sock APPNET_ENVOY_RESTART_COUNT=3 APPNET_LISTENER_PORT_MAPPING={\"egress\":38221,\"ingress-redis\":42106}]"
[2023-08-15 08:04:18.557][1][info] [AppNet Agent] Server started, /var/run/ecs/appnet_admin.sock
[2023-08-15 08:04:18.557][1][info] [AppNet Agent] Executing command: [/usr/bin/envoy -c /tmp/envoy-config-594589294.yaml -l info --concurrency 2 --drain-time-s 20]
[2023-08-15 08:04:18.669][15][info][main] [source/server/server.cc:404] initializing epoch 0 (base id=0, hot restart version=11.104)
[2023-08-15 08:04:18.669][15][info][main] [source/server/server.cc:406] statically linked extensions:
[2023-08-15 08:04:18.669][15][info][main] [source/server/server.cc:408]   network.connection.client: default, envoy_internal
[2023-08-15 08:04:18.669][15][info][main] [source/server/server.cc:408]   envoy.retry_priorities: envoy.retry_priorities.previous_priorities
[2023-08-15 08:04:18.669][15][info][main] [source/server/server.cc:408]   envoy.matching.action: envoy.matching.actions.format_string, filter-chain-name
[2023-08-15 08:04:18.670][15][info][main] [source/server/server.cc:408]   quic.http_server_connection: quic.http_server_connection.default
[2023-08-15 08:04:18.670][15][info][main] [source/server/server.cc:408]   envoy.thrift_proxy.filters: envoy.filters.thrift.header_to_metadata, envoy.filters.thrift.payload_to_metadata, envoy.filters.thrift.rate_limit, envoy.filters.thrift.router
[2023-08-15 08:04:18.670][15][info][main] [source/server/server.cc:408]   envoy.tracers: envoy.dynamic.ot, envoy.tracers.datadog, envoy.tracers.dynamic_ot, envoy.tracers.opencensus, envoy.tracers.opentelemetry, envoy.tracers.skywalking, envoy.tracers.xray, envoy.tracers.zipkin, envoy.zipkin

...

[2023-08-15 08:04:18.678][15][info][main] [source/server/server.cc:459]   response trailer map: 144 bytes: grpc-message,grpc-status
[2023-08-15 08:04:18.760][15][info][main] [source/server/server.cc:819] runtime: layers:
  - name: static_layer_0
    static_layer:
      envoy.reloadable_features.sanitize_original_path: true
      envoy.reloadable_features.http_set_tracing_decision_in_request_id: true
      envoy.reloadable_features.tcp_pool_idle_timeout: true
      envoy.reloadable_features.no_extension_lookup_by_name: true
      envoy.features.enable_all_deprecated_features: true
      re2.max_program_size.error_level: 1000
  - name: admin_layer
    admin_layer:
      {}
[2023-08-15 08:04:18.761][15][info][admin] [source/server/admin/admin.cc:67] admin address: /tmp/envoy_admin.sock
[2023-08-15 08:04:18.762][15][info][config] [source/server/configuration_impl.cc:131] loading tracing configuration
[2023-08-15 08:04:18.762][15][info][config] [source/server/configuration_impl.cc:91] loading 0 static secret(s)
[2023-08-15 08:04:18.762][15][info][config] [source/server/configuration_impl.cc:97] loading 0 cluster(s)
[2023-08-15 08:04:18.763][15][info][config] [source/server/configuration_impl.cc:101] loading 0 listener(s)
[2023-08-15 08:04:18.763][15][info][config] [source/server/configuration_impl.cc:113] loading stats configuration
[2023-08-15 08:04:18.764][15][info][runtime] [source/common/runtime/runtime_impl.cc:463] RTDS has finished initialization
[2023-08-15 08:04:18.764][15][info][upstream] [source/common/upstream/cluster_manager_impl.cc:222] cm init: initializing cds
[2023-08-15 08:04:18.764][15][warning][main] [source/server/server.cc:794] there is no configured limit to the number of allowed active connections. Set a limit via the runtime key overload.global_downstream_max_connections
[2023-08-15 08:04:18.764][15][info][main] [source/server/server.cc:915] starting main dispatch loop
[2023-08-15 08:04:18.804][15][info][upstream] [source/common/upstream/cds_api_helper.cc:35] cds: add 2 cluster(s), remove 0 cluster(s)
[2023-08-15 08:04:18.809][15][info][upstream] [source/common/upstream/cds_api_helper.cc:72] cds: added/updated 2 cluster(s), skipped 0 unmodified cluster(s)
[2023-08-15 08:04:18.809][15][info][upstream] [source/common/upstream/cluster_manager_impl.cc:196] cm init: initializing secondary clusters
[2023-08-15 08:04:24.811][15][info][upstream] [source/common/upstream/cluster_manager_impl.cc:226] cm init: all clusters initialized
[2023-08-15 08:04:24.811][15][info][main] [source/server/server.cc:896] all clusters initialized. initializing init manager
[2023-08-15 08:04:25.872][15][info][upstream] [source/extensions/listener_managers/listener_manager/lds_api.cc:82] lds: add/update listener 'ingress-redis'
[2023-08-15 08:04:25.957][15][info][upstream] [source/extensions/listener_managers/listener_manager/lds_api.cc:82] lds: add/update listener 'egress'
[2023-08-15 08:04:25.958][15][info][config] [source/extensions/listener_managers/listener_manager/listener_manager_impl.cc:852] all dependencies initialized. starting workers

Observed Behavior


time="2023-08-15T07:17:31Z" level=info msg="App Mesh Environment Variables: [APPMESH_RESOURCE_ARN=arn:aws:ecs:us-east-1:639434372228:task-set/prod-ECSCluster-V6MzUj4NX7pS/prod-redis-3ju4Jrm839kMd/ecs-svc/6456803592997768987 APPMESH_XDS_ENDPOINT=unix:///var/run/ecs/relay/envoy_xds.sock APPMESH_METRIC_EXTENSION_VERSION=1]"
--
time="2023-08-15T07:17:31Z" level=info msg="Envoy Environment Variables: [ENVOY_ENABLE_IAM_AUTH_FOR_XDS=0 ENVOY_ADMIN_MODE=UDS ENVOY_CONCURRENCY=2]"
time="2023-08-15T07:17:31Z" level=info msg="Agent Environment Variables: [APPNET_AGENT_ADMIN_MODE=uds APPNET_ENVOY_RESTART_COUNT=3 APPNET_AGENT_ADMIN_UDS_PATH=/var/run/ecs/appnet_admin.sock APPNET_LISTENER_PORT_MAPPING={\"egress\":38431,\"ingress-redis\":34415}]"
[2023-08-15 07:17:31.634][1][info] [AppNet Agent] Server started, /var/run/ecs/appnet_admin.sock
[2023-08-15 07:17:31.635][1][info] [AppNet Agent] Executing command: [/usr/bin/envoy -c /tmp/envoy-config-2133285547.yaml -l info --concurrency 2 --drain-time-s 20]
[2023-08-15 07:17:32.024][14][info][main] [source/server/server.cc:404] initializing epoch 0 (base id=0, hot restart version=11.104)
[2023-08-15 07:17:32.024][14][info][main] [source/server/server.cc:406] statically linked extensions:
[2023-08-15 07:17:32.024][14][info][main] [source/server/server.cc:408]   envoy.wasm.runtime: envoy.wasm.runtime.null, envoy.wasm.runtime.v8
[2023-08-15 07:17:32.024][14][info][main] [source/server/server.cc:408]   envoy.http.early_header_mutation: envoy.http.early_header_mutation.header_mutation
[2023-08-15 07:17:32.025][14][info][main] [source/server/server.cc:408]   envoy.quic.server.crypto_stream: envoy.quic.crypto_stream.server.quiche
[2023-08-15 07:17:32.025][14][info][main] [source/server/server.cc:408]   envoy.matching.input_matchers: envoy.matching.matchers.consistent_hashing, envoy.matching.matchers.ip
[2023-08-15 07:17:32.025][14][info][main] [source/server/server.cc:408]   envoy.upstreams: envoy.filters.connection_pools.tcp.generic
[2023-08-15 07:17:32.025][14][info][main] [source/server/server.cc:408]   envoy.http.original_ip_detection: envoy.http.original_ip_detection.custom_header, envoy.http.original_ip_detection.xff
[2023-08-15 07:17:32.025][14][info][main] [source/server/server.cc:408]   envoy.transport_sockets.upstream: envoy.transport_sockets.alts, envoy.transport_sockets.http_11_proxy, envoy.transport_sockets.internal_upstream, envoy.transport_sockets.quic, envoy.transport_sockets.raw_buffer, envoy.transport_sockets.starttls, envoy.transport_sockets.tap, envoy.transport_sockets.tcp_stats, envoy.transport_sockets.tls, envoy.transport_sockets.upstream_proxy_protocol, raw_buffer, starttls, tls
[2023-08-15 07:17:32.025][14][info][main] [source/server/server.cc:408]   envoy.matching.http.custom_matchers: envoy.matching.custom_matchers.trie_matcher
[2023-08-15 07:17:32.025][14][info][main] [source/server/server.cc:408]   envoy.network.dns_resolver: envoy.network.dns_resolver.cares, envoy.network.dns_resolver.getaddrinfo
[2023-08-15 07:17:32.025][14][info][main] [source/server/server.cc:408]   envoy.route.early_data_policy: envoy.route.early_data_policy.default
[2023-08-15 07:17:32.025][14][info][main] [source/server/server.cc:408]   envoy.filters.udp_listener: envoy.filters.udp.dns_filter, envoy.filters.udp_listener.udp_proxy
[2023-08-15 07:17:32.025][14][info][main] [source/server/server.cc:408]   envoy.http.stateful_header_formatters: envoy.http.stateful_header_formatters.preserve_case, preserve_case
[2023-08-15 07:17:32.025][14][info][main] [source/server/server.cc:408]   envoy.dubbo_proxy.filters: envoy.filters.dubbo.router
[2023-08-15 07:17:32.026][14][info][main] [source/server/server.cc:408]   envoy.resolvers: envoy.ip
[2023-08-15 07:17:32.026][14][info][main] [source/server/server.cc:408]   envoy.quic.connection_id_generator: envoy.quic.deterministic_connection_id_generator
[2023-08-15 07:17:32.026][14][info][main] [source/server/server.cc:408]   envoy.health_checkers: envoy.health_checkers.redis, envoy.health_checkers.thrift
[2023-08-15 07:17:32.026][14][info][main] [source/server/server.cc:408]   envoy.http.custom_response: envoy.extensions.http.custom_response.local_response_policy, envoy.extensions.http.custom_response.redirect_policy
[2023-08-15 07:17:32.026][14][info][main] [source/server/server.cc:408]   envoy.path.rewrite: envoy.path.rewrite.uri_template.uri_template_rewriter
[2023-08-15 07:17:32.026][14][info][main] [source/server/server.cc:408]   envoy.regex_engines: envoy.regex_engines.google_re2
[2023-08-15 07:17:32.026][14][info][main] [source/server/server.cc:408]   envoy.resource_monitors: envoy.resource_monitors.fixed_heap, envoy.resource_monitors.injected_resource
[2023-08-15 07:17:32.026][14][info][main] [source/server/server.cc:408]   envoy.rbac.matchers: envoy.rbac.matchers.upstream_ip_port
[2023-08-15 07:17:32.026][14][info][main] [source/server/server.cc:408]   envoy.clusters: envoy.cluster.eds, envoy.cluster.logical_dns, envoy.cluster.original_dst, envoy.cluster.static, envoy.cluster.strict_dns, envoy.clusters.aggregate, envoy.clusters.dynamic_forward_proxy, envoy.clusters.redis
[2023-08-15 07:17:32.026][14][info][main] [source/server/server.cc:408]   envoy.matching.http.input: envoy.matching.inputs.destination_ip, envoy.matching.inputs.destination_port, envoy.matching.inputs.direct_source_ip, envoy.matching.inputs.dns_san, envoy.matching.inputs.request_headers, envoy.matching.inputs.request_trailers, envoy.matching.inputs.response_headers, envoy.matching.inputs.response_trailers, envoy.matching.inputs.server_name, envoy.matching.inputs.source_ip, envoy.matching.inputs.source_port, envoy.matching.inputs.source_type, envoy.matching.inputs.status_code_class_input, envoy.matching.inputs.status_code_input, envoy.matching.inputs.subject, envoy.matching.inputs.uri_san
[2023-08-15 07:17:32.026][14][info][main] [source/server/server.cc:408]   envoy.compression.decompressor: envoy.compression.brotli.decompressor, envoy.compression.gzip.decompressor, envoy.compression.zstd.decompressor
[2023-08-15 07:17:32.026][14][info][main] [source/server/server.cc:408]   envoy.access_loggers: envoy.access_loggers.file, envoy.access_loggers.http_grpc, envoy.access_loggers.open_telemetry, envoy.access_loggers.stderr, envoy.access_loggers.stdout, envoy.access_loggers.tcp_grpc, envoy.access_loggers.wasm, envoy.file_access_log, envoy.http_grpc_access_log, envoy.open_telemetry_access_log, envoy.stderr_access_log, envoy.stdout_access_log, envoy.tcp_grpc_access_log, envoy.wasm_access_log
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.retry_host_predicates: envoy.retry_host_predicates.omit_canary_hosts, envoy.retry_host_predicates.omit_host_metadata, envoy.retry_host_predicates.previous_hosts
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.common.key_value: envoy.key_value.file_based
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.filters.http: envoy.bandwidth_limit, envoy.buffer, envoy.cors, envoy.csrf, envoy.ext_authz, envoy.ext_proc, envoy.fault, envoy.filters.http.adaptive_concurrency, envoy.filters.http.admission_control, envoy.filters.http.alternate_protocols_cache, envoy.filters.http.aws_lambda, envoy.filters.http.aws_request_signing, envoy.filters.http.bandwidth_limit, envoy.filters.http.buffer, envoy.filters.http.cache, envoy.filters.http.cdn_loop, envoy.filters.http.composite, envoy.filters.http.compressor, envoy.filters.http.cors, envoy.filters.http.csrf, envoy.filters.http.custom_response, envoy.filters.http.decompressor, envoy.filters.http.dynamic_forward_proxy, envoy.filters.http.ext_authz, envoy.filters.http.ext_proc, envoy.filters.http.fault, envoy.filters.http.file_system_buffer, envoy.filters.http.gcp_authn, envoy.filters.http.grpc_http1_bridge, envoy.filters.http.grpc_http1_reverse_bridge, envoy.filters.http.grpc_json_transcoder, envoy.filters.http.grpc_stats, envoy.filters.http.grpc_web, envoy.filters.http.header_to_metadata, envoy.filters.http.health_check, envoy.filters.http.ip_tagging, envoy.filters.http.jwt_authn, envoy.filters.http.local_ratelimit, envoy.filters.http.lua, envoy.filters.http.match_delegate, envoy.filters.http.oauth2, envoy.filters.http.on_demand, envoy.filters.http.original_src, envoy.filters.http.rate_limit_quota, envoy.filters.http.ratelimit, envoy.filters.http.rbac, envoy.filters.http.router, envoy.filters.http.set_metadata, envoy.filters.http.stateful_session, envoy.filters.http.tap, envoy.filters.http.wasm, envoy.grpc_http1_bridge, envoy.grpc_json_transcoder, envoy.grpc_web, envoy.health_check, envoy.ip_tagging, envoy.local_rate_limit, envoy.lua, envoy.rate_limit, envoy.router
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.http.header_validators: envoy.http.header_validators.envoy_default
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.internal_redirect_predicates: envoy.internal_redirect_predicates.allow_listed_routes, envoy.internal_redirect_predicates.previous_routes, envoy.internal_redirect_predicates.safe_cross_scheme
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.grpc_credentials: envoy.grpc_credentials.aws_iam, envoy.grpc_credentials.default, envoy.grpc_credentials.file_based_metadata
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.thrift_proxy.protocols: auto, binary, binary/non-strict, compact, twitter
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.transport_sockets.downstream: envoy.transport_sockets.alts, envoy.transport_sockets.quic, envoy.transport_sockets.raw_buffer, envoy.transport_sockets.starttls, envoy.transport_sockets.tap, envoy.transport_sockets.tcp_stats, envoy.transport_sockets.tls, raw_buffer, starttls, tls
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.compression.compressor: envoy.compression.brotli.compressor, envoy.compression.gzip.compressor, envoy.compression.zstd.compressor
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.listener_manager_impl: envoy.listener_manager_impl.default
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.thrift_proxy.transports: auto, framed, header, unframed
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.matching.network.custom_matchers: envoy.matching.custom_matchers.trie_matcher
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.connection_handler: envoy.connection_handler.default
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.dubbo_proxy.protocols: dubbo
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.access_loggers.extension_filters: envoy.access_loggers.extension_filters.cel
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.quic.proof_source: envoy.quic.proof_source.filter_chain
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.matching.common_inputs: envoy.matching.common_inputs.environment_variable
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   quic.http_server_connection: quic.http_server_connection.default
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.thrift_proxy.filters: envoy.filters.thrift.header_to_metadata, envoy.filters.thrift.payload_to_metadata, envoy.filters.thrift.rate_limit, envoy.filters.thrift.router
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.dubbo_proxy.serializers: dubbo.hessian2
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.filters.http.upstream: envoy.buffer, envoy.filters.http.admission_control, envoy.filters.http.buffer, envoy.filters.http.upstream_codec
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.matching.action: envoy.matching.actions.format_string, filter-chain-name
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.tracers: envoy.dynamic.ot, envoy.tracers.datadog, envoy.tracers.dynamic_ot, envoy.tracers.opencensus, envoy.tracers.opentelemetry, envoy.tracers.skywalking, envoy.tracers.xray, envoy.tracers.zipkin, envoy.zipkin
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.guarddog_actions: envoy.watchdog.abort_action, envoy.watchdog.profile_action
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.stats_sinks: envoy.dog_statsd, envoy.graphite_statsd, envoy.metrics_service, envoy.stat_sinks.dog_statsd, envoy.stat_sinks.graphite_statsd, envoy.stat_sinks.hystrix, envoy.stat_sinks.metrics_service, envoy.stat_sinks.statsd, envoy.stat_sinks.wasm, envoy.statsd
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.load_balancing_policies: envoy.load_balancing_policies.least_request, envoy.load_balancing_policies.random, envoy.load_balancing_policies.round_robin
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.http.stateful_session: envoy.http.stateful_session.cookie, envoy.http.stateful_session.header
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.http.cache: envoy.extensions.http.cache.file_system_http_cache, envoy.extensions.http.cache.simple
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   network.connection.client: default, envoy_internal
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.request_id: envoy.request_id.uuid
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.udp_packet_writer: envoy.udp_packet_writer.default, envoy.udp_packet_writer.gso
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.matching.network.input: envoy.matching.inputs.application_protocol, envoy.matching.inputs.destination_ip, envoy.matching.inputs.destination_port, envoy.matching.inputs.direct_source_ip, envoy.matching.inputs.dns_san, envoy.matching.inputs.server_name, envoy.matching.inputs.source_ip, envoy.matching.inputs.source_port, envoy.matching.inputs.source_type, envoy.matching.inputs.subject, envoy.matching.inputs.transport_protocol, envoy.matching.inputs.uri_san
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.tls.cert_validator: envoy.tls.cert_validator.default, envoy.tls.cert_validator.spiffe
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.filters.listener: envoy.filters.listener.http_inspector, envoy.filters.listener.original_dst, envoy.filters.listener.original_src, envoy.filters.listener.proxy_protocol, envoy.filters.listener.tls_inspector, envoy.listener.http_inspector, envoy.listener.original_dst, envoy.listener.original_src, envoy.listener.proxy_protocol, envoy.listener.tls_inspector
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.bootstrap: envoy.bootstrap.internal_listener, envoy.bootstrap.wasm, envoy.extensions.network.socket_interface.default_socket_interface
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.upstream_options: envoy.extensions.upstreams.http.v3.HttpProtocolOptions, envoy.extensions.upstreams.tcp.v3.TcpProtocolOptions, envoy.upstreams.http.http_protocol_options, envoy.upstreams.tcp.tcp_protocol_options
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.path.match: envoy.path.match.uri_template.uri_template_matcher
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.config.validators: envoy.config.validators.minimum_clusters, envoy.config.validators.minimum_clusters_validator
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.retry_priorities: envoy.retry_priorities.previous_priorities
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.rate_limit_descriptors: envoy.rate_limit_descriptors.expr
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.formatter: envoy.formatter.metadata, envoy.formatter.req_without_query
[2023-08-15 07:17:32.027][14][info][main] [source/server/server.cc:408]   envoy.filters.network: envoy.echo, envoy.ext_authz, envoy.filters.network.connection_limit, envoy.filters.network.direct_response, envoy.filters.network.dubbo_proxy, envoy.filters.network.echo, envoy.filters.network.ext_authz, envoy.filters.network.http_connection_manager, envoy.filters.network.local_ratelimit, envoy.filters.network.mongo_proxy, envoy.filters.network.ratelimit, envoy.filters.network.rbac, envoy.filters.network.redis_proxy, envoy.filters.network.sni_cluster, envoy.filters.network.sni_dynamic_forward_proxy, envoy.filters.network.tcp_proxy, envoy.filters.network.thrift_proxy, envoy.filters.network.wasm, envoy.filters.network.zookeeper_proxy, envoy.http_connection_manager, envoy.mongo_proxy, envoy.ratelimit, envoy.redis_proxy, envoy.tcp_proxy
[2023-08-15 07:17:32.032][14][info][main] [source/server/server.cc:456] HTTP header map info:
[2023-08-15 07:17:32.033][14][info][main] [source/server/server.cc:459]   request header map: 672 bytes: :authority,:method,:path,:protocol,:scheme,accept,accept-encoding,access-control-request-headers,access-control-request-method,access-control-request-private-network,authentication,authorization,cache-control,cdn-loop,connection,content-encoding,content-length,content-type,expect,grpc-accept-encoding,grpc-timeout,if-match,if-modified-since,if-none-match,if-range,if-unmodified-since,keep-alive,origin,pragma,proxy-connection,proxy-status,referer,te,transfer-encoding,upgrade,user-agent,via,x-client-trace-id,x-envoy-attempt-count,x-envoy-decorator-operation,x-envoy-downstream-service-cluster,x-envoy-downstream-service-node,x-envoy-expected-rq-timeout-ms,x-envoy-external-address,x-envoy-force-trace,x-envoy-hedge-on-per-try-timeout,x-envoy-internal,x-envoy-ip-tags,x-envoy-is-timeout-retry,x-envoy-max-retries,x-envoy-original-path,x-envoy-original-url,x-envoy-retriable-header-names,x-envoy-retriable-status-codes,x-envoy-retry-grpc-on,x-envoy-retry-on,x-envoy-upstream-alt-stat-name,x-envoy-upstream-rq-per-try-timeout-ms,x-envoy-upstream-rq-timeout-alt-response,x-envoy-upstream-rq-timeout-ms,x-envoy-upstream-stream-duration-ms,x-forwarded-client-cert,x-forwarded-for,x-forwarded-host,x-forwarded-port,x-forwarded-proto,x-ot-span-context,x-request-id
[2023-08-15 07:17:32.034][14][info][main] [source/server/server.cc:459]   request trailer map: 120 bytes:
[2023-08-15 07:17:32.034][14][info][main] [source/server/server.cc:459]   response header map: 432 bytes: :status,access-control-allow-credentials,access-control-allow-headers,access-control-allow-methods,access-control-allow-origin,access-control-allow-private-network,access-control-expose-headers,access-control-max-age,age,cache-control,connection,content-encoding,content-length,content-type,date,etag,expires,grpc-message,grpc-status,keep-alive,last-modified,location,proxy-connection,proxy-status,server,transfer-encoding,upgrade,vary,via,x-envoy-attempt-count,x-envoy-decorator-operation,x-envoy-degraded,x-envoy-immediate-health-check-fail,x-envoy-ratelimited,x-envoy-upstream-canary,x-envoy-upstream-healthchecked-cluster,x-envoy-upstream-service-time,x-request-id
[2023-08-15 07:17:32.034][14][info][main] [source/server/server.cc:459]   response trailer map: 144 bytes: grpc-message,grpc-status
[2023-08-15 07:17:32.128][14][info][main] [source/server/server.cc:819] runtime: layers:
  - name: static_layer_0
    static_layer:
      envoy.features.enable_all_deprecated_features: true
      envoy.reloadable_features.http_set_tracing_decision_in_request_id: true
      envoy.reloadable_features.tcp_pool_idle_timeout: true
      envoy.reloadable_features.sanitize_original_path: true
      re2.max_program_size.error_level: 1000
      envoy.reloadable_features.no_extension_lookup_by_name: true
  - name: admin_layer
    admin_layer:
      {}
[2023-08-15 07:17:32.129][14][info][admin] [source/server/admin/admin.cc:67] admin address: /tmp/envoy_admin.sock
[2023-08-15 07:17:32.129][14][info][config] [source/server/configuration_impl.cc:131] loading tracing configuration
[2023-08-15 07:17:32.129][14][info][config] [source/server/configuration_impl.cc:91] loading 0 static secret(s)
[2023-08-15 07:17:32.130][14][info][config] [source/server/configuration_impl.cc:97] loading 0 cluster(s)
[2023-08-15 07:17:32.131][14][info][config] [source/server/configuration_impl.cc:101] loading 0 listener(s)
[2023-08-15 07:17:32.131][14][info][config] [source/server/configuration_impl.cc:113] loading stats configuration
[2023-08-15 07:17:32.131][14][info][runtime] [source/common/runtime/runtime_impl.cc:463] RTDS has finished initialization
[2023-08-15 07:17:32.131][14][info][upstream] [source/common/upstream/cluster_manager_impl.cc:222] cm init: initializing cds
[2023-08-15 07:17:32.132][14][warning][main] [source/server/server.cc:794] there is no configured limit to the number of allowed active connections. Set a limit via the runtime key overload.global_downstream_max_connections
[2023-08-15 07:17:32.132][14][info][main] [source/server/server.cc:915] starting main dispatch loop
[2023-08-15 07:17:32.132][14][warning][config] [./source/common/config/grpc_stream.h:163] StreamAggregatedResources gRPC config stream to unix:///var/run/ecs/relay/envoy_xds.sock closed: 13,
[2023-08-15 07:17:32.319][14][warning][config] [./source/common/config/grpc_stream.h:163] StreamAggregatedResources gRPC config stream to unix:///var/run/ecs/relay/envoy_xds.sock closed: 13,
[2023-08-15 07:17:32.914][14][warning][config] [./source/common/config/grpc_stream.h:163] StreamAggregatedResources gRPC config stream to unix:///var/run/ecs/relay/envoy_xds.sock closed: 13,
[2023-08-15 07:17:34.741][14][warning][config] [./source/common/config/grpc_stream.h:163] StreamAggregatedResources gRPC config stream to unix:///var/run/ecs/relay/envoy_xds.sock closed: 13,
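
The difference from the successful run is the repeated `closed: 13` warnings on the xDS stream. In the gRPC status code table, 13 is INTERNAL. As a rough sketch (not part of the agent), the snippet below extracts these warnings from the Envoy log and computes the gap between consecutive attempts; the regular expression is an assumption based only on the line format shown above.

```python
import re
from datetime import datetime

# Subset of the gRPC status code table, enough to label common closures.
GRPC_STATUS = {0: "OK", 4: "DEADLINE_EXCEEDED", 13: "INTERNAL", 14: "UNAVAILABLE"}

# Assumed pattern, derived from the warning lines in this issue.
STREAM_CLOSED = re.compile(
    r"^\[(?P<ts>[\d-]+ [\d:.]+)\].*"
    r"gRPC config stream to (?P<target>\S+) closed: (?P<code>\d+)"
)

PREFIX = "[warning][config] [./source/common/config/grpc_stream.h:163] StreamAggregatedResources gRPC config stream to unix:///var/run/ecs/relay/envoy_xds.sock closed: 13,"
SAMPLE_LINES = [
    "[2023-08-15 07:17:32.132][14][info][main] [source/server/server.cc:915] starting main dispatch loop",
    "[2023-08-15 07:17:32.132][14]" + PREFIX,
    "[2023-08-15 07:17:32.319][14]" + PREFIX,
    "[2023-08-15 07:17:32.914][14]" + PREFIX,
    "[2023-08-15 07:17:34.741][14]" + PREFIX,
]


def stream_closures(lines):
    """Return one record per 'gRPC config stream ... closed' log line."""
    records = []
    for line in lines:
        match = STREAM_CLOSED.match(line)
        if not match:
            continue
        code = int(match.group("code"))
        records.append({
            "time": datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f"),
            "target": match.group("target"),
            "code": code,
            "status": GRPC_STATUS.get(code, "UNKNOWN_TO_THIS_SCRIPT"),
        })
    return records


def retry_gaps(records):
    """Seconds between consecutive closures, rounded to milliseconds."""
    times = [r["time"] for r in records]
    return [round((b - a).total_seconds(), 3) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    closures = stream_closures(SAMPLE_LINES)
    print(len(closures), closures[0]["status"], retry_gaps(closures))
```

For the four warnings above, the gaps grow from attempt to attempt, which suggests Envoy keeps retrying the stream while the socket at `/var/run/ecs/relay/envoy_xds.sock` keeps closing it.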

Environment Details

docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc., 0.0.0+unknown)

Server:
 Containers: 14
  Running: 7
  Paused: 0
  Stopped: 7
 Images: 5
 Server Version: 20.10.23
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 1e1ea6e986c6c86565bc33d52e34b81b3e2bc71f
 runc version: f19387a6bec4944c770f7668ab51c4348d9c2f38
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.14.320-242.534.amzn2.x86_64
 Operating System: Amazon Linux 2
 OSType: linux
 Architecture: x86_64
 CPUs: 1
 Total Memory: 970.2MiB
 Name: ip-172-14-3-154.ec2.internal
 ID: CWLP:UUAF:NIN3:WZHK:SEHB:YBDL:HUHO:PS57:2QBS:JPMW:4XE2:J2L7
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

karanvasnani commented 1 year ago

Thanks for opening this bug, @abespalko. As the logs indicate, the SC Agent becomes unhealthy here because it fails to reach another component on the instance, the Relay agent, which sits on the connection path to our management service.

We need to investigate whether there was an issue with the agents on the instance. Could you collect logs from the instance and check whether anything is evident there? You can use these instructions to collect all the logs: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-logs-collector.html. You can share them with us through Premium Support or by email at ecs-service-connect-agent-external@amazon.com.

We also noticed that your host has 1 vCPU and is running 7 containers, so there may be contention. Please refer to this doc for the recommended CPU and memory allocation values: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-connect-concepts.html#service-connect-concepts-proxy
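
To make the density on the host concrete, here is a small sketch that reads the relevant fields from `docker info` text output and reports containers per CPU and memory per running container. It is only arithmetic on the reported values: it applies no threshold and does not by itself show contention. The function name and the returned keys are illustrative.

```python
import re

# Excerpt of the `docker info` output posted in this issue.
SAMPLE_INFO = """\
Server:
 Containers: 14
  Running: 7
  Paused: 0
  Stopped: 7
 CPUs: 1
 Total Memory: 970.2MiB
"""

# Conversion factors to MiB for the units handled by this sketch.
UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}


def host_density(docker_info_text):
    """Compute running containers per CPU and MiB per running container."""
    fields = {}
    for line in docker_info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:
            # Keep the first occurrence of a key (some keys repeat in the output).
            fields.setdefault(key.strip(), value.strip())

    running = int(fields["Running"])
    cpus = int(fields["CPUs"])
    mem = re.fullmatch(r"([\d.]+)\s*(MiB|GiB)", fields["Total Memory"])
    if mem is None:
        raise ValueError("unexpected memory format: " + fields["Total Memory"])
    memory_mib = float(mem.group(1)) * UNIT_TO_MIB[mem.group(2)]

    return {
        "running": running,
        "cpus": cpus,
        "memory_mib": memory_mib,
        "containers_per_cpu": running / cpus,
        "mib_per_container": round(memory_mib / running, 1) if running else None,
    }


if __name__ == "__main__":
    print(host_density(SAMPLE_INFO))
```

For the output above, this gives 7 running containers on 1 CPU and roughly 139 MiB of host memory per running container, before counting the OS and the ECS agents themselves.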