Open cyclops23 opened 6 months ago
Hi @cyclops23! Apologies for the long delay in responding to this. Unfortunately I wasn't able to reproduce what you're seeing. I cloned the repository you linked to, ran the deploy script, and got the following in the logs for the terminating gateway proxy:
[2024-06-25 19:31:44.406][1][info][main] [source/server/server.cc:934] starting main dispatch loop
[2024-06-25 19:31:44.409][1][info][upstream] [source/common/upstream/cds_api_helper.cc:32] cds: add 1 cluster(s), remove 0 cluster(s)
[2024-06-25 19:31:44.466][1][info][upstream] [source/common/upstream/cds_api_helper.cc:71] cds: added/updated 1 cluster(s), skipped 0 unmodified cluster(s)
[2024-06-25 19:31:44.467][1][info][upstream] [source/common/upstream/cluster_manager_impl.cc:226] cm init: all clusters initialized
[2024-06-25 19:31:44.467][1][info][main] [source/server/server.cc:915] all clusters initialized. initializing init manager
[2024-06-25 19:31:44.470][1][info][upstream] [source/extensions/listener_managers/listener_manager/lds_api.cc:99] lds: add/update listener 'default:0.0.0.0:24076'
[2024-06-25 19:31:44.470][1][info][config] [source/extensions/listener_managers/listener_manager/listener_manager_impl.cc:923] all dependencies initialized. starting workers
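For comparing a working gateway against a failing one, it can help to reduce logs like these to just the xDS activity. The following is a minimal sketch, not part of the reproduction: it assumes Envoy's default log prefix as shown above, and the helper name `summarize_xds` is hypothetical.

```python
import re

# Matches the log prefix seen above, e.g.
# [2024-06-25 19:31:44.409][1][info][upstream] [source/...:32] message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[\d+\]\[(?P<level>\w+)\]\[(?P<component>\w+)\] "
    r"\[(?P<source>[^\]]+)\] (?P<msg>.*)$"
)
# Message formats copied from the log lines above.
CDS_RE = re.compile(r"cds: add (\d+) cluster\(s\), remove (\d+) cluster\(s\)")
LDS_RE = re.compile(r"lds: add/update listener '([^']+)'")


def summarize_xds(lines):
    """Hypothetical helper: count CDS cluster changes and collect LDS listeners."""
    summary = {"clusters_added": 0, "clusters_removed": 0, "listeners": []}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not an Envoy log line
        msg = match.group("msg")
        cds = CDS_RE.search(msg)
        if cds:
            summary["clusters_added"] += int(cds.group(1))
            summary["clusters_removed"] += int(cds.group(2))
        lds = LDS_RE.search(msg)
        if lds:
            summary["listeners"].append(lds.group(1))
    return summary
```

Feeding it the proxy's log lines should report at least one added cluster on a healthy gateway; a summary with zero clusters added would match the symptom described in the issue.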
Then I ran the following job to act as a test client (I've skipped using transparent proxy here but that should work as well):
That allocation starts up just fine, and I'm able to curl DynamoDB via the upstream:
$ nomad alloc exec -task task 83fd /bin/sh
~ $ curl localhost:8080
healthy: dynamodb.us-east-1.amazonaws.com ~ $ ^C
I'd have you check the Nomad server logs to see what happened when it registered the gateway, but I can see from your consul config read output that everything looks as I'd expect. Here's what mine looks like (with Consul Enterprise):
$ consul config read -kind terminating-gateway -name ext-dynamodb-tgw
{
    "Kind": "terminating-gateway",
    "Name": "ext-dynamodb-tgw",
    "Services": [
        {
            "Namespace": "default",
            "Name": "ext-dynamodb",
            "CAFile": "/etc/ssl/certs/Amazon_Root_CA_1.pem",
            "SNI": "dynamodb.us-east-1.amazonaws.com"
        }
    ],
    "CreateIndex": 1168,
    "ModifyIndex": 1168,
    "Partition": "default",
    "Namespace": "default"
}
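If you want to run the same comparison on your side, here is a rough sketch that checks the JSON printed by consul config read for the two things I looked at: the entry kind and whether the external service is linked. The function name `check_gateway_entry` is made up for illustration, and it only inspects the fields visible in the output above.

```python
import json


def check_gateway_entry(raw, service_name):
    """Hypothetical check: return a list of problems found in a config entry.

    `raw` is the JSON text printed by `consul config read`;
    `service_name` is the external service expected under "Services".
    """
    entry = json.loads(raw)
    problems = []
    if entry.get("Kind") != "terminating-gateway":
        problems.append(f"unexpected Kind: {entry.get('Kind')!r}")
    services = entry.get("Services") or []
    linked = [svc for svc in services if svc.get("Name") == service_name]
    if not linked:
        problems.append(f"service {service_name!r} is not linked to the gateway")
    return problems
```

An empty list means the entry has the expected kind and links the service; it says nothing about whether the CA file or SNI values are correct for the upstream.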
At this point I feel pretty confident that Nomad has configured the gateway as you've requested. I'm going to transfer this issue over to the Consul repository, in hopes that folks there will have a better handle on where to look next.
Nomad version
Consul version
Operating system and Environment details
Issue
I'm attempting to set up a terminating gateway for DynamoDB. The Envoy proxy starts successfully, but the dynamic cluster representing the terminating gateway service is never added.
In the Consul and Nomad UIs everything looks good, but the external service is not accessible through the service mesh.
Reproduction steps
I've uploaded the relevant configuration files to https://github.com/cyclops23/nomad-bug-tgw
Expected Result
The external service should be accessible through the terminating gateway (or some meaningful error message should be provided if there is a problem with the configuration).
I expect the dynamic cluster representing the external service to be added to Envoy, as in this example:
Actual Result
Requests to the external service are routed to the terminating gateway and fail.
Inspecting the Envoy logs shows that the cluster for the gateway is never added via xDS:
Additional config / debug info
From the agent where the terminating gateway is running:
Please let me know if there are additional debugging steps you can suggest, or if you need more information on the issue.
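One further check that might narrow this down is asking Envoy directly which clusters it currently knows about, using its admin endpoint `/clusters?format=json`. The sketch below is only an illustration: the admin address passed to `fetch_clusters` is an assumption that depends on how the gateway task is bootstrapped, the function names are hypothetical, and matching on a substring of the cluster name is a guess at how the service name appears there.

```python
import json
import urllib.request


def cluster_names(payload):
    """Extract cluster names from an Envoy /clusters?format=json response."""
    return [c.get("name", "") for c in payload.get("cluster_statuses", [])]


def has_cluster_for(payload, service_name):
    """Assumption: the service's cluster name contains the service name."""
    return any(service_name in name for name in cluster_names(payload))


def fetch_clusters(admin_url):
    """Fetch the cluster list from an Envoy admin endpoint.

    `admin_url` is an assumed base URL such as "http://127.0.0.1:19000";
    the real address depends on the proxy's bootstrap configuration.
    """
    url = f"{admin_url}/clusters?format=json"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)
```

Running `has_cluster_for(fetch_clusters(admin_url), "ext-dynamodb")` from inside the gateway's network namespace would return False if the cluster was never delivered via xDS, which is the behavior described above.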