knechtionscoding opened 1 year ago
This looks cool, I'll be happy to review a PR adding support for this.
@jpkrohling If you can point me in the direction of the code that makes the DNS requests in the OTel Collector, I can work on delivering a PR for it.
Sure, here's a short summary:
That's pretty much it. The current DNS resolver (linked below) just uses the TCP stack to resolve the IP for a name, so it's not directly making a DNS call. You'll likely need to use a Go library to perform an SRV query. Unfortunately, I can't recommend one, as I haven't used one yet.
For reference, here's the DNS resolver: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/loadbalancingexporter/resolver_dns.go
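For the SRV query itself, Go's standard library may already be sufficient, so a third-party library might not be needed at all. A minimal sketch (the record name matches the hypothetical otel-sampler example discussed later in this thread):

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// With empty service and proto, LookupSRV queries the name as given,
	// so a fully formed _service._proto.host name can be passed directly.
	_, records, err := net.DefaultResolver.LookupSRV(ctx, "", "",
		"_lb._tcp.otel-sampler-headless.default.svc.cluster.local")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	for _, r := range records {
		// r.Target is the backend hostname; r.Port is the advertised port.
		fmt.Printf("%s:%d\n", r.Target, r.Port)
	}
}
```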
It looks like the service here is located in a k8s cluster, and the k8s resolver (PR #22776) should be better suited to this scenario.
While the Kubernetes service resolver would solve the problem for Kubernetes deployments, an SRV record can be used outside of Kubernetes as well. I'll keep this issue open to gauge interest from other folks.
> It looks like the service here is located in a k8s cluster, and the k8s resolver (PR #22776) should be better suited to this scenario.
@fyuan1316 The k8s resolver actually has the same problem: it looks at Endpoints, which are Pod IPs. Istio requires a DNS name that is in the service catalog for it to work. If there were a way via the Kubernetes API to take a headless service and return the pod-based service names that get created, then it could work.
The static resolver might actually be a potential solution as well. If you were to add some sort of "resolve interval" option to it, allowing for more than one resolution, that would also work (see the sketch below).
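For illustration, the hypothetical option could look something like this (`interval` does not exist on the static resolver today; hostnames are placeholders):

```yaml
resolver:
  static:
    hostnames:
      - otel-sampler-0.otel-sampler-headless.default.svc.cluster.local:4317
      - otel-sampler-1.otel-sampler-headless.default.svc.cluster.local:4317
    # hypothetical: re-resolve the listed names every 30s instead of only once
    interval: 30s
```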
@jpkrohling Really appreciate the guidance. I've been working through a PR and don't foresee any issues implementing a resolver, but I am worried it might require changes to loadbalancer.go.
When using an SRV record and your backends are a StatefulSet, there are two different changes to monitor:
Excuse the example if it's unnecessary, but if we have a StatefulSet called otel-sampler with 3 replicas and a headless Service called otel-sampler-headless with a port called lb, then we'd have an SRV record of _lb._tcp.otel-sampler-headless.<namespace>.svc.cluster.local. This would resolve to:
otel-sampler-0.otel-sampler-headless.<namespace>.svc.cluster.local
otel-sampler-1.otel-sampler-headless.<namespace>.svc.cluster.local
otel-sampler-2.otel-sampler-headless.<namespace>.svc.cluster.local
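For concreteness, here is a minimal sketch of the objects described above (namespace, labels, and image are placeholders):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: otel-sampler-headless
spec:
  clusterIP: None            # headless: DNS returns per-pod records
  selector:
    app: otel-sampler
  ports:
    - name: lb               # named port -> the _lb._tcp.<service>... SRV record
      port: 4317
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: otel-sampler
spec:
  serviceName: otel-sampler-headless   # gives each pod a stable DNS hostname
  replicas: 3
  selector:
    matchLabels:
      app: otel-sampler
  template:
    metadata:
      labels:
        app: otel-sampler
    spec:
      containers:
        - name: collector
          image: otel/opentelemetry-collector-contrib   # placeholder image
          ports:
            - containerPort: 4317
```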
When one of these pods is restarted it'll get a new IP, so we'll need to send any changed records to be re-resolved by the TCP stack. Feel free to correct me here, but I think we have two problems with this. First, the hash ring will be the same, and so we won't kick off these two methods. Even if we did something like "on IP change, shuffle the endpoints so a new hash is made" (which feels like it could have its own issues), we'd still run into issues inside of addMissingExporters: it examines each endpoint to figure out what is new, and nothing will appear as new.
I don't really have a good idea of how to fix this, but I think conceptually it would take implementing some sort of restart concept. If the hash rings are the same but a restart flag is present on the ring, you still need to proceed. If an endpoint already exists but a restart flag exists on the endpoint, then that exporter needs to be restarted. I think you can achieve this by inserting a method like removeRestartingExporters right before addMissingExporters, as this would stop the exporter and make it be listed as "missing", and addMissingExporters would start it back up.
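A rough sketch of that ordering, with illustrative types rather than the real loadbalancer.go internals (removeRestartingExporters and the restart flag are hypothetical):

```go
package loadbalancer

import "context"

// exporter stands in for the component the load balancer manages per endpoint.
type exporter interface {
	Shutdown(ctx context.Context) error
}

type loadBalancer struct {
	exporters map[string]exporter
}

// removeRestartingExporters (hypothetical) shuts down exporters whose backend
// carries a restart flag, right before addMissingExporters runs, so they are
// treated as "missing" and re-created with a fresh DNS resolution.
func (lb *loadBalancer) removeRestartingExporters(ctx context.Context, restarted []string) {
	for _, endpoint := range restarted {
		if exp, ok := lb.exporters[endpoint]; ok {
			_ = exp.Shutdown(ctx) // drop the stale connection to the old pod IP
			delete(lb.exporters, endpoint)
		}
	}
}
```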
Is this just a matter of forcing a new DNS resolution for those hosts? If so, the new resolver can keep a map of hosts and last-seen IPs, and do a DNS lookup (a simple LookupIPAddr should be sufficient) when changes are detected. This is exclusive to the business of this resolver; I don't think we need a more generic solution at the load balancer level.
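A minimal sketch of that idea, assuming the resolver tracks one last-seen IP per hostname (the function name and change handling are illustrative):

```go
package main

import (
	"context"
	"fmt"
	"net"
)

// changedHosts re-resolves each known hostname and reports the ones whose
// first returned IP differs from the last one seen, updating the map.
func changedHosts(ctx context.Context, lastSeen map[string]string) []string {
	var changed []string
	for host, prev := range lastSeen {
		addrs, err := net.DefaultResolver.LookupIPAddr(ctx, host)
		if err != nil || len(addrs) == 0 {
			continue // for this sketch, treat lookup failures as "no change"
		}
		if cur := addrs[0].IP.String(); cur != prev {
			lastSeen[host] = cur
			changed = append(changed, host)
		}
	}
	return changed
}

func main() {
	hosts := map[string]string{
		"otel-sampler-0.otel-sampler-headless.default.svc.cluster.local": "",
	}
	fmt.Println("changed:", changedHosts(context.Background(), hosts))
}
```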
I'm not sure, but I think for istio/envoy to work it would just require adding a Host header with the headless service hostname (e.g. otel-sampler-headless.<namespace>.svc.cluster.local). I believe this would work regardless of whether it's a Deployment or a StatefulSet.
We hit the same issue on our side, but it seems like setting mTLS mode PERMISSIVE made it work with pod-to-pod IP. That said, we do want mTLS, so we will be trying the StatefulSet workaround... although it doesn't seem to work yet.
> Is this just a matter of forcing a new DNS resolution for those hosts? If so, the new resolver can keep a map of hosts and last-seen IPs, and do a DNS lookup (a simple LookupIPAddr should be sufficient) when changes are detected. This is exclusive to the business of this resolver; I don't think we need a more generic solution at the load balancer level.
It is a matter of forcing new resolution; however, we need to be specific about how the resolution is done. I don't know if I'm using the right terminology here when I say "TCP stack" or "OS", but here is an example. As @gravufo mentioned, setting the sampler to PERMISSIVE works, and that's how we currently get around it, so here is an example with prometheus instead:
```shell
# Using CNAME for demonstration purposes only. Normally would be an SRV
$ dig '_http-web._tcp.prometheus-operated.monitoring.svc.cluster.local' | grep -A3 'ANSWER SECTION' | tail -n 3
_http-web._tcp.prometheus-operated.monitoring.svc.cluster.local. 30 IN CNAME prometheus-main-0.prometheus-operated.monitoring.svc.cluster.local.
prometheus-main-1.prometheus-operated.monitoring.svc.cluster.local. 30 IN A 10.24.129.68
prometheus-main-0.prometheus-operated.monitoring.svc.cluster.local. 30 IN A 10.24.134.21

# Using a hostname instead of an IP works
$ curl -s prometheus-main-0.prometheus-operated.monitoring.svc.cluster.local:9090/api/v1/status/tsdb | jq -r .status
success

# Using an IP fails
$ curl -s 10.24.134.21:9090/api/v1/status/tsdb
upstream connect error or disconnect/reset before headers. reset reason: connection termination
```
Hopefully that makes sense. What we want to send to onBackendChanges are the hostnames and not the IPs. However, the hostnames will almost never change.
One solution I thought of, which I think avoids changing the LB, is calling the callback twice: the first time with the restarting endpoints omitted, and the second time as normal (sketched below).
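A sketch of the double-callback idea (the function name and callback shape are illustrative, not the exporter's actual API):

```go
package resolver

// notifyTwice (hypothetical) first reports the backend list without the
// restarting endpoints, so their exporters are shut down, then reports the
// full list, so they reappear as "missing" and are re-created.
func notifyTwice(onBackendChanges func([]string), all, restarting []string) {
	restartSet := make(map[string]struct{}, len(restarting))
	for _, e := range restarting {
		restartSet[e] = struct{}{}
	}
	withoutRestarting := make([]string, 0, len(all))
	for _, e := range all {
		if _, ok := restartSet[e]; !ok {
			withoutRestarting = append(withoutRestarting, e)
		}
	}
	onBackendChanges(withoutRestarting) // pass 1: drop restarted backends
	onBackendChanges(all)               // pass 2: re-add them with fresh exporters
}
```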
Here's my branch before writing tests (or testing, for that matter).
I think there are two valid requests in this ticket at this point:
1) set the Host header on outgoing calls
2) support SRV records
I would treat the first as a bug; fixing it should enable mTLS with Istio usage. The second is a feature request, for which I would love to review a PR.
@jpkrohling I'm assuming I am only responsible for #2? I'm hoping to have a PR in the next week or so.
The linked PR was closed without being merged. I believe it was useful in showing that we need to think about a more complete design for the solution as a whole.
Apologies for the delay, @jpkrohling. I haven't been able to get back to the PR, and since writing the code we've been able to simply run the fork, so it's hard to prioritize. That being said, unless I missed something, I was waiting for the person who reviewed it after you to re-review.
I'm okay with keeping it closed and waiting for the larger-scoped SRV implementation, as long as the feature of only doing one resolution (SRV to A, but not A to IP address) remains. Do you think that would be acceptable?
Absolutely! I feel like a refactor of the load-balancing exporter is overdue, especially after batching is added to exporters natively. We can use that opportunity to see what we need to change to support this use-case more cleanly.
@jpkrohling I had one other thought on how to fix this and before I go and ask for time in my July or August sprint to tackle this I figured I'd run it by you.
I noticed that headless Endpoints list the hostname in their addresses (at least that's true when it's the specified Service on a StatefulSet). Rather than trying to make a new resolver, what do you think of just adding an optional config on the k8s resolver which returns hostnames if the conditions are right?
> what do you think of just adding an optional config on the k8s resolver which returns hostnames if the conditions are right?
That sounds like a good idea, and without double-checking the code, that's what I would intuitively expect this resolver to do.
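A sketch of what that option could look like, using the core/v1 Endpoints types (returnHostnames is the hypothetical new config flag):

```go
package resolver

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// backendNames returns pod IPs by default; with the hypothetical
// returnHostnames option it returns the stable per-pod DNS names that
// headless Services backing a StatefulSet publish in EndpointAddress.Hostname.
func backendNames(ep *corev1.Endpoints, returnHostnames bool) []string {
	var out []string
	for _, subset := range ep.Subsets {
		for _, addr := range subset.Addresses {
			if returnHostnames && addr.Hostname != "" {
				// e.g. otel-sampler-0.otel-sampler-headless.<ns>.svc.cluster.local
				out = append(out, fmt.Sprintf("%s.%s.%s.svc.cluster.local",
					addr.Hostname, ep.Name, ep.Namespace))
			} else {
				out = append(out, addr.IP)
			}
		}
	}
	return out
}
```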
To summarize, and hopefully help someone who comes to this thread via a Google search:
Currently (e.g. an OTel load-balancing collector shipping to a backend collector), if Istio is involved (e.g. an Istio sidecar injected into the backend service with mTLS mode STRICT), istio-proxy will interrupt the connection and the OTel backend collector will not be able to establish connections properly.
The Istio community already has an issue tracking this: https://github.com/istio/istio/issues/37431
The Istio community has a long-term solution: https://github.com/istio/istio/issues/37431#issuecomment-1412831780
Ambient mesh (https://istio.io/latest/blog/2022/introducing-ambient-mesh/) will support direct-to-pod traffic.
For service-invocation scenarios, there is a short-term workaround: https://github.com/istio/istio/issues/37431#issuecomment-1046205209 (this solution is required for HTTP-like calls, in which case you need to pass the headers).
Additionally, if it is possible to set mTLS to a non-STRICT mode, the above issue does not occur.
Not sure if a comment is required to keep this from being marked stale, but I submitted a PR which resolves the overall user story.
Using the k8s resolver will work for most, if not all, scenarios. I also agree with the requests made on the SRV solution I worked up. This is a Kubernetes solution for a Kubernetes problem and will not impact non-k8s environments.
Component(s)
exporter/loadbalancing
Is your feature request related to a problem? Please describe.
I discovered, when trying to configure an OTel collector to use the loadbalancing exporter to ship to another OTel collector deployment, that it fails when Istio is involved.
Currently, DNS requests made for loadbalancing fail to work with Istio services because the exporter is doing an A record lookup rather than accepting an SRV record.
When trying to set up a loadbalancing exporter to talk to a k8s headless service, it makes an A record lookup, which means it uses the Pod IP address as the host name, and that fails because Istio doesn't allow routing via pod IPs.
This is true even if you use the pod hostname:
Istio proxy logs from the request being made:
Example of loadbalancing config:
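The original snippet isn't reproduced here, but a representative config of the shape being described looks roughly like this (hostname and port are placeholders):

```yaml
exporters:
  loadbalancing:
    protocol:
      otlp:
        tls:
          insecure: true
    resolver:
      dns:
        hostname: otel-sampler-headless.default.svc.cluster.local
        port: 4317
```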
It works properly if you configure the receiving otel-collector as a StatefulSet and then use each pod name, because the SRV record will come back matching.
Describe the solution you'd like
Support an option similar to https://thanos.io/tip/thanos/service-discovery.md/#dns-service-discovery, where you can set +dnssrvnoa, to allow us to use Istio with a Deployment of the receiving otel-collector.
https://github.com/thanos-io/thanos/commit/432785e77e3cf68af44b0c85bb7d0b68e1ef46ec is the thanos code that does this.
The general ask is that the loadbalancing configuration does the SRV resolution and then, from there, acts as if it is filling in the static section.
Showing the difference (illustrative Thanos-style addresses below): notice the SRV form of the second entry, which is what OTel should accept as an alternative to the A record.
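These names are placeholders; dnssrvnoa+ is Thanos's prefix for an SRV lookup without the follow-up A/AAAA lookup, per the docs linked above:

```
dns+otel-sampler-headless.default.svc.cluster.local:4317
dnssrvnoa+_lb._tcp.otel-sampler-headless.default.svc.cluster.local
```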
Describe alternatives you've considered
Running as a StatefulSet works, but doesn't autoscale properly. Right now we have to make do with manually listing out the pods in the StatefulSet:
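The workaround config looks roughly like this (a sketch with placeholder names, using the static resolver):

```yaml
exporters:
  loadbalancing:
    protocol:
      otlp:
        tls:
          insecure: true
    resolver:
      static:
        hostnames:
          - otel-sampler-0.otel-sampler-headless.default.svc.cluster.local:4317
          - otel-sampler-1.otel-sampler-headless.default.svc.cluster.local:4317
          - otel-sampler-2.otel-sampler-headless.default.svc.cluster.local:4317
```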
Additional context
No response