Closed technicianted closed 1 year ago
Found ongoing work in #40136
Looks like there is a serious issue with the current grpc-agent where TLS certificates are not renewed: #38923.
For now I worked around this by creating a custom injection template that merges both the grpc-agent and sidecar templates. I worked around #38923 by having istio-sidecar set OUTPUT_CERTS in addition to grpc-agent, while having the latter dump its certs in a temporary volume.
Unfortunately now I have to run 2 side-car containers per pod until these issues are resolved.
@costinm in istio, can we implement a name resolver, enable the mixed mode by default, and keep only one template if possible?
I believe the same bootstrap file can be configured for both the proxyless and proxy solutions, and the name resolver can decide, based on the target scheme (for example xds://), whether the request goes via proxyless or via the proxy.
GCP's managed Traffic Director implements the same thing.
@technicianted Hi, I'm very interested in the case of running 2 side-car containers (envoy + proxyless grpc) per pod, but I ran into a problem with traffic redirection. Could you tell me how you changed the istio-iptables setup? Thanks!
If you mean traffic redirection of xds based requests, you can simply add necessary annotations to exclude them by either port or IP addresses.
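As a rough sketch of the annotations mentioned above: Istio's sidecar injector reads pod annotations that take ports or IP ranges out of iptables redirection. The port number and CIDR below are placeholders, not values from this thread.

```yaml
# Pod template metadata (fragment). Port and CIDR values are placeholders.
metadata:
  annotations:
    # Outbound traffic to these destination ports bypasses the Envoy sidecar.
    traffic.sidecar.istio.io/excludeOutboundPorts: "50051"
    # Inbound traffic to these ports bypasses the Envoy sidecar.
    traffic.sidecar.istio.io/excludeInboundPorts: "50051"
    # Alternatively, exclude outbound traffic by destination IP range.
    traffic.sidecar.istio.io/excludeOutboundIPRanges: "10.0.0.0/24"
```

Excluding by port is usually simpler when the gRPC services share a well-known port; excluding by IP range fits better when they live in a dedicated address block.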
@technicianted can you give an example of how to run the two setups together?
Our use case is that some services use grpc and some use http.
Now if we use envoy proxy here for grpc and http requests, the pods take extra CPU. In our case the proxy's CPU usage is almost half of what the application itself takes, which is quite expensive.
We are hence migrating our services to grpc so that we can use the grpc proxyless setup, reduce resource usage, and still have the load balancing and security capabilities that istio provides.
Currently this is possible by using an xds endpoint and adding a template annotation. The catch is that if we use the grpc-agent template, envoy proxy is not initialised properly and hence the traffic just passes through.
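For reference, a minimal sketch of the template annotation described here, assuming the stock grpc-agent template name. The Deployment name, labels, and image are hypothetical, and the holdApplicationUntilProxyStarts setting is taken from Istio's proxyless gRPC guidance rather than from this thread.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-grpc-client            # hypothetical name
spec:
  selector:
    matchLabels:
      app: example-grpc-client
  template:
    metadata:
      labels:
        app: example-grpc-client
      annotations:
        # Select the proxyless gRPC injection template instead of the default sidecar.
        inject.istio.io/templates: grpc-agent
        # Delay the application until the agent is ready to serve xds.
        proxy.istio.io/config: '{"holdApplicationUntilProxyStarts": true}'
    spec:
      containers:
      - name: app
        image: example.invalid/grpc-client:latest   # hypothetical image
```

With this template alone there is no Envoy in the pod, which matches the pass-through behaviour described for non-xds traffic.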
So we wanted to set it up in such a way that when an xds endpoint is used it goes through the grpc proxyless setup, and otherwise through envoy proxy as directed by the annotations. That gives us time to migrate to grpc while still realizing the benefits of istio.
Also, given that there can be certain external services, we will never be able to migrate everything to grpc, so this will be needed anyway.
It should be possible - but we have not tested or documented this.
With the normal sidecar, the agent generates the grpc bootstrap by default. There are some settings to also save the certificates to file (they can be added in a few different ways), and settings to exclude specific ports from interception.
What is missing are docs and tests - everyone is pretty busy and so far this has not been a frequent request, but the design and implementation did consider it.
@anuragagarwal561994 we are running the same exact setup. Before jumping into details you should be aware that grpc-agent side car is broken due to #38923. So our work included two things:
For (1), we created a new template grpc-mixed where we run the two containers. We needed a minor change to avoid port conflicts between the two side cars, specifically setting PROXY_XDS_DEBUG_VIA_AGENT=false.
For (2) it was a bit tricky. We needed grpc-agent for xds, but we needed istio-sidecar to provide certs and proxying for non-grpc-xds traffic. This required fiddling with volume injection and changing OUTPUT_CERTS such that certificates from grpc-agent are ignored and the ones from istio-sidecar are picked up by the grpc xds runtime.
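A minimal sketch of the two settings named above, written as container env entries in a custom injection template. This is not the attached file: the container names, the certificate paths, the volume name, and which container carries PROXY_XDS_DEBUG_VIA_AGENT are all assumptions made for illustration.

```yaml
# Fragment of a hypothetical mixed template; names and paths are assumptions.
containers:
- name: istio-proxy                # Envoy sidecar (container name assumed)
  env:
  # Write workload certificates where the gRPC xds runtime will read them.
  - name: OUTPUT_CERTS
    value: /var/lib/istio/data     # assumed path
  volumeMounts:
  - name: workload-certs           # assumed volume, also mounted in the app container
    mountPath: /var/lib/istio/data
- name: istio-grpc-agent           # proxyless gRPC agent (container name assumed)
  env:
  # Setting named in the thread to avoid port conflicts between the two side cars.
  # Placing it on this container is an assumption.
  - name: PROXY_XDS_DEBUG_VIA_AGENT
    value: "false"
  # Send this agent's certificates to a scratch directory so they are ignored.
  - name: OUTPUT_CERTS
    value: /tmp/agent-certs        # assumed path
```

The idea is only that the application sees one set of certificates, the ones written by the Envoy sidecar's agent, while the gRPC agent's copies land somewhere nothing reads.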
Finally, and most importantly, after doing all this, we faced serious scalability issues with the grpc xds implementation. If your istio-sidecar is already taking a lot of resources, it means you have a lot of services in its scope. This will most likely mean that your grpc runtime will end up using more resources. In our case it was around 2x what Envoy used to take.
Another scalability issue we faced was a hard-coded limit in the grpc xds runtime, which uses the default grpc max message size for xds. See grpc/grpc-go#5790. We had around 700 services in the scoped namespace. We ended up running a forked version of grpc-go to fix this.
> We are hence migrating our services to grpc so that we can use the grpc proxyless setup, reduce resource usage, and still have the load balancing and security capabilities that istio provides.
Having understood the above, I strongly suggest you evaluate resource usage of grpc xds runtime before fully committing to the migration.
> So we wanted to set it up in such a way that when an xds endpoint is used it goes through the grpc proxyless setup, and otherwise through envoy proxy as directed by the annotations.
We achieved that by excluding the destination ports of the grpc services.
@technicianted so when we say 700 services, does that mean 700 microservices or 700 pods?
What are some of the scalability issues you encountered? Maybe we can learn from them.
I am doing a POC per service, not running all of them together, but they didn't show any increase in latency or resources with GRPC proxyless compared to the setup without istio at all. Theoretically it should not either, right?
We have a lot of fan-outs and hence the side-car takes a lot of resources.
For now our clients are in Java, will check if we also have the same concern there.
Right now since we are in evaluation phase, I don't think we will require (2) here.
For this:
> For (1), we created a new template grpc-mixed where we run the two containers. We needed minor change to avoid port conflicts between the two side cars. Specifically setting PROXY_XDS_DEBUG_VIA_AGENT=false.
I tried creating a template on my own, but it was a bit confusing for me.
What I could see was that there are a few templates in the config map, and I tried to mix and match from them. But I couldn't get far; every time something else would break. Can you share the template, if that is okay with you? I have also been waiting for the above-mentioned issue to be solved in istio itself, so that we don't have to do this.
I was referring to issue https://github.com/istio/istio/pull/40136, seems like it has recently been released. Will check if this setup works now.
@technicianted I tried creating a mixed template but got stuck at one place: the node ids of both sidecars were the same, which made them conflict with each other.
> What are some of the scalability issues you encountered may be we can learn from the same.
I already explained them above, but here is the summary again:
> Can you share the template if that is okay with you
Attached, based on Istio 1.14: grpc-mixed.yaml.gz. This should solve both problems: mixed mode and broken certificate rotation.
@technicianted thanks a lot for this configuration, it is working. I understood the mistake I was making: I was trying to use the same envoy proxy because I thought my grpc client only needed the bootstrap file.
But then I came to realise what is happening and why it can't be done with only one sidecar. It helped me better understand istio and envoy setup as well.
Basically there is an xds server created in the proxy container, which envoy listens to in a normal sidecar setup. In the grpc setup, we don't just need bootstrap file generation (which now always happens since 1.16.x); we also need this XDS server to receive the events from istio.
But if I use the same proxy, my client and envoy will both listen to the same XDS server, which is exposed as a UDS socket in /etc/istio/proxy/. Hence the same node id will clash and things will not work properly.
On the other hand, we created 2 sidecars here:
Now since the two XDS servers are different, the application comes up. We just have to exclude, in envoy, the ports that are being used by the proxyless setup.
Correct me if anything is wrong with my above understanding.
Just one change I made in the template: I added the annotation sidecar.istio.io/rewriteAppHTTPProbers: "false" in the istio-grpc-agent, otherwise it was creating a healthcheck that was not working for us.
Correct
🚧 This issue or pull request has been closed due to not having had activity from an Istio team member since 2022-08-06. If you feel this issue or pull request deserves attention, please reopen the issue. Please see this wiki page for more information. Thank you for your contributions.
Created by the issue and PR lifecycle manager.
Describe the feature request
When using the grpc-agent template to migrate to proxyless gRPC, we lose the Envoy sidecar. This means that if the pod is using standard HTTP (incoming or outgoing) we'd lose the mesh capabilities. Are they mutually exclusive?
Describe alternatives you've considered
Nothing yet but perhaps it is conceivable to create custom template that runs both? Not sure if there is a technical blocker there.
Affected product area (please put an X in all that apply)
[ ] Docs
[ ] Installation
[X] Networking
[ ] Performance and Scalability
[ ] Extensions and Telemetry
[X] Security
[ ] Test and Release
[X] User Experience
[ ] Developer Infrastructure
Affected features (please put an X in all that apply)
[ ] Multi Cluster
[ ] Virtual Machine
[ ] Multi Control Plane
Additional context