Bug description

Hi,

In our setup we have deployed two namespaces, first x2 and then x3. Both are identical from a configuration and deployment perspective (of course, the namespace-specific values in the YAMLs differ accordingly); both have mTLS enabled and a headless service. We run a single Istio control plane (istio-system) and want mTLS within each namespace. To be clear, we are not trying to do mTLS between namespaces.
In the first namespace, x2, mTLS is working as expected.
```sh
$ istioctl authn tls-check galera-cluster-bb55l -n x2 | grep x2.svc
headless.x2.svc.cluster.local:3306    OK    STRICT    ISTIO_MUTUAL    x2/default    x2/default
headless.x2.svc.cluster.local:4444    OK    STRICT    ISTIO_MUTUAL    x2/default    x2/default
headless.x2.svc.cluster.local:4567    OK    STRICT    ISTIO_MUTUAL    x2/default    x2/default
headless.x2.svc.cluster.local:4568    OK    STRICT    ISTIO_MUTUAL    x2/default    x2/default
```
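For context, the per-namespace mTLS setup is essentially the following. This is a minimal sketch rather than our literal YAML (resource names are illustrative), assuming the v1alpha1 authentication Policy API of our Istio version:

```yaml
# Namespace-wide authentication policy: require mTLS for all workloads in x2.
apiVersion: authentication.istio.io/v1alpha1
kind: Policy
metadata:
  name: default
  namespace: x2
spec:
  peers:
  - mtls: {}
---
# Client side: use Istio-provisioned certificates when calling the headless service.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: headless
  namespace: x2
spec:
  host: headless.x2.svc.cluster.local
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
```

The x3 namespace gets the same resources with x3 substituted for x2.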
When we deploy x3 with the same configuration as x2, the x3 pods are not able to communicate with each other.
```sh
$ istioctl authn tls-check galera-cluster-24z99 -n x3 | grep x3.svc
headless.x3.svc.cluster.local:3306    OK    STRICT    ISTIO_MUTUAL    x3/default    x3/default
headless.x3.svc.cluster.local:4444    OK    STRICT    ISTIO_MUTUAL    x3/default    x3/default
headless.x3.svc.cluster.local:4567    OK    STRICT    ISTIO_MUTUAL    x3/default    x3/default
headless.x3.svc.cluster.local:4568    OK    STRICT    ISTIO_MUTUAL    x3/default    x3/default
```
A tcpdump revealed that the TLS handshake between the Envoy proxies fails with "Certificate Unknown (46)". The reason is that the TLS Client Hello carries the SNI for x2 (outbound.4567._.headless.x2.svc.cluster.local), which is obviously wrong. It seems that the mesh (I use this term deliberately because I don't know which of its components is responsible for this behaviour) uses the first service FQDN that was created for this TCP port. When we delete the x2 namespace, mTLS communication in x3 starts working as expected.
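The wrong SNI can also be seen directly in the client-side Envoy configuration, without a packet capture. Roughly as follows (pod name from our setup; the exact JSON field names may differ between Envoy versions, and the tshark field name depends on the Wireshark version):

```sh
# Dump the outbound cluster for the headless service and look at the SNI
# Envoy will put into the Client Hello.
istioctl proxy-config cluster galera-cluster-24z99 -n x3 \
  --fqdn headless.x3.svc.cluster.local -o json | grep -i sni

# Alternatively, capture the handshake and extract the SNI from the pcap.
tcpdump -i any -w mtls.pcap tcp port 4567
tshark -r mtls.pcap -Y 'tls.handshake.type == 1' \
  -T fields -e tls.handshake.extensions_server_name
```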
If needed, I can provide further configuration and tcpdumps.
We did not find a way to change this behaviour through configuration (different ServiceEntries, DestinationRules, etc.), nor did we find a hint in the documentation that this should or should not work.
From an architectural or configuration point of view, is this behaviour expected? For now it looks like a bug to me.
Thank you for your support!
Best Regards,
Affected product area (please put an X in all that apply)
[ ] Configuration Infrastructure
[ ] Docs
[ ] Installation
[ x ] Networking
[ ] Performance and Scalability
[ ] Policies and Telemetry
[ x ] Security
[ ] Test and Release
[ ] User Experience
[ ] Developer Infrastructure
Expected behavior
The expected behaviour is that mTLS works in more than one namespace even when services share the same TCP port.
Steps to reproduce the bug
Version (include the output of istioctl version --remote and kubectl version)
How was Istio installed?
RedHat ServiceMesh 1.1.2
Environment where bug was observed (cloud vendor, OS, etc)
RedHat OpenShift 4.3.22 (bare metal)
Additionally, please consider attaching a cluster state archive (dump file) to this issue.