guilhermef opened this issue 6 months ago
Working on a repro now. Will share an update soon!
Can you provide more info on how you are using an AWS Private CA? MeshConfig, what are the expectations, etc.
ztunnel by default is using Istiod CA.
Hi @costinm, the Istio root CA is a subordinate of the AWS private root CA; we use the cert-manager AWS PCA issuer to create the certificate for Istio. The only difference between running in kind with the self-signed certificate and running with AWS PCA was the presence of intermediate certificates.
Ah, so were you using the plugin CA feature of Istio?
Exactly, @keithmattix; we use Istio to connect multiple clusters hosted in AWS. We had no issues when running with sidecars, and ambient mode was being tested connecting workloads in the same cluster.
Usually it means either that ztunnel is not sending the full chain or that the top root is not properly distributed.
Not sure if we have some debug level that dumps the chain - perhaps a curl -vvv or an openssl client connecting to ztunnel can show the chain that is used (it should have 3 certs instead of 2).
If that looks right - the root cert is probably the problem. We should add some debug; istiod does show the root at startup. Normally ztunnel should get all the roots from mesh config too.
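One way to do the chain check suggested above is `openssl s_client -showcerts` plus a grep to count the served certificates. The pod IP below is a placeholder; the counting part is sketched locally on a redacted PEM stream so it can be run anywhere:

```shell
# Dump the chain a workload serves (run from a non-ambient client;
# <pod-ip> is a placeholder for a real pod IP behind ztunnel):
#   openssl s_client -connect <pod-ip>:15008 -showcerts </dev/null 2>/dev/null \
#     | grep -c -- '-----BEGIN CERTIFICATE-----'
#
# The counting itself works on any PEM stream; with an intermediate CA
# in the picture, 3 certs (leaf, intermediate, root) would be expected:
count_certs() { grep -c -- '-----BEGIN CERTIFICATE-----'; }

count_certs <<'EOF'
-----BEGIN CERTIFICATE-----
<leaf>
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
<intermediate>
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
<root>
-----END CERTIFICATE-----
EOF
# prints 3
```

If the count is 2 where 3 are expected, the intermediate is being dropped somewhere between the CA secret and ztunnel.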
@costinm, that's all I got from ztunnel 15008 port.
openssl s_client -connect 10.61.73.159:15008
CONNECTED(00000003)
write:errno=104
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 0 bytes and written 293 bytes
Verification: OK
---
New, (NONE), Cipher is (NONE)
This TLS version forbids renegotiation.
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
---
@costinm, here are the debug logs from a ztunnel pod.
2024-05-17T22:55:41.082082Z debug h2::codec::framed_write:xds{id=1}:Connection{peer=Client} send frame=WindowUpdate { stream_id: StreamId(0), size_increment: 5177345 }
2024-05-17T22:55:41.082056Z debug hyper_util::client::legacy::pool:xds{id=1} pooling idle connection for ("https", istiod.istio-system.svc:15012)
2024-05-17T22:55:41.082013Z debug h2::codec::framed_write:xds{id=1} send frame=Settings { flags: (0x0), enable_push: 0, initial_window_size: 2097152, max_frame_size: 16384, max_header_list_size: 16384 }
2024-05-17T22:55:41.081995Z debug h2::client:xds{id=1} client connection bound
2024-05-17T22:55:41.081983Z debug h2::client:xds{id=1} binding client connection
2024-05-17T22:55:41.081687Z debug rustls::client::common:xds{id=1} Client auth requested but no cert/sigscheme available
2024-05-17T22:55:41.081670Z debug rustls::client::tls13:xds{id=1} Got CertificateRequest CertificateRequestPayloadTls13 { context: , extensions: [Unknown(UnknownExtension { typ: StatusRequest, payload: }), Unknown(UnknownExtension { typ: SCT, payload: }), SignatureAlgorithms([RSA_PSS_SHA256, ECDSA_NISTP256_SHA256, ED25519, RSA_PSS_SHA384, RSA_PSS_SHA512, RSA_PKCS1_SHA256, RSA_PKCS1_SHA384, RSA_PKCS1_SHA512, ECDSA_NISTP384_SHA384, ECDSA_NISTP521_SHA512, RSA_PKCS1_SHA1, ECDSA_SHA1_Legacy]), AuthorityNames([DistinguishedName(30183116301406035504030c0d53756d757020526f6f74204341)])] }
2024-05-17T22:55:41.081657Z debug rustls::client::hs:xds{id=1} ALPN protocol is Some(b"h2")
2024-05-17T22:55:41.081647Z debug rustls::client::tls13:xds{id=1} TLS1.3 encrypted extensions: [Protocols([ProtocolName(6832)])]
2024-05-17T22:55:41.081541Z debug rustls::client::tls13:xds{id=1} Not resuming
2024-05-17T22:55:41.081520Z debug rustls::client::hs:xds{id=1} Using ciphersuite TLS13_AES_128_GCM_SHA256
2024-05-17T22:55:41.078873Z debug rustls::client::hs:xds{id=1} Not resuming any session
2024-05-17T22:55:41.078727Z debug rustls::client::hs:xds{id=1} No cached session for DnsName("istiod.istio-system.svc")
2024-05-17T22:55:41.078700Z debug hyper_util::client::legacy::connect::http:xds{id=1} connected to 172.20.100.179:15012
2024-05-17T22:55:41.077901Z debug hyper_util::client::legacy::connect::http:xds{id=1} connecting to 172.20.100.179:15012
2024-05-17T22:55:41.076231Z debug hyper_util::client::legacy::connect::dns resolving host="istiod.istio-system.svc"
2024-05-17T22:55:41.076079Z debug rustls::webpki::anchors:xds{id=1} add_parsable_certificates processed 1 valid and 0 invalid certs
2024-05-17T22:55:41.075931Z info readiness Task 'proxy' complete (1.174111ms), still awaiting 2 tasks
2024-05-17T22:55:41.075924Z info hyper_util listener established address=[::]:15020 component="stats"
2024-05-17T22:55:41.075898Z info hyper_util listener established address=127.0.0.1:15000 component="admin"
2024-05-17T22:55:41.075829Z info hyper_util listener established address=[::]:15021 component="readiness"
2024-05-17T22:55:41.075815Z info app in-pod mode enabled
2024-05-17T22:55:41.075530Z debug rustls::webpki::anchors add_parsable_certificates processed 1 valid and 0 invalid certs
inpodMark: 1337
inpodPortReuse: true
inpodUds: /var/run/ztunnel/ztunnel.sock
inpodEnabled: true
shuffle_dns_servers: false
authentic_data: false
recursion_desired: true
server_ordering_strategy: QueryStatistics
try_tcp_on_error: false
preserve_intermediates: true
num_concurrent_reqs: 2
negative_max_ttl: null
positive_max_ttl: null
negative_min_ttl: null
positive_min_ttl: null
use_hosts_file: true
cache_size: 32
ip_strategy: Ipv4thenIpv6
validate: false
edns0: false
check_names: true
rotate: false
attempts: 2
nanos: 0
secs: 5
timeout:
ndots: 5
dnsResolverOpts:
bind_addr: null
trust_negative_responses: false
tls_dns_name: null
protocol: tcp
- socket_addr: 172.20.0.10:53
bind_addr: null
trust_negative_responses: false
tls_dns_name: null
protocol: udp
- socket_addr: 172.20.0.10:53
name_servers:
- eu-west-1.compute.internal
- cluster.local
- svc.cluster.local
- istio-system.svc.cluster.local
search:
domain: null
dnsResolverCfg:
proxyArgs: proxy ztunnel
enableOriginalSource: null
numWorkerThreads: 2
CLUSTER_ID: fleet-sandbox-eu-west-1
ISTIO_VERSION: 1.22.0
DNS_PROXY_ADDR: 127.0.0.1:15053
proxyMetadata:
nanos: 0
secs: 5
selfTerminationDeadline:
fakeCa: false
xdsOnDemand: false
nanos: 0
secs: 86400
secretTtl:
xdsRootCert: !File ./var/run/secrets/istio/root-cert.pem
xdsAddress: https://istiod.istio-system.svc:15012
caRootCert: !File ./var/run/secrets/istio/root-cert.pem
caAddress: https://istiod.istio-system.svc:15012
clusterDomain: cluster.local
clusterId: fleet-sandbox-eu-west-1
localIp: 10.61.86.165
proxyMode: Shared
localNode: ip-10-61-94-2.eu-west-1.compute.internal
network: ''
dnsProxyAddr: 127.0.0.1:15053
outboundAddr: '[::]:15001'
inboundPlaintextAddr: '[::]:15006'
inboundAddr: '[::]:15008'
readinessAddr: '[::]:15021'
statsAddr: '[::]:15020'
adminAddr: 127.0.0.1:15000
socks5Addr: null
nanos: 0
secs: 300
poolUnusedReleaseTimeout:
poolMaxStreamsPerConn: 100
frameSize: 1048576
connectionWindowSize: 4194304
windowSize: 4194304
dnsProxy: false
2024-05-17T22:55:41.075306Z info ztunnel running with config: proxy: true
2024-05-17T22:55:41.075092Z info ztunnel version: version.BuildInfo{Version:"909bf991d01edc4db51265bc633acfe303555ef5", GitRevision:"909bf991d01edc4db51265bc633acfe303555ef5", RustVersion:"1.77.2", BuildProfile:"release", BuildStatus:"Clean", GitTag:"1.22.0-beta.1-6-g909bf99", IstioVersion:"unknown"}
Déjà vu - but we'll need to reproduce and add more info (and fix the bug). Are you using cert-manager? Can you include the configs you used to set up the certs - install options, CRs, etc.?
There are unfortunately many ways to set up the roots and CA - it would help to reproduce yours (or close enough).
Yes, @costinm, we're using cert-manager.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: istio-ca
  namespace: istio-system
spec:
  commonName: istio-ca
  duration: 720h0m0s
  isCA: true
  issuerRef:
    group: awspca.cert-manager.io
    kind: AWSPCAIssuer
    name: aws-pca-istio-issuer
  renewBefore: 168h0m0s
  secretName: cacerts
  subject:
    organizations:
    - cluster.local
    - cert-manager
status:
  conditions:
  - lastTransitionTime: "2024-04-11T12:16:39Z"
    message: Certificate is up to date and has not expired
    observedGeneration: 1
    reason: Ready
    status: "True"
    type: Ready
  notAfter: "2024-06-03T12:16:36Z"
  notBefore: "2024-05-04T11:16:36Z"
  renewalTime: "2024-05-27T12:16:36Z"
  revision: 2
---
apiVersion: awspca.cert-manager.io/v1beta1
kind: AWSPCAIssuer
metadata:
  name: aws-pca-istio-issuer
  namespace: istio-system
spec:
  arn: arn:aws:acm-pca:eu-west-1:<redacted>:certificate-authority/<redacted>
  region: eu-west-1
status:
  conditions:
  - lastTransitionTime: "2024-04-11T12:16:36Z"
    message: Issuer verified
    reason: Verified
    status: "True"
    type: Ready
---
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: istio-ca
  namespace: istio-system
spec:
  ca:
    secretName: cacerts
status:
  conditions:
  - lastTransitionTime: "2024-04-11T12:16:41Z"
    message: Signing CA verified
    observedGeneration: 1
    reason: KeyPairVerified
    status: "True"
    type: Ready
Ok, took a bit of a look at the code - there are some problems with cacerts if tls.key is used and multiple intermediates are involved (handled correctly on the old code path) - but for this case it should work, since the 'intermediate' chain is only 2 certs; cert-manager doesn't have a way to pass a longer chain AFAIK.
Can you check the cert created by AWS and make sure it's just a root + a cert (not a longer chain)? The code is pretty convoluted - you may want to create an 'old style' cacerts manually (with cert-chain.pem as the entire chain, starting with the CA cert associated with the key and going all the way to the real root). Again - this only matters if the chain has more than 2 elements; with 2, I think the code should work the same in both cases.
I looked a bit at the code in ztunnel - I'm not very good at Rust, but it looks like caclient.rs handles the cert response expecting a list of PEM files and does the right thing. I can't repro the AWS case, but if you see more than 2 certs in the chain, that's very likely the problem, and you can confirm by using the old-style cert-chain.pem.
If you only see 2 certs in the AWS chain (the top root and the intermediate), we'll need more debugging. Unfortunately the code path loading from the Secret doesn't work the same, so it's harder to reproduce in a debugger.
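For reference, the 'old style' cacerts mentioned above is Istio's plugin-CA file layout (ca-cert.pem, ca-key.pem, root-cert.pem, cert-chain.pem). A sketch of assembling it by hand - the three PEM file names here are hypothetical placeholders standing in for the real AWS PCA certs:

```shell
set -eu
cd "$(mktemp -d)"

# Hypothetical placeholders for the three certs in the chain
# (substitute the real PEM files from AWS PCA / cert-manager):
for name in istio-env-ca istio-ca main-root; do
  printf -- '-----BEGIN CERTIFICATE-----\n<%s>\n-----END CERTIFICATE-----\n' \
    "$name" > "$name.pem"
done

# cert-chain.pem: the signing CA first, then each issuer up to the root
cat istio-env-ca.pem istio-ca.pem main-root.pem > cert-chain.pem
cp istio-env-ca.pem ca-cert.pem     # the CA cert matching ca-key.pem
cp main-root.pem root-cert.pem      # the trust anchor

grep -c -- '-----BEGIN CERTIFICATE-----' cert-chain.pem   # prints 3

# Then, against a live cluster (needs the matching ca-key.pem too):
#   kubectl create secret generic cacerts -n istio-system \
#     --from-file=ca-cert.pem --from-file=ca-key.pem \
#     --from-file=root-cert.pem --from-file=cert-chain.pem
```

The ordering matters: the chain starts at the cert matching the private key and ends at the top root.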
@costinm, that's how the secret is created by cert-manager when using AWS PCA:
ca.crt: |
  -----BEGIN CERTIFICATE-----
  <cert-data>
  -----END CERTIFICATE-----
tls.crt: |
  -----BEGIN CERTIFICATE-----
  <cert-data>
  -----END CERTIFICATE-----
  -----BEGIN CERTIFICATE-----
  <cert-data>
  -----END CERTIFICATE-----
  -----BEGIN CERTIFICATE-----
  <cert-data>
  -----END CERTIFICATE-----
tls.key: |
  -----BEGIN RSA PRIVATE KEY-----
  <cert-data>
  -----END RSA PRIVATE KEY-----
We do have 3 certs in tls.crt; I think that's the issue that you mentioned, right?
cert-manager does offer a way to create a combined .pem file, but it's in alpha: https://cert-manager.io/docs/usage/certificate/#combinedpem
@costinm, I just tested with a root CA instead of the subordinate CA that we were using previously, and it works. Sadly, in our case we cannot switch to a root CA.
To confirm, is this the data stored in the cacerts secret? Do you have 2 or 3 intermediate certificates?
I was able to create a deeper intermediate chain with cert-manager and it seems to also work. Need to double-check the config.
Are you sure the top-level root (the last one in the chain) is set in the secret and replicated?
When you inspect your cacerts secret, how many certs do you have in tls.crt? If I understand the cert-manager documentation correctly, the root cert is not included in tls.crt: https://cert-manager.io/docs/usage/certificate/#target-secret.
Hi @jaellio and @costinm, yes, the YAML I shared previously is the contents of the cacerts secret.
The current setup is Main CA (Root) -> Istio CA (Subordinate) -> Istio Env CA (Subordinate); that's why we have 3 certs in tls.crt.
I just tested another setup, with Main CA (Root) -> Istio Test CA (Subordinate); in this case the cacerts file looked like this:
ca.crt: |
  -----BEGIN CERTIFICATE-----
  <Root CA>
  -----END CERTIFICATE-----
tls.crt: |
  -----BEGIN CERTIFICATE-----
  <cert-data>
  -----END CERTIFICATE-----
  -----BEGIN CERTIFICATE-----
  <cert-data>
  -----END CERTIFICATE-----
tls.key: |
  -----BEGIN RSA PRIVATE KEY-----
  <cert-data>
  -----END RSA PRIVATE KEY-----
The ca.crt always contains the top-level CA.
In this second case the result was the same; we also got an invalid peer certificate.
I think chained certs are working fine...
Can you check:
kubectl -n istio-system get cm istio-ca-root-cert -o yaml
kubectl get secret cacerts -n istio-system -o "jsonpath={.data['ca.crt']}" | base64 -d
and make sure they're the same - and that it's the Main CA root you use?
To step back - unknown Issuer means something in the chain is not known or trusted. The original guess was that the chain is bad (missing certs), but it could be that the roots are wrong. Istio does send the full chain including the top root (never understood why...) - but the actual check is against istio-ca-root-cert, which is a list of roots and must include the root that signed the chain.
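A concrete way to check that both places hold the same root is to compare SHA-256 fingerprints. The kubectl lines are a sketch (jsonpath key quoting may need adjusting for your shell); the openssl part is illustrated locally with a throwaway self-signed root:

```shell
set -eu
cd "$(mktemp -d)"

# Against a live cluster (sketch; run from a machine with kubectl access):
#   kubectl -n istio-system get cm istio-ca-root-cert \
#     -o 'jsonpath={.data.root-cert\.pem}' \
#     | openssl x509 -noout -fingerprint -sha256
#   kubectl -n istio-system get secret cacerts \
#     -o "jsonpath={.data['ca\.crt']}" | base64 -d \
#     | openssl x509 -noout -fingerprint -sha256
#
# Local illustration of the comparison itself with a throwaway root:
openssl req -x509 -newkey rsa:2048 -nodes -keyout root.key -out root.pem \
  -subj "/CN=Test Root CA" -days 1 2>/dev/null
fp_configmap=$(openssl x509 -in root.pem -noout -fingerprint -sha256)
fp_cacerts=$(openssl x509 -in root.pem -noout -fingerprint -sha256)
[ "$fp_configmap" = "$fp_cacerts" ] && echo "same root CA"
```

If the two fingerprints differ, workloads are validating against a root other than the one that signed the chain, which produces exactly the unknown-issuer failure described above.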
Hi @costinm, I just checked the istio-ca-root-cert; it only contains the ca.crt data - the Main CA root, not the full chain.
I'm adding more debug info on what CAs are loaded and fixing the inconsistencies - but I'm not sure I can help more with your case; the intermediate chain I created with cert-manager seems fine.
Can you repeat the 'openssl s_client -connect 10.61.73.159:15008' command, but make sure you are connecting to a pod IP (not a ztunnel IP), and from a VM or pod that is not ambient-enabled?
Checked with John - ztunnel IP doesn't know what cert to return, it relies on detecting the intended pod and serving its certificate.
@costinm, now I was able to get the certificate data:
Connecting to 10.61.72.122
CONNECTED(00000003)
Can't use SSL_get_servername
depth=1 O=cert-manager + O=cluster.local, CN=istio-ca
verify error:num=20:unable to get local issuer certificate
verify return:1
depth=0
verify return:1
---
Certificate chain
0 s:
i:O=cert-manager + O=cluster.local, CN=istio-ca
a:PKEY: id-ecPublicKey, 256 (bit); sigalg: RSA-SHA256
v:NotBefore: May 30 20:56:54 2024 GMT; NotAfter: May 31 20:58:54 2024 GMT
1 s:O=cert-manager + O=cluster.local, CN=istio-ca
i:O=Company, OU=Platform, CN=Company Istio sandbox CA
a:PKEY: rsaEncryption, 2048 (bit); sigalg: RSA-SHA256
v:NotBefore: May 24 04:18:20 2024 GMT; NotAfter: Jun 23 05:18:20 2024 GMT
---
Server certificate
-----BEGIN CERTIFICATE-----
<CERT_DATA>
-----END CERTIFICATE-----
subject=
issuer=O=cert-manager + O=cluster.local, CN=istio-ca
---
Acceptable client certificate CA names
CN=Company Root CA
Requested Signature Algorithms: ECDSA+SHA384:ECDSA+SHA256:Ed25519:RSA-PSS+SHA512:RSA-PSS+SHA384:RSA-PSS+SHA256:RSA+SHA512:RSA+SHA384:RSA+SHA256
Shared Requested Signature Algorithms: ECDSA+SHA384:ECDSA+SHA256:Ed25519:RSA-PSS+SHA512:RSA-PSS+SHA384:RSA-PSS+SHA256:RSA+SHA512:RSA+SHA384:RSA+SHA256
Peer signing digest: SHA256
Peer signature type: ECDSA
Server Temp Key: X25519, 253 bits
---
SSL handshake has read 2209 bytes and written 409 bytes
Verification error: unable to get local issuer certificate
---
New, TLSv1.3, Cipher is TLS_AES_256_GCM_SHA384
Server public key is 256 bit
This TLS version forbids renegotiation.
No ALPN negotiated
Early data was not sent
Verify return code: 20 (unable to get local issuer certificate)
---
287B1D54FD7E0000:error:0A00045C:SSL routines:ssl3_read_bytes:tlsv13 alert certificate required:ssl/record/rec_layer_s3.c:907:SSL alert number 116
We're getting somewhere. Looks like it's too short - missing a bunch of certs.
Will take a look again at the code - missing the full chain was my first guess, but when I tested it seemed to work.
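The "unable to get local issuer certificate" (error 20) in the s_client output above is plain X.509 chain building, independent of ztunnel: the verifier trusts the root but cannot reach it because the middle of the chain is missing. That can be reproduced locally with a throwaway three-level chain (all names below are made up for the sketch):

```shell
set -eu
cd "$(mktemp -d)"
printf 'basicConstraints=critical,CA:TRUE\n' > ca_ext.cnf

# 1. self-signed root ("Main CA")
openssl req -x509 -newkey rsa:2048 -nodes -keyout root.key -out root.pem \
  -subj "/CN=Main Root CA" -days 1 2>/dev/null
# 2. intermediate ("Istio CA") signed by the root
openssl req -newkey rsa:2048 -nodes -keyout int.key -out int.csr \
  -subj "/CN=Istio CA" 2>/dev/null
openssl x509 -req -in int.csr -CA root.pem -CAkey root.key -CAcreateserial \
  -out int.pem -days 1 -extfile ca_ext.cnf 2>/dev/null
# 3. leaf (workload cert) signed by the intermediate
openssl req -newkey rsa:2048 -nodes -keyout leaf.key -out leaf.csr \
  -subj "/CN=workload" 2>/dev/null
openssl x509 -req -in leaf.csr -CA int.pem -CAkey int.key -CAcreateserial \
  -out leaf.pem -days 1 2>/dev/null

# Without the intermediate, verification fails with
# "unable to get local issuer certificate", just like s_client's error 20:
openssl verify -CAfile root.pem leaf.pem 2>&1 | head -n 2 || true
# With the intermediate supplied, the chain verifies:
openssl verify -CAfile root.pem -untrusted int.pem leaf.pem
```

In other words: if the server never sends the intermediate, no amount of trusting the root on the client side can make the chain verify.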
@costinm, based on the chain that I got from the Istio secret, it's missing the Istio CA.
Our chain looks like this: Main CA (Root) -> Istio CA (Subordinate) -> Istio Env CA (Subordinate).
Based on this, we only got the Main CA and the Istio Env CA.
The Istio secret itself contains all three certificates.
Sorry - long thread, maybe I missed it, but there was a question on which secret you use: istio-ca-secret or cacerts? Different code paths...
@costinm, we're using cacerts
Currently running into the same error, and what appears to be the same issue (thanks @howardjohn for linking me here). My setup consists of a cacerts secret in the istio-system namespace. The cacerts secret contains the following structure:
# kubectl get secret cacerts -n istio-system -o yaml
apiVersion: v1
data:
  ca.crt:
    -----BEGIN CERTIFICATE-----
    # root CA cert
    -----END CERTIFICATE-----
  tls.crt:
    -----BEGIN CERTIFICATE-----
    # istiod CA cert
    -----END CERTIFICATE-----
    -----BEGIN CERTIFICATE-----
    # intermediate CA cert
    -----END CERTIFICATE-----
  tls.key:
    # ...
kind: Secret
type: kubernetes.io/tls
The istio-ca-root-cert ConfigMap contains the correct root CA certificate (its serial number matches that of cacerts/ca.crt):
apiVersion: v1
data:
  root-cert.pem: |
    -----BEGIN CERTIFICATE-----
    # root CA cert
    -----END CERTIFICATE-----
kind: ConfigMap
metadata:
  labels:
    istio.io/config: "true"
  name: istio-ca-root-cert
  namespace: istio-system
The following is the output of s_client from a non-ambient pod to a pod in an ambient-enabled namespace:
root@example-app-6b9667f579-56kpp:/app# openssl s_client -connect 10.54.117.136:15008
CONNECTED(00000003)
Can't use SSL_get_servername
depth=1 O = cert-manager + O = cluster.local, CN = istio-ca
verify error:num=20:unable to get local issuer certificate
verify return:1
depth=0
verify return:1
---
Certificate chain
0 s:
i:O = cert-manager + O = cluster.local, CN = istio-ca
a:PKEY: id-ecPublicKey, 256 (bit); sigalg: RSA-SHA256
v:NotBefore: Jun 12 19:27:47 2024 GMT; NotAfter: Jun 13 19:29:47 2024 GMT
1 s:O = cert-manager + O = cluster.local, CN = istio-ca
i:C = US, O = Company, OU = Redacted, ST = Redacted, CN = Intermediate CA, L = Redacted
a:PKEY: rsaEncryption, 2048 (bit); sigalg: RSA-SHA512
v:NotBefore: Jun 11 05:38:55 2024 GMT; NotAfter: Jun 17 06:38:54 2024 GMT
---
Server certificate
-----BEGIN CERTIFICATE-----
-----END CERTIFICATE-----
subject=
issuer=O = cert-manager + O = cluster.local, CN = istio-ca
---
Acceptable client certificate CA names
C = US, O = Company, OU = Team, ST = Redacted, CN = Root CA, L = Redacted
Requested Signature Algorithms: ECDSA+SHA384:ECDSA+SHA256:Ed25519:RSA-PSS+SHA512:RSA-PSS+SHA384:RSA-PSS+SHA256:RSA+SHA512:RSA+SHA384:RSA+SHA256
Shared Requested Signature Algorithms: ECDSA+SHA384:ECDSA+SHA256:Ed25519:RSA-PSS+SHA512:RSA-PSS+SHA384:RSA-PSS+SHA256:RSA+SHA512:RSA+SHA384:RSA+SHA256
Peer signing digest: SHA256
Peer signature type: ECDSA
Server Temp Key: X25519, 253 bits
---
SSL handshake has read 2616 bytes and written 407 bytes
Verification error: unable to get local issuer certificate
---
New, TLSv1.3, Cipher is TLS_AES_256_GCM_SHA384
Server public key is 256 bit
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 20 (unable to get local issuer certificate)
---
402765E2CF7F0000:error:0A00045C:SSL routines:ssl3_read_bytes:tlsv13 alert certificate required:../ssl/record/rec_layer_s3.c:1586:SSL alert number 116
root@example-app-6b9667f579-56kpp:/app#
Here is the output of istioctl x ztunnel-config certificate $ZTUNNEL_POD.istio-system:
CERTIFICATE NAME TYPE STATUS VALID CERT SERIAL NUMBER NOT AFTER NOT BEFORE
spiffe://cluster.local/ns/bookinfo/sa/bookinfo-details Leaf Available true 94456cd46f3c3f9a5afa894be2bfdd40 2024-06-13T19:29:47Z 2024-06-12T19:27:47Z
spiffe://cluster.local/ns/bookinfo/sa/bookinfo-details Intermediate Available true d96c911500b5a200f04cb374d56415dc 2024-06-17T06:38:54Z 2024-06-11T05:38:55Z
spiffe://cluster.local/ns/bookinfo/sa/bookinfo-details Root Available true 15801d95b7ea9a1ce11dd93f5f9566e9 2034-04-12T14:49:27Z 2024-04-12T13:49:27Z
spiffe://cluster.local/ns/bookinfo/sa/bookinfo-productpage Leaf Available true a955244458ff2d9ca1dc56b9fcc5cff3 2024-06-13T19:29:47Z 2024-06-12T19:27:47Z
spiffe://cluster.local/ns/bookinfo/sa/bookinfo-productpage Intermediate Available true d96c911500b5a200f04cb374d56415dc 2024-06-17T06:38:54Z 2024-06-11T05:38:55Z
spiffe://cluster.local/ns/bookinfo/sa/bookinfo-productpage Root Available true 15801d95b7ea9a1ce11dd93f5f9566e9 2034-04-12T14:49:27Z 2024-04-12T13:49:27Z
spiffe://cluster.local/ns/bookinfo/sa/bookinfo-ratings Leaf Available true 91530e6b0578fb7a89cd45c34b2835e8 2024-06-13T19:29:47Z 2024-06-12T19:27:47Z
spiffe://cluster.local/ns/bookinfo/sa/bookinfo-ratings Intermediate Available true d96c911500b5a200f04cb374d56415dc 2024-06-17T06:38:54Z 2024-06-11T05:38:55Z
spiffe://cluster.local/ns/bookinfo/sa/bookinfo-ratings Root Available true 15801d95b7ea9a1ce11dd93f5f9566e9 2034-04-12T14:49:27Z 2024-04-12T13:49:27Z
spiffe://cluster.local/ns/bookinfo/sa/bookinfo-reviews Leaf Available true 2eb3e001d554775ba6d1faec71c06484 2024-06-13T19:29:47Z 2024-06-12T19:27:47Z
spiffe://cluster.local/ns/bookinfo/sa/bookinfo-reviews Intermediate Available true d96c911500b5a200f04cb374d56415dc 2024-06-17T06:38:54Z 2024-06-11T05:38:55Z
spiffe://cluster.local/ns/bookinfo/sa/bookinfo-reviews Root Available true 15801d95b7ea9a1ce11dd93f5f9566e9 2034-04-12T14:49:27Z 2024-04-12T13:49:27Z
From my reading of the s_client output, the HBONE mTLS tunnel port is not providing the full chain (workload cert + istiod CA cert + intermediate CA), only the workload cert + istiod CA cert.
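One way to spot-check this from a client pod is to count the certificates the port presents with -showcerts. This is a hypothetical helper, not an official diagnostic; the IP and port below are the ones from the capture above:

```shell
# Hypothetical helper: count the certificates a TLS endpoint presents.
count_chain() {
  openssl s_client -connect "$1" -showcerts </dev/null 2>/dev/null \
    | grep -c 'BEGIN CERTIFICATE'
}

# Against the ambient pod above you would run:
#   count_chain 10.54.117.136:15008
# With an AWS PCA subordinate in play, 3 certificates are expected
# (workload leaf + istiod CA + AWS intermediate); 2 means ztunnel is
# not sending the full chain.
```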
Thanks for investigating this! I can try to create the cert-chain.pem format of the cacerts secret if that would be helpful.
Happy to provide debug logs or additional details.
@costinm could you please clarify what the old-style cert-chain.pem workaround would look like, based on https://github.com/istio/ztunnel/issues/1061#issuecomment-2123801655?
The code is pretty convoluted - you may want to create an 'old style' cacerts manually, where cert-chain.pem has the entire chain starting with the CA cert associated with the key and going all the way to the real root. Again, this only matters when the chain has more than 2 elements; with 2, I think the code behaves the same in both cases.
I looked a bit at the code in ztunnel - I'm not very good at Rust, but it looks like caclient.rs handles the cert response expecting a list of PEM files and does the right thing. I can't repro the AWS case, but if you see more than 2 certs in the chain, that is very likely the problem, and you can confirm by using the old-style cert-chain.pem.
Specifically, will Istio use cert-chain.pem in addition to tls.crt and ca.crt if they both exist in the cacerts secret?
I'm considering testing this but don't want to blow away the existing secret. I'm hoping I can just use cert-manager (additionalOutputFormats) or trust-manager to add an old-style format in addition.
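For reference, the additionalOutputFormats field mentioned here lives on the cert-manager Certificate spec. This is a hedged sketch only; check the field's availability and types against your cert-manager version, and note that the documented types (DER, CombinedPEM) do not emit a cert-chain.pem on their own:

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: istio-ca          # hypothetical name
  namespace: istio-system
spec:
  secretName: cacerts
  additionalOutputFormats:
    - type: CombinedPEM   # adds tls-combined.pem (key + cert chain) to the secret
  # ...issuerRef, commonName, etc. as in the existing AWS PCA issuer setup
```

Since neither format produces the old-style cert-chain.pem key directly, trust-manager or a manual concatenation may still be needed.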
I'm testing ztunnel on the 1.22 release. I have two bookinfo example apps deployed, one with a sidecar and one with ambient. The sidecar one is working, but with the ambient one I get io error: invalid peer certificate: UnknownIssuer. We're using an AWS Private CA. ztunnel logs:
istioctl x ztunnel-config certificates ztunnel-9tkd9.istio-system
istioctl x ztunnel-config certificates ztunnel-dxgrd.istio-system
If I run ambient locally, with the certificate created by Istio, there's no intermediate certificate.
Update: There's an issue with ztunnel when using AWS PCA and cert-manager with a subordinate CA.