Open kidiyoor opened 7 years ago
This is an envoy config issue. Can you get the envoy config file?
Find below the contents of /etc/istio/proxy/envoy-rev13.json.
{
"listeners": [],
"lds": {
"cluster": "lds",
"refresh_delay_ms": 1000
},
"admin": {
"access_log_path": "/dev/stdout",
"address": "tcp://127.0.0.1:15000"
},
"cluster_manager": {
"clusters": [
{
"name": "rds",
"connect_timeout_ms": 1000,
"type": "strict_dns",
"lb_type": "round_robin",
"hosts": [
{
"url": "tcp://istio-pilot:8080"
}
]
},
{
"name": "lds",
"connect_timeout_ms": 1000,
"type": "strict_dns",
"lb_type": "round_robin",
"hosts": [
{
"url": "tcp://istio-pilot:8080"
}
]
}
],
"sds": {
"cluster": {
"name": "sds",
"connect_timeout_ms": 1000,
"type": "strict_dns",
"lb_type": "round_robin",
"hosts": [
{
"url": "tcp://istio-pilot:8080"
}
]
},
"refresh_delay_ms": 1000
},
"cds": {
"cluster": {
"name": "cds",
"connect_timeout_ms": 1000,
"type": "strict_dns",
"lb_type": "round_robin",
"hosts": [
{
"url": "tcp://istio-pilot:8080"
}
]
},
"refresh_delay_ms": 1000
}
}
}
Oh, sorry. It seems your Pilot is pushing malformed CDS information to Envoy. Actually, we have not provided auth support for Consul yet. I suspect the error occurs because Pilot is not getting the "service account" information for the service from Consul, so it failed to construct a valid ssl_context. Can you call the Pilot CDS API to get the CDS information and paste it here?
I see that verify_subject_alt_name is null. I guess it needs to have the service account info? In the Kubernetes world, yes, Pilot can get this info from the apiserver. How does (or will) it work with a VM / node_agent / external service registry (say Consul) setup?
root@istio-cp-demo:/home/gauthamvk/istio-pilot# curl localhost:8080/v1/clusters/service_7711/sidecar~10.128.0.6~istio-worker-demo-1.default~default.svc.cluster.local
{
"clusters": [
{
"name": "in.7711",
"connect_timeout_ms": 1000,
"type": "static",
"lb_type": "round_robin",
"hosts": [
{
"url": "tcp://127.0.0.1:7711"
}
]
},
{
"name": "out.00434c7558ea95a97282a1773370cd4d6e375606",
"service_name": "service_7711.service.consul|http",
"connect_timeout_ms": 1000,
"type": "sds",
"lb_type": "round_robin",
"ssl_context": {
"cert_chain_file": "/etc/certs/cert-chain.pem",
"private_key_file": "/etc/certs/key.pem",
"ca_cert_file": "/etc/certs/root-cert.pem",
"verify_subject_alt_name": null
}
},
{
"name": "out.470d4e4dfa2dbf0b874ca930a7b52ba2aebdc967",
"service_name": "consul.service.consul|http",
"connect_timeout_ms": 1000,
"type": "sds",
"lb_type": "round_robin",
"ssl_context": {
"cert_chain_file": "/etc/certs/cert-chain.pem",
"private_key_file": "/etc/certs/key.pem",
"ca_cert_file": "/etc/certs/root-cert.pem",
"verify_subject_alt_name": null
}
}
]
}
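To spot the affected clusters in a CDS response like the one above, a small check along these lines can help. It is only a sketch: the function name is made up for illustration, and the sample is a trimmed copy of the response pasted above.

```python
import json


def clusters_with_null_san(cds_response):
    """Return the names of clusters whose ssl_context carries a null
    verify_subject_alt_name (the field Envoy rejects in this issue)."""
    doc = json.loads(cds_response)
    bad = []
    for cluster in doc.get("clusters", []):
        ctx = cluster.get("ssl_context")
        # Flag only clusters where the key is present and explicitly null.
        if (isinstance(ctx, dict)
                and "verify_subject_alt_name" in ctx
                and ctx["verify_subject_alt_name"] is None):
            bad.append(cluster["name"])
    return bad


# Trimmed copy of the CDS response from this thread.
sample = """
{
  "clusters": [
    {"name": "in.7711", "type": "static"},
    {"name": "out.00434c7558ea95a97282a1773370cd4d6e375606",
     "type": "sds",
     "ssl_context": {"ca_cert_file": "/etc/certs/root-cert.pem",
                     "verify_subject_alt_name": null}}
  ]
}
"""

print(clusters_with_null_san(sample))
```

The output of the curl command above can be passed to the same function as a string.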
The 'san' should be retrieved from apiserver - we need to make sure each service is created in k8s as well, and has all the info we need. This will be critical for hybrid solutions and migrations.
First thing to try is to simply create a service with the same name and with the service account annotation in k8s. If it doesn't work, we need to make sure Pilot can merge service info, or at least treat k8s as authoritative. Consul should be used to find endpoints, not full service info.
+1 to what Costin said, though I'm not a Consul expert. Consul doesn't support service account identities. K8s apiserver should be the only authority for the secure naming information (at least for the near future). So we can register the service account information for each Consul service on apiserver.
You can check the mesh expansion: it does VM registration in the k8s API server, which lets the Istio CA and the auth node agent fetch the certs for the VM.
Yes, mTLS works in mesh expansion mode, but service registration is done manually in this case using istioctl.
The idea here is to use Consul for service discovery and Consul DNS instead of kubedns, so that service registration is automated. This is not the kube mesh expansion scenario; it's more like running Istio on just VMs.
File a bug to not emit verify subject alt name if it’s null. As for a proper solution, I am afraid you have to wait until the auth folks figure out a way to generate SANs in the absence of Kubernetes. Merging and duplicating service registration is not a viable solution imo, as it adds the triple overhead of maintaining an etcd cluster, a kube API server and Consul, with more points of failure. Manual or scripted registration, as you say, is unreliable. And any registration solution we build will never be as reliable and battle tested as Consul. We have also added a file based config adapter, eliminating the need to run etcd/api server for simple installations. I think there is a pretty simple solution: run a root CA and expose kube style APIs to generate service account information that can then be stored as kv pairs in Consul itself. @myidpt why isn’t this possible? It’s the least amount of software that people have to install and manage. Just one combined istio binary and consul server/agent. All we need is to generate the root CA, which users can do themselves. You then only need to sign cas based on the SPIFFE format. Do we need an entire kube api server for this?
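The "do not emit the field when it is null" fix can be sketched as follows. This is an illustration in Python, not Pilot's actual code: the function name and the `default_sans` parameter are hypothetical, and it assumes Envoy expects a list of strings whenever the field is present (the error in this issue reports a type violation on the null value).

```python
def build_ssl_context(sans, default_sans=None):
    """Build an ssl_context dict, omitting verify_subject_alt_name
    entirely when there is nothing to verify against."""
    ctx = {
        "cert_chain_file": "/etc/certs/cert-chain.pem",
        "private_key_file": "/etc/certs/key.pem",
        "ca_cert_file": "/etc/certs/root-cert.pem",
    }
    # Fall back to an optional default identity list (hypothetical).
    effective = list(sans or default_sans or [])
    if effective:
        ctx["verify_subject_alt_name"] = effective
    return ctx


# No service account known (the Consul case): the key is left out.
print(build_ssl_context(None))
# Service account known: the key is a list of strings.
print(build_ssl_context(["spiffe://cluster.local/ns/default/sa/bookinfo"]))
```

Omitting the key means the peer certificate is still checked against the root CA, but its identity is not pinned to a specific service account.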
Putting null as subject_alt_name is a bug. I think we should at least always disable mTLS for Consul services, regardless of whether mTLS is enabled globally. @diemtvu I think this can be made possible by the per-service-port mTLS enablement feature. Also, we should document on istio.io that mTLS is always disabled for Consul services.
Beyond that, for Consul service identity provision, we need to figure out solutions for two cases:
- K8s apiserver exists in the Istio cluster
- K8s apiserver doesn't exist in the Istio cluster
(Correct me if I'm wrong) 1) is what we are assuming now and in the near future. But unfortunately this issue falls into case 2). In this case, I think it's worth discussing how to support 2) if this is a strong requirement.
@rshriram Having Istio CA to offer an API for registering service accounts is like rewriting an identity management system, right? How do we manage the authn/authz on the CA API? I think it could be a solution if CA does not need to manage the authn/authz (instead done by the platform), otherwise, it's rather complex. Leveraging apiserver to do this frees our hands. IMO, Istio could support plugging in other existing identity systems, if K8s apiserver is not preferred. Does this make sense?
On Tue, Nov 14, 2017 at 3:21 AM Oliver Liu notifications@github.com wrote:
Putting null as subject_alt_name is a bug. I think we should at least always disable mTLS for Consul services, regardless of whether mTLS is enabled globally. @diemtvu I think this can be made possible by the per-service-port mTLS enablement feature. Also, we should document on istio.io that mTLS is always disabled for Consul services.
Beyond that, for Consul service identity provision, we need to figure out solutions for two cases:
- K8s apiserver exists in the Istio cluster
- K8s apiserver doesn't exist in the Istio cluster
(Correct me if I'm wrong) 1) is what we are assuming now and in the near future. But unfortunately this issue falls into case 2). In this case, I think it's worth discussing how to support 2) if this is a strong requirement.
@rshriram Having Istio CA to offer an API for registering service accounts is like rewriting an identity management system, right? How do we manage the authn/authz on the CA API? I think it could be a solution if CA does not need to manage the authn/authz (instead done by the platform), otherwise, it's rather complex. Leveraging apiserver to do this gives us more benefit. IMO, Istio could support plugging in other existing identity systems, if K8s apiserver is preferred. Does this make sense?
What’s missing in scenario 1)? This issue is one where the api server exists but no effort has been made to make the node agent work.
Talked with @costinm offline. Just got to know that you are running apiserver as the "registry service" here. I think from what Costin said, you need to register all the services on apiserver and all the endpoints on Consul, and Pilot gets both pieces of data from them to generate the Envoy config.
In this case, what you need to do is the following (we can improve the automation later):
That may be the bug - my understanding was that Pilot will pull from all plugins, so services from both k8s and consul will be returned.
The Service definition must be present in k8s apiserver, including the security annotations - AFAIK Consul doesn't have the equivalent annotation.
On Wed, Nov 15, 2017 at 2:01 PM, Oliver Liu notifications@github.com wrote:
Sorry @kidiyoor, I didn't understand your request very clearly. So you are deploying Consul with the K8s apiserver. Do you want to deploy both Consul services and K8s services at the same time? I don't actually know whether Pilot supports service registration for both K8s and Consul at the same time.
On Thu, Nov 16, 2017 at 6:07 AM Costin Manolache notifications@github.com wrote:
That may be the bug - my understanding was that Pilot will pull from all plugins, so services from both k8s and consul will be returned.
The Service definition must be present in k8s apiserver, including the security annotations - AFAIK Consul doesn't have the equivalent annotation.
Yes, but that’s not the issue. The problem is that auth requires mirroring the service registry into the api server in order to generate the certs. In my opinion, this should be a simple standalone consul-istio-mtls-agent that reads from Consul, creates services in the api server and generates the certificates. Once we have Vault integration, we can throw away the agent if it’s not needed.
On Wed, Nov 15, 2017 at 2:01 PM, Oliver Liu notifications@github.com wrote:
Sorry @kidiyoor, I didn't understand your request very clearly. So you are deploying Consul with the K8s apiserver. Do you want to deploy both Consul services and K8s services at the same time? I don't actually know whether Pilot supports service registration for both K8s and Consul at the same time.
The problem is that auth requires mirroring the service registry into the api server in order to generate the certs.
If we don't enable auth, the operator still needs to mirror the service registry into the api server, right?
In my opinion, this should be a simple Standalone consul-istio-mtls-agent that reads from consul, creates services in api server and generates the certificates.
I think this is more like "consul-istio-agent", with main task to sync service registrations from consul to apiserver. Secure naming is one more step beyond it. Then, how do we define the identities in Consul? If we create the service accounts 1-1 mapped from service name as the identities, would that be enough?
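The 1:1 mapping from service name to identity asked about above could look like the sketch below. It follows the SPIFFE-style URI layout Istio uses for Kubernetes service accounts (`spiffe://<trust-domain>/ns/<namespace>/sa/<account>`). The function name and the defaults are assumptions, and for a Consul-only setup the namespace segment has no natural equivalent, so it is a placeholder here.

```python
def spiffe_id_for_service(service_name, trust_domain="cluster.local",
                          namespace="default"):
    """Derive one identity per Consul service name (1:1 mapping)."""
    account = service_name.strip().lower()
    if not account:
        raise ValueError("service name must not be empty")
    return f"spiffe://{trust_domain}/ns/{namespace}/sa/{account}"


# Service name taken from the CDS output in this thread.
print(spiffe_id_for_service("service_7711"))
```

Whether this is "enough" depends on who is allowed to register a name in Consul: with a pure 1:1 mapping, anything that can register as `service_7711` can claim its identity.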
@myidpt If auth is disabled, the scenario Costin explained works fine: Pilot will pull services from both k8s and Consul (unless there are service name conflicts?) and it works (no mirroring required here).
The apiserver is the source of config - CRD, routes, service accounts, annotations.
If consul is enabled, it provides endpoint info - but the authoritative info about the service is still in Kubernetes Service.
So Service must be present in the apiserver - and that's what Pilot should use for security info. Even if Vault is available - CRDs, routes, etc will still be in apiserver, and Service info is in the same category of 'core Istio config'
BTW: for hybrid we will likely have a 'syncer' that can pull service info from (multiple) sources, including consul servers - and combine it with other istio config.
I don't think it's a priority to have a consul-istio-agent instead of the generic solution, but if someone wants to do it, it may help.
@costinm : Thanks
Is it fair to say,
For VM only setup (with standalone k8s api server), if mTLS needs to be enabled for the cluster, and if consul is used for endpoint discovery/registration we will need to manually do service registration at both places k8s api server and consul, till the time a generic-syncer solution is available ?
Simply put, "for VM only (with standalone k8s api server) setup mTLS cannot be enabled if service is not registered with k8s api server"
Yes - but to avoid confusion, the Service object must be registered with Istio - not the endpoints. The normal 'mesh expansion' solution for Istio 2.0 requires registering the endpoints as well.
On Thu, Nov 16, 2017 at 12:31 PM, kaizen2017 notifications@github.com wrote:
@costinm https://github.com/costinm : Thanks
Is it fair to say,
- For VM only setup (with standalone k8s api server), if mTLS needs to be enabled for the cluster, and if consul is used for endpoint discovery/registration we will need to manually do service registration at both places k8s api server and consul, till the time a generic-syncer solution is available ?
I disagree. We use the api server for just config storage. Not as a service registry. People using consul use consul service catalog as the authoritative source of info. The fact that we are using the k8s api server is an implementation detail. We have even begun using file based backends for storing configs.
The cloudfoundry folks are not going to be using the api server either. I am trying to avoid the situation where we appear to be forcing kubernetes on everyone even if they have clearly made a decision to not do so (esp since we claim cross platform).
If I don’t want auth, all I need is pilot and consul/eureka/cf. And I am very reluctant to grant write permissions to pilot especially in large deployments. I would much rather keep this code isolated and strictly monitored, like the sync agent or the node ca agent. From a separation of concerns perspective, it seems unclean to overload the routing engine with service registration.
On Fri, Nov 17, 2017 at 2:44 AM Costin Manolache notifications@github.com wrote:
Yes - but to avoid confusion, the Service object must be registered with Istio - not the endpoints. The normal 'mesh expansion' solution for Istio 2.0 requires registering the endpoints as well.
On Thu, Nov 16, 2017 at 12:31 PM, kaizen2017 notifications@github.com wrote:
@costinm https://github.com/costinm : Thanks
Is it fair to say,
- For VM only setup (with standalone k8s api server), if mTLS needs to be enabled for the cluster, and if consul is used for endpoint discovery/registration we will need to manually do service registration at both places k8s api server and consul, till the time a generic-syncer solution is available ?
Just write a simple shell script that dumps services from consul service catalog and creates services of same name in the api server. A more robust option is to write an agent that watches consul and creates the equivalent in the api server.
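A rough version of such a script, written in Python rather than shell, might look like this. It is a sketch under several assumptions: the annotation key is a placeholder (check the Istio documentation for the key Pilot actually reads), the port and namespace are made up, and the Consul address is the local agent default. `GET /v1/catalog/services` is Consul's catalog endpoint and returns a map of service name to tags. Kubernetes Service names must be DNS labels, so Consul names containing underscores are rewritten.

```python
import json
import re
import urllib.request

# Placeholder annotation key, NOT a real Istio key.
SA_ANNOTATION = "example.com/service-accounts"


def k8s_name(consul_name):
    """Rewrite a Consul service name into a DNS-label-safe name."""
    name = re.sub(r"[^a-z0-9-]", "-", consul_name.lower()).strip("-")
    return name[:63]


def build_service_manifests(catalog, port=80, namespace="default"):
    """Turn a Consul catalog map (name -> tags) into Service manifests."""
    manifests = []
    for consul_name in sorted(catalog):
        manifests.append({
            "apiVersion": "v1",
            "kind": "Service",
            "metadata": {
                "name": k8s_name(consul_name),
                "namespace": namespace,
                "annotations": {SA_ANNOTATION: consul_name},
            },
            # No selector: endpoints still come from Consul.
            "spec": {"ports": [{"name": "http", "port": port}]},
        })
    return manifests


def fetch_catalog(consul_addr="http://127.0.0.1:8500"):
    """Read the service catalog from a Consul agent (needs a live agent)."""
    with urllib.request.urlopen(consul_addr + "/v1/catalog/services") as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Sample catalog; replace with fetch_catalog() against a real agent.
    catalog = {"service_7711": [], "consul": []}
    items = build_service_manifests(catalog)
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items},
                     indent=2))
```

The printed list can be piped into `kubectl apply -f -`. Note that the renaming step means the k8s name and the Consul name can differ, which is one of the places where a one-shot script is weaker than an agent that watches Consul.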
I chatted with @costinm a bit about this. Here is what I think should happen as well as some general guidance for how to mentally model these issues going forward.
Consul is a service registry and should act as such. If a customer wants secure naming for endpoints that are only registered in Consul then the necessary authority information must either be (a) stored in Consul in a reliable way or (b) stored by Istio in a reliable way that maps to the endpoint.
Istio is not a service registry, it consumes many of them, K8S, Consul etc. Istio declares configuration that refers to services declared by service registries and asks that service & endpoint registries assert specific pieces of information in a secure way.
By virtue of shipping with the K8S API server Istio ships a service registry in-the-box so to speak but people are not required to use it and we could in theory even turn that off at some point.
The Consul plugin for Pilot SHOULD support secure naming by returning the authority associated with a registered endpoint (workload). How that is mapped out of Consul (or a combination of Consul and Vault) I don't have a strong opinion about. @kidiyoor can you iterate with @myidpt to figure out how to do this - I suggest taking a look at how this works in Nomad or https://github.com/Verizon/nelson
@myidpt If a service registry cannot reliably assert the detailed authority of a workload then the security of mTLS is degraded but still useful. E.g. We could for instance assert that all endpoints registered in Consul belong to a common root of trust and validate that in the client. We don't get full value of Istio Auth from this but we get a non-trivial amount that would be valuable as part of an incremental migration strategy.
@rshriram I agree that we should allow the Istio API to be backed by different stores. I do think we will need the Istio API to be available to augment service metadata when a service registry cannot be made to meet a requirement. I don't think this is the case here.
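For option (a) above, storing the authority information in Consul itself, one possibility is the Consul KV store. The sketch below decodes the body returned by Consul's `GET /v1/kv/<key>` endpoint, which is a JSON list whose `Value` fields are base64-encoded. The key layout `istio/service-accounts/<service>` is purely hypothetical, and the sample body is built locally rather than fetched from a real agent.

```python
import base64
import json


def decode_kv_response(body):
    """Map each returned key to its decoded string value."""
    result = {}
    for entry in json.loads(body):
        raw = entry.get("Value") or ""  # Consul returns null for empty values
        result[entry["Key"]] = base64.b64decode(raw).decode("utf-8")
    return result


# Locally constructed sample in the shape Consul returns.
identity = "spiffe://cluster.local/ns/default/sa/service_7711"
sample_body = json.dumps([{
    "Key": "istio/service-accounts/service_7711",  # hypothetical key layout
    "Value": base64.b64encode(identity.encode("utf-8")).decode("ascii"),
}])

print(decode_kv_response(sample_body))
```

Storing identities this way is only as trustworthy as the write access to that KV prefix, which is the "in a reliable way" caveat in the comment above.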
I think everyone is pretty much agreeing on almost everything - the apiserver is an implementation detail, we do want to support and integrate with as many environments as possible, and we want to use the native configurations and standards - where available. And I would love to see a pure file-system environment.
My point was that as of now (Istio 0.2 / 0.3 and what is on the current roadmap for 0.4), running an apiserver is the safest option. It doesn't even have to be the k8s apiserver: it can be a REST server exposing the same URLs and objects (mostly CRDs). We're just relying on the auth and basic REST concepts to load our configs.
There are very few objects we adopt and extend from K8S - like Service, Service Account, Secret - if equivalent concepts exist in Consul (or other env) we can translate, but this code is not yet available, and what is currently stable and tested is having the k8s objects and the apiserver.
As Shriram mentioned, a simple script or tool to import is not hard - and we can also have some 'default' account so at least we never get 'null'.
Integrating with Vault will be very nice and improve a lot of things - the current provisioning of certificates to VM is far from ideal, and access to private keys is very sensitive, I'm pretty uncomfortable with reading the keys from k8s and copying them to the VM, but that's what we have now.
I talked with @louiscryan offline. We will try to solve two things:
sure @myidpt . let's catch up offline
On Sat, Nov 18, 2017 at 12:42 AM Louis Ryan notifications@github.com wrote:
I chatted with @costinm https://github.com/costinm a bit about this. Here is what I think should happen as well as some general guidance for how to mentally model these issues going forward.
Consul is a service registry and should act as such. If a customer wants secure naming for endpoints that are only registered in Consul then the necessary authority information must either be (a) stored in Consul in a reliable way or (b) stored by Istio in a reliable way that maps to the endpoint.
Istio is not a service registry, it consumes many of them, K8S, Consul etc. Istio declares configuration that refers to services declared by service registries and asks that service & endpoint registries assert specific pieces of information in a secure way.
By virtue of shipping with the K8S API server Istio ships a service registry in-the-box so to speak but people are not required to use it and we could in theory even turn that off at some point.
The problem here or in other places is the dual headache of supporting two service registries (etcd k8s and Consul). Of all the objections, this is the one I am most concerned about. Managing a Consul cluster is a pain on its own, but if an organization has accumulated experience and expertise in doing so, it’s best to piggyback on that instead of imposing yet another distributed data store.
The Consul plugin for Pilot SHOULD support secure naming by returning the authority associated with a registered endpoint (workload). How that is mapped out of Consul (or a combination of Consul and Vault) I don't have a strong opinion about. @kidiyoor can you iterate with @myidpt to figure out how to do this - I suggest taking a look at how this works in Nomad (if at all)
@myidpt https://github.com/myidpt If a service registry cannot reliably assert the detailed authority of a workload then the security of mTLS is degraded but still useful. E.g. We could for instance assert that all endpoints registered in Consul belong to a common root of trust and validate that in the client. We don't get full value of Istio Auth from this but we get a non-trivial amount that would be valuable as part of an incremental migration strategy.
@rshriram https://github.com/rshriram I agree that we should allow the Istio API to be backed by different stores. I do think we will need the Istio API to be available to augment service metadata when a service registry cannot be made to meet a requirement. I don't think this is the case here.
I am in full agreement on this. We would need this in eureka plus VMs or docker swarm etc.
My point is that our api server’s services do not fully align with our service model. So it kind of feels like double transformation. If we were creating CRDs, that’s a different thing.
Secondly, I am trying to keep the readers and writers separate for scaling, management and security purposes.
I have worked with kidiyoor@ to draft the design doc: https://goo.gl/Dt11Ct. Please take a look and comment. Thanks!
Here is my setup (non kubernetes setup) -
VM-1 (ubuntu:14.04) - etcd2 | apiserver (from Kube 1.7.3) | pilot (master branch) | istio CA (master branch) | consul
VM-2 (ubuntu:14.04) - node_agent | envoy (init by pilot-agent) | consul agent | simple_http_service_1
VM-3 (ubuntu:14.04) - node_agent | envoy (init by pilot-agent) | consul agent | simple_http_service_2
I noticed - the key rotation works fine.
When I enable mTLS from mesh config -
authPolicy: MUTUAL_TLS
and restart pilot, I see the following error thrown by envoy:
[2017-11-03 20:22:44.068][5005][warning][upstream] external/envoy/source/common/upstream/cds_subscription.cc:65] cds: fetch failure: JSON at lines 3-15 does not conform to schema. Invalid schema: #/definitions/ssl_context/properties/verify_subject_alt_name Schema violation: type Offending document key: #/ssl_context/verify_subject_alt_name
Any help would be appreciated :)