fabric8io / kubeflix

Kubernetes integration with Netflix OSS
266 stars 69 forks source link

improvement to the pod selection logic #78

Closed raffaelespazzoli closed 8 years ago

raffaelespazzoli commented 8 years ago

Hi, currently it is assumed that the pods that are hystrix stream sources will have the hystrix.enabled label. This may impose unnecessary constraints on the source pods. There could also be situations where multiple turbine serves need to be run in the same kubernetes cluster and there would be no way to specify which pods each turbine server should connect to. I'd like to propose the following improvements:

  1. ability to specify a selector (as in Kubernetes' pod selector) to select the hystrix source pods. We can still default to hystrix.enabled if no selector is specified. A selector can match on multiple labels.
  2. ability to specify an hystrix cluster for the defined selector. If no cluster is specified that the cluster name could be the pod name (probably more discussion needed here).
  3. ability to specify multiple selectors. With this multiple groups of pods would be selected and would report to the same turbine server.
  4. ability to specify a different URL suffix for each selector.

I should be able to provide 1 and 2 based on my previous project. 3 is not urgent for me and my current endeavor but I think it would be useful in the long term 4 goes together with 3 but I'm afraid there is no easy way to do it unless upgrading to turbine 2.x, which wasn't very mature last time I looked at it.

iocanel commented 8 years ago

Here are some thoughts:

I totally agree that we need more fine grained control. I would rather we avoided using full blown selector for matching and just focused on hystrix.cluster label. Mostly because of the way turbine is configured.

So what I have in mind is something like:

  1. out of the box matching is done using hysterix.enabled and the matching is namespace bound (in the current namespace).
  2. a pod may optionally use hysterix.cluster as a label if more narrow control is required.

From the server side of things:

  1. The server is using ootb the current namespace.
  2. If the above is not desired one or more namespaces can be explicitly specified via configuration.
  3. One or more hystrix.cluster can be specified.

What I describe is already in the master, so you can try it out and let me know what you think. Its pretty much what you describe but instead of selector we leverage the hysterix.cluster

raffaelespazzoli commented 8 years ago

@iocanel, I see your point. I think the two approaches are essentially equivalent in terms of fine tuning the pod selection. The only difference I'd point out is that the approach you describe involves labeling the hystrix source pods. While the approach I described only involves configuring the turbine pod and therefore can be done after the fact.

iocanel commented 8 years ago

@raffaelespazzoli: Interesting point! I like what you propose better! Let me see how we could configure turbine to do something like that.

iocanel commented 8 years ago

@raffaelespazzoli: Some additional thoughts.

Turbine uses the notion of a cluster which is identified by a string value. This is not some optional value, but a value that needs to be provided by the discovery mechanism.

So if we follow your suggestion, we would need to specify a "selector" and also a "name". But wouldn't that be the same with a Kubernetes Service? So maybe we should just specify the name of a Kubernetes Service and use its Endpoints for creating instances.

@jstrachan, @christian-posta: wdyt?

raffaelespazzoli commented 8 years ago

@iocanel summarizing you are proposing the following mapping of concepts between kubernetes and turbine:

  1. kubernetes service pod selector <-> turbine pod selector
  2. kubernetes service name <-> turbine cluster name

I think this is very nice, I'd go for it. It think we should allow to reference services that are outside the namespace turbine is running in.

Now the next question is: should we allow turbine to aggregate data from more than one service? This is essentially my initial point number 3. I think this should be supported.

Thanks Raffaele

iocanel commented 8 years ago

Turbine can handle more than one clusters, but each cluster is aggregated individually. So we should be able to specify multiple services and let turbine do what it does when multiple clusters are configured.

raffaelespazzoli commented 8 years ago

@iocanel could you post an example of how that configuration would look like?

iocanel commented 8 years ago

So, now that we support multiple ways of configuring turbine, we could either specify the config via env vars or configmap.

Using env vars we would just need to specify TURBINE_AGGREGATOR_CLUSTERCONFIG=service1, service2, service3.

or using ConfigMap:

kind: ConfigMap
apiVersion: v1
metadata:
  name: turbine-server
data:
  application.yml: |
    turbine.instanceUrlSuffix: 8080/hystrix.stream
    turbine.aggregator.clusterConfig: service1, service2, service3
iocanel commented 8 years ago

Everything should be in place now.

I also added a small spock test that should make it easy to understand how it works and what's going on under the hood: https://github.com/fabric8io/kubeflix/blob/master/turbine-discovery/src/test/groovy/io/fabric8/kubeflix/turbine/TurbineDiscoveryTest.groovy

raffaelespazzoli commented 8 years ago

thanks @iocanel for these new features. I'm trying them and having some issues. Here is how I configured my turbine-server pod 1.0.16:

        - name: TURBINE_INSTANCEURLSUFFIX
          value: ":8080/mbl/hystrix.stream"
        - name: TURBINE_AGGREGATOR_CLUSTERCONFIG
          value: mbl 

where mbl is the name of the kubernetes service that controls the hystrix enabled pods. These pods expose the hystrix stream at :8080/mbl/hystrix.stream. Notice the colon, it's not there in your doc, but I think it's needed. Also notice that I'm using TURBINE_INSTANCEURLSUFFIX and not TURBINE_INSTANCE_URL_SUFFIX as the documentation suggest, I assume that the former is how spring expects it. I tryied both with no result.

The tubine server returns zero hosts found:

2016-05-06 13:51:21.687  INFO [turbine-server,,,] 1 --- [        Timer-0] c.n.t.discovery.InstanceObservable       : Retrieved hosts from InstanceDiscovery: 0
199 2016-05-06 13:51:21.688  INFO [turbine-server,,,] 1 --- [        Timer-0] c.n.t.discovery.InstanceObservable       : Found hosts that have been previously terminated: 0
200 2016-05-06 13:51:21.688  INFO [turbine-server,,,] 1 --- [        Timer-0] c.n.t.discovery.InstanceObservable       : Hosts up:0, hosts down: 0

I also labeled my pods with hystrix.enabled=true, but with no apparent result. Is there anything I'm doing wrong, how can I solve the problem?

Thanks Raffaele

iocanel commented 8 years ago

@raffaelespazzoli:

regarding the the missing coln, you are right. it should be there. you are also right the the spring way of doing things would be using TURBINE_INSTANCEURLSUFFIX

I'll update the docs, check again and add even more tests.

raffaelespazzoli commented 8 years ago

will TURBINE_INSTANCEURLSUFFIX need some special handling? How does spring know where to put the uppercase letters to get to what turbine expects: turbine.instanceUrlSuffix

thanks Raffaele

iocanel commented 8 years ago

no special handling is needed I think that spring under the hood just converts to uppercase and replaces dots with underscores.

raffaelespazzoli commented 8 years ago

@iocanel I'm looking at the turbine discovery logic. I see logic to select pods with hystrix.enabled=true or pods that have a label hystrix.cluster if the turbine cluster property has been initialized. I thought the design was that if the turbine cluster property is initialized those cluster names would be interpreted as kubernetes services names and turbine would discover all the pods selected by those services. With the approach we have now it is still necessary to modify the labels of the pods that are hystrix source.

Anyway I'm now just trying to use hystrix.enabled as the mechanism to discovery the pods and turbine always return an empty set.

Is there anyway we could get on a call and take a look at this issue together? I work at KeyBank and we are one of the first OpenShift customers.

Thanks Raffaele

iocanel commented 8 years ago

@raffaelespazzoli:

if you check at https://github.com/fabric8io/kubeflix/blob/e39c223a46efe82013b244b30b7a769392a4e553/turbine-discovery/src/main/java/io/fabric8/kubeflix/turbine/TurbineDiscovery.java#L44 you'll see that if one or more clusters have been specified then it looks for endpoints named after the cluster. if not it searches for hystrix.enabled endpoints in the specified namespace. Note that hystrix.enabled should only be required in the later case.

raffaelespazzoli commented 8 years ago

@iocanel

I can't find a good javadoc for the kubernetes client api, what is an "endpoint" in that api? Pods or Services?

Based on our conversations my expectation on the behavior is the following:

  1. if turbine.aggregator.namespaceConfig is set look in those namespaces, otherwise look in the current namespace.
  2. if turbine.aggregator.clusterConfig is set, then consider those names services. Look at the selector of each of those services, select the same pods and aggregate them in turbine clusters each based on the service of provenance. this way kubernetes service name = turbine cluster name. This is the use case we it is not necessary add any label on the observed pods.
  3. if turbine.aggregator.clusterConfig is not set select those pods that have hystrix.enabled. Assign them the cluster name equal to value of the label 'hystrix.cluster` or default cluster if the label is absent.
  4. append turbine.instanceUrlSuffix for the full hystrix stream url.

I am not familiar with lambda style programming, so I'm not sure what is happening in the code. Also, I set the log at DEBUG, but the code doesn't log anything. I'd like to see what query is being issued against the kubernetes api to get the pods.

I'll try to ask again, is there a way we can work on this together? Should I have this request come from our RedHat contacts at KeyBank?

Thanks Raffaele

iocanel commented 8 years ago

@raffaelespazzoli: No need to go through any official channels. If not obvious we welcome collaboration from anyone regardless of the workplace ;-)

Next week I will have limited availability (travelling and meetings) and our best chance is on Monday. I think that we should get online on irc and from there schedule a call.

iocanel commented 8 years ago

I was going through the points form 1 to 4 and I don't like the fact that there is a lot of mapping involved.

In one case we map services to clusters, in the other we map namespaces to clusters. This alone is enough to cause headache and its also somehow limiting (doesn't allow a cluster of multiple services).

I was wondering if it made sense to be able to define exactly what a cluster should contain (combination of services and namespaces).

raffaelespazzoli commented 8 years ago

@iocanel not sure what you mean by irc, I sent you an invite in LinkedIn we can connect from there.

Regarding your comment: in my mind there is only one mapping: kubernetes services to turbine cluster.

Then namespace is a different subject, it is where should the services should be looked for. Namespaces can be used for a lot of different purposes in kubernetes, but in OpenShift they assume the meaning of projects. A project may have multiple microservices clusters and each could be mapped to a turbine cluster. A solution can be deployed across multiple projects, but there could be the requirement to aggregate all the data of a solution in single hystrix dashborard. In conclusion, for me, no mapping should be done with namespaces. Namespaces simply indicate where to look for. In my initial proposal I was suggesting to list fully qualified services, that is .. This to me still makes the most sense. So, it would look something like turbine.aggregator.clusterConfig=cluster1, ns1.cluster2, ns3.cluster3. Notice that the first service is unqualified, it would imply to use the namespace where the turbine pod is currently deployed.

thanks Raffaele

iocanel commented 8 years ago

By irc I meant a chat which could allow a somewhat more direct way of communication than github or linkedin. See: http://fabric8.io/community/

There is the mapping of services to clusters, but if no services are specified the namespace is used as cluster name and that is confusing.

I completely missed your original point on fully qualified and I totally agree that this is the way to go. I think though that using something like turbine.aggregator.clusterConfig=cluster1, ns1.cluster2, ns3.cluster3 doesn't cover for it, as turbine would interpret that as multiple different clusters and not as a single that aggregates multiple fully qualified services.

So I would like to counter propose something like this:

We leave turbine.aggregator.clusterConfig as is (just cluster names and nothing else. We use an additional property to define what fully qualified services are part of which cluster. So it could be something like:

turbine.aggregator.clusterConfig=foo,bar,baz
turbine.clusters.foo=servicefoo, ns1.servicefoo, ns2.servicefoo
turbine.clusters.bar=servicebar
turbine.cluster.baz=ns1.servicebaz
iocanel commented 8 years ago

I added a pull request with those ideas. I will be around for the next couple of hours, so we could talk about that stuff.

raffaelespazzoli commented 8 years ago

ok looks good to me. I'll try it out as soon as the PR hits the master branch.

Thanks Raffaele

raffaelespazzoli commented 8 years ago

@iocanel

A couple of updates, I built project locally from your project and misc branch and deployed in our cluster. I’m using the following configuration for the turbine-serve pod:

     - name: TURBINE_AGGREGATOR_CLUSTERSCONFIG
       value: mbl  
     - name: TURBINE_AGGREGATOR_CLUSTERS_MBL
       value: digitalchannel.mbl 

I see from the log that turbine finds two pods as expected, but I can’t display them in the hystrix console I tried with both /turbine.stream (I get an empty response) and turbine.stream?cluster=mbl (I get cluster not found).

I also found out why the hystrix.enabled configuration had never worked for me. This label must be set on the service not on the pods as I was doing. I think the documentation is unclear on this point. I also am not sure that you made the right design decision here. In fact a pod is really what is hystrix enabled and in theory you can have a pod without a service. Although I admit this would be an edge case.

Thanks Raffaele

raffaelespazzoli commented 8 years ago

@iocanel

a couple of updates:

On the hystrix.enabled approach. The pod selection seems to not be scoped by the namespace in which turbine is running. In fact it returns all the hystrix.enabled services of the OpenShift/Kubernetes cluster. This makes things very difficult/confusing to manage when there are several hystrix enabled projects or the same projects is deployed multiple times for multiple purposes. This design forces to use the other approach for all situations where some type of scoped selection is needed. I think this approach should be scoped by default.

On the advanced pod selection, I made some progress and tested with the following configurations:

        - name: TURBINE_AGGREGATOR_CLUSTERCONFIG
          value: mbl  
        - name: TURBINE_AGGREGATOR_CLUSTER_MBL_SERVICES
          value: mbl 

and

        - name: TURBINE_AGGREGATOR_CLUSTERCONFIG
          value: mbl  
        - name: TURBINE_AGGREGATOR_CLUSTER_MBL_SERVICES
          value: digitalchannel.mbl 

The documentation says that the property to configure a cluster is turbine.aggregator.clusters.<cluster name> but then the environment variable for the same property is TURBINE_AGGREGATOR_CLUSTER_<CLUSTER NAME>_SERVICES this was confusing and took me sometimes to figure out, is there a reason?

Both the above configurations return 4 hosts when they should return only two hosts. I don't know what the reason is. In both cases I can connect to the mbl cluster in the hystrix console, but no data is displayed, even though the nodes are producing data. So turbine seems to be finding the nodes (too many) but it looks as it is not assigning them to the correct turbine cluster. Finally a question, the documentation is unclear on whether configuration number 1 is supported. In this case in the services list I didn't specify the namespace. What is the behavior in that case? My expectation is that the current namespace will be used. This is very important because it allows to write an environment independent yaml file, i.e. the yaml file remains the same as the solution traverses the environments to reach production.

Thanks Raffaele

iocanel commented 8 years ago

@raffaelespazzoli: for scoping of the current namespace to work, we need to tell the client which is the namespace. Usually we do that by adding the namespace as an env var using a field reference.

  - env:
    - name: KUBERNETES_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
iocanel commented 8 years ago

@raffaelespazzoli: So, turbine internally allows and expects each cluster to define its own suffix. So if you define a cluster (e.g. mycluster) turbine accepts to find a property called turbine.instanceUrlSuffix.mycluster that defines the suffix. If the property is missing it will just barf.

I added that property locally in my tests and everything seems to work fine now.

raffaelespazzoli commented 8 years ago

we can close this issue for me.