Here are some thoughts:
I totally agree that we need more fine-grained control. I would rather we avoided using a full-blown selector for matching and just focused on the hystrix.cluster label, mostly because of the way turbine is configured.
So what I have in mind is something like:
- hystrix.enabled, and the matching is namespace bound (in the current namespace).
- hystrix.cluster as a label if more narrow control is required. From the server side of things, hystrix.cluster can be specified.
What I describe is already in master, so you can try it out and let me know what you think.
It's pretty much what you describe, but instead of a selector we leverage the hystrix.cluster label.
@iocanel, I see your point. I think the two approaches are essentially equivalent in terms of fine tuning the pod selection. The only difference I'd point out is that the approach you describe involves labeling the hystrix source pods, while the approach I described only involves configuring the turbine pod and can therefore be done after the fact.
@raffaelespazzoli: Interesting point! I like what you propose better! Let me see how we could configure turbine to do something like that.
@raffaelespazzoli: Some additional thoughts.
Turbine uses the notion of a cluster which is identified by a string value. This is not some optional value, but a value that needs to be provided by the discovery mechanism.
So if we follow your suggestion, we would need to specify a "selector" and also a "name". But wouldn't that be the same as what a Kubernetes Service already gives us? So maybe we should just specify the name of a Kubernetes Service and use its Endpoints for creating instances.
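To make that concrete, a rough sketch of the idea (the names below are just placeholders): the Service name would become the turbine cluster name, and the pods behind it (its Endpoints) would become the turbine instances.
kind: Service
apiVersion: v1
metadata:
  name: servicefoo            # placeholder; this name would become the turbine cluster name
spec:
  selector:
    app: foo                  # the pods matched here (the service Endpoints) would become
                              # the turbine instances of that cluster
  ports:
    - port: 8080              # placeholder port for the hystrix-enabled pods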
@jstrachan, @christian-posta: wdyt?
@iocanel summarizing, you are proposing the following mapping of concepts between kubernetes and turbine: a kubernetes service maps to a turbine cluster, and the service's endpoints (pods) map to the turbine instances.
I think this is very nice, I'd go for it. I think we should allow referencing services that are outside the namespace turbine is running in.
Now the next question is: should we allow turbine to aggregate data from more than one service? This is essentially my initial point number 3. I think this should be supported.
Thanks Raffaele
Turbine can handle more than one cluster, but each cluster is aggregated individually. So we should be able to specify multiple services and let turbine do what it does when multiple clusters are configured.
@iocanel could you post an example of what that configuration would look like?
So, now that we support multiple ways of configuring turbine, we could either specify the config via env vars or configmap.
Using env vars we would just need to specify TURBINE_AGGREGATOR_CLUSTERCONFIG=service1, service2, service3.
or using ConfigMap:
kind: ConfigMap
apiVersion: v1
metadata:
  name: turbine-server
data:
  application.yml: |
    turbine.instanceUrlSuffix: 8080/hystrix.stream
    turbine.aggregator.clusterConfig: service1, service2, service3
Everything should be in place now.
I also added a small spock test that should make it easy to understand how it works and what's going on under the hood: https://github.com/fabric8io/kubeflix/blob/master/turbine-discovery/src/test/groovy/io/fabric8/kubeflix/turbine/TurbineDiscoveryTest.groovy
Thanks @iocanel for these new features. I'm trying them and having some issues. Here is how I configured my turbine-server pod (version 1.0.16):
- name: TURBINE_INSTANCEURLSUFFIX
  value: ":8080/mbl/hystrix.stream"
- name: TURBINE_AGGREGATOR_CLUSTERCONFIG
  value: mbl
where mbl is the name of the kubernetes service that selects the hystrix-enabled pods. These pods expose the hystrix stream at :8080/mbl/hystrix.stream. Notice the colon: it's not there in your doc, but I think it's needed.
Also notice that I'm using TURBINE_INSTANCEURLSUFFIX and not TURBINE_INSTANCE_URL_SUFFIX as the documentation suggests; I assume the former is how spring expects it. I tried both with no result.
The turbine server returns zero hosts found:
2016-05-06 13:51:21.687 INFO [turbine-server,,,] 1 --- [ Timer-0] c.n.t.discovery.InstanceObservable : Retrieved hosts from InstanceDiscovery: 0
2016-05-06 13:51:21.688 INFO [turbine-server,,,] 1 --- [ Timer-0] c.n.t.discovery.InstanceObservable : Found hosts that have been previously terminated: 0
2016-05-06 13:51:21.688 INFO [turbine-server,,,] 1 --- [ Timer-0] c.n.t.discovery.InstanceObservable : Hosts up:0, hosts down: 0
I also labeled my pods with hystrix.enabled=true, but with no apparent result.
Is there anything I'm doing wrong? How can I solve the problem?
Thanks Raffaele
@raffaelespazzoli:
regarding the missing colon, you are right, it should be there.
you are also right that the spring way of doing things would be using TURBINE_INSTANCEURLSUFFIX.
I'll update the docs, check again and add even more tests.
Will TURBINE_INSTANCEURLSUFFIX need some special handling? How does spring know where to put the uppercase letters to get to what turbine expects, turbine.instanceUrlSuffix?
Thanks Raffaele
No special handling is needed; I think that spring under the hood just converts to uppercase and replaces dots with underscores.
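For example, the env var below should end up setting the turbine.instanceUrlSuffix property (a sketch, assuming spring's relaxed binding works as described above):
# env var form, as set on the turbine-server container; spring should map this
# to the turbine.instanceUrlSuffix property
- name: TURBINE_INSTANCEURLSUFFIX
  value: ":8080/hystrix.stream"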
@iocanel
I'm looking at the turbine discovery logic. I see logic to select pods with hystrix.enabled=true, or pods that have a hystrix.cluster label if the turbine cluster property has been initialized.
I thought the design was that if the turbine cluster property is initialized, those cluster names would be interpreted as kubernetes service names and turbine would discover all the pods selected by those services.
With the approach we have now it is still necessary to modify the labels of the pods that are hystrix sources.
Anyway, I'm now just trying to use hystrix.enabled as the mechanism to discover the pods, and turbine always returns an empty set.
Is there any way we could get on a call and take a look at this issue together? I work at KeyBank and we are one of the first OpenShift customers.
Thanks Raffaele
@raffaelespazzoli:
if you check https://github.com/fabric8io/kubeflix/blob/e39c223a46efe82013b244b30b7a769392a4e553/turbine-discovery/src/main/java/io/fabric8/kubeflix/turbine/TurbineDiscovery.java#L44 you'll see that if one or more clusters have been specified, it looks for endpoints named after the cluster; if not, it searches for hystrix.enabled endpoints in the specified namespace. Note that hystrix.enabled should only be required in the latter case.
@iocanel
I can't find a good javadoc for the kubernetes client api; what is an "endpoint" in that api? Pods or Services?
Based on our conversations my expectation on the behavior is the following:
1. If turbine.aggregator.namespaceConfig is set, look in those namespaces; otherwise look in the current namespace.
2. If turbine.aggregator.clusterConfig is set, then consider those names services. Look at the selector of each of those services, select the same pods, and aggregate them into turbine clusters, each based on the service of provenance. This way kubernetes service name = turbine cluster name. This is the use case where it is not necessary to add any label on the observed pods (a rough sketch is below).
3. If turbine.aggregator.clusterConfig is not set, select those pods that have hystrix.enabled. Assign them a cluster name equal to the value of the hystrix.cluster label, or the default cluster if the label is absent.
4. Use turbine.instanceUrlSuffix for the full hystrix stream url.
I am not familiar with lambda style programming, so I'm not sure what is happening in the code. Also, I set the log at DEBUG, but the code doesn't log anything. I'd like to see what query is being issued against the kubernetes api to get the pods.
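To be concrete, here is the rough sketch mentioned above of what I'd expect to work for points 2 and 4 (the service names are hypothetical):
- name: TURBINE_AGGREGATOR_CLUSTERCONFIG
  value: service1, service2          # kubernetes service names, each becoming a turbine cluster
- name: TURBINE_INSTANCEURLSUFFIX
  value: ":8080/hystrix.stream"      # appended to every discovered instance to build the stream url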
I'll try to ask again: is there a way we can work on this together? Should I have this request come from our Red Hat contacts at KeyBank?
Thanks Raffaele
@raffaelespazzoli: No need to go through any official channels. If not obvious we welcome collaboration from anyone regardless of the workplace ;-)
Next week I will have limited availability (travelling and meetings) and our best chance is on Monday. I think that we should get online on irc and from there schedule a call.
I was going through the points from 1 to 4 and I don't like the fact that there is a lot of mapping involved.
In one case we map services to clusters, in the other we map namespaces to clusters. This alone is enough to cause headaches and it's also somewhat limiting (it doesn't allow a cluster of multiple services).
I was wondering if it made sense to be able to define exactly what a cluster should contain (combination of services and namespaces).
@iocanel not sure what you mean by irc; I sent you an invite on LinkedIn, we can connect from there.
Regarding your comment: in my mind there is only one mapping: kubernetes services to turbine cluster.
Then namespace is a different subject: it is where the services should be looked for. Namespaces can be used for a lot of different purposes in kubernetes, but in OpenShift they assume the meaning of projects. A project may have multiple microservices clusters, and each could be mapped to a turbine cluster. A solution can be deployed across multiple projects, but there could be the requirement to aggregate all the data of a solution in a single hystrix dashboard. In conclusion, for me, no mapping should be done with namespaces; namespaces simply indicate where to look.
In my initial proposal I was suggesting to list fully qualified services, that is turbine.aggregator.clusterConfig=cluster1, ns1.cluster2, ns3.cluster3. Notice that the first service is unqualified; it would imply using the namespace where the turbine pod is currently deployed.
thanks Raffaele
By irc I meant a chat which could allow a somewhat more direct way of communication than github or linkedin. See: http://fabric8.io/community/
There is the mapping of services to clusters, but if no services are specified the namespace is used as the cluster name, and that is confusing.
I completely missed your original point on fully qualified services, and I totally agree that this is the way to go. I think though that using something like turbine.aggregator.clusterConfig=cluster1, ns1.cluster2, ns3.cluster3 doesn't cover it, as turbine would interpret that as multiple different clusters and not as a single cluster that aggregates multiple fully qualified services.
So I would like to counter propose something like this:
We leave turbine.aggregator.clusterConfig as is (just cluster names and nothing else).
We use an additional property to define what fully qualified services are part of which cluster. So it could be something like:
turbine.aggregator.clusterConfig=foo,bar,baz
turbine.clusters.foo=servicefoo, ns1.servicefoo, ns2.servicefoo
turbine.clusters.bar=servicebar
turbine.clusters.baz=ns1.servicebaz
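In ConfigMap form that could look something like this (cluster and service names are of course placeholders):
kind: ConfigMap
apiVersion: v1
metadata:
  name: turbine-server
data:
  application.yml: |
    turbine.instanceUrlSuffix: ":8080/hystrix.stream"
    turbine.aggregator.clusterConfig: foo,bar,baz
    turbine.clusters.foo: servicefoo, ns1.servicefoo, ns2.servicefoo
    turbine.clusters.bar: servicebar
    turbine.clusters.baz: ns1.servicebaz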
I added a pull request with those ideas. I will be around for the next couple of hours, so we could talk about that stuff.
ok looks good to me. I'll try it out as soon as the PR hits the master branch.
Thanks Raffaele
@iocanel
A couple of updates: I built the project locally from your project (misc branch) and deployed it in our cluster. I'm using the following configuration for the turbine-server pod:
- name: TURBINE_AGGREGATOR_CLUSTERSCONFIG
  value: mbl
- name: TURBINE_AGGREGATOR_CLUSTERS_MBL
  value: digitalchannel.mbl
I see from the log that turbine finds two pods as expected, but I can't display them in the hystrix console. I tried both /turbine.stream (I get an empty response) and /turbine.stream?cluster=mbl (I get "cluster not found").
I also found out why the hystrix.enabled configuration had never worked for me: this label must be set on the service, not on the pods as I was doing. I think the documentation is unclear on this point. I am also not sure that you made the right design decision here; in fact, a pod is really what is hystrix enabled, and in theory you can have a pod without a service, although I admit this would be an edge case.
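For reference, roughly what that ended up looking like on my service (the selector and port are just placeholders):
kind: Service
apiVersion: v1
metadata:
  name: mbl
  labels:
    hystrix.enabled: "true"    # discovery picks this up from the service, not from the pods
spec:
  selector:
    app: mbl                   # placeholder pod selector
  ports:
    - port: 8080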
Thanks Raffaele
@iocanel
a couple of updates:
On the hystrix.enabled approach: the pod selection seems to not be scoped by the namespace in which turbine is running. In fact it returns all the hystrix.enabled services of the OpenShift/Kubernetes cluster. This makes things very difficult/confusing to manage when there are several hystrix-enabled projects, or when the same project is deployed multiple times for multiple purposes. This design forces using the other approach for all situations where some type of scoped selection is needed. I think this approach should be scoped by default.
On the advanced pod selection, I made some progress and tested with the following configurations:
- name: TURBINE_AGGREGATOR_CLUSTERCONFIG
  value: mbl
- name: TURBINE_AGGREGATOR_CLUSTER_MBL_SERVICES
  value: mbl
and
- name: TURBINE_AGGREGATOR_CLUSTERCONFIG
  value: mbl
- name: TURBINE_AGGREGATOR_CLUSTER_MBL_SERVICES
  value: digitalchannel.mbl
The documentation says that the property to configure a cluster is turbine.aggregator.clusters.<cluster name>, but the environment variable for the same property is TURBINE_AGGREGATOR_CLUSTER_<CLUSTER NAME>_SERVICES. This was confusing and took me some time to figure out; is there a reason?
Both of the above configurations return 4 hosts when they should return only two. I don't know what the reason is. In both cases I can connect to the mbl cluster in the hystrix console, but no data is displayed, even though the nodes are producing data. So turbine seems to be finding the nodes (too many of them), but it looks as if it is not assigning them to the correct turbine cluster.
Finally, a question: the documentation is unclear on whether configuration number 1 is supported. In that case I didn't specify the namespace in the services list; what is the behavior? My expectation is that the current namespace will be used. This is very important because it allows writing an environment-independent yaml file, i.e. the yaml file remains the same as the solution traverses the environments to reach production.
Thanks Raffaele
@raffaelespazzoli: for scoping to the current namespace to work, we need to tell the client which namespace it is in. Usually we do that by adding the namespace as an env var using a field reference:
- env:
  - name: KUBERNETES_NAMESPACE
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: metadata.namespace
@raffaelespazzoli: So, turbine internally allows and expects each cluster to define its own suffix. So if you define a cluster (e.g. mycluster), turbine expects to find a property called turbine.instanceUrlSuffix.mycluster that defines the suffix. If the property is missing it will just barf.
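For the mbl cluster from your config, the missing piece would be something along these lines (a sketch in application.yml form; the env var spelling should follow the same relaxed-binding rules as before):
turbine.instanceUrlSuffix.mbl: ":8080/mbl/hystrix.stream"   # per-cluster suffix for the mbl cluster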
I added that property locally in my tests and everything seems to work fine now.
We can close this issue as far as I'm concerned.
Hi, currently it is assumed that the pods that are hystrix stream sources will have the hystrix.enabled label. This may impose unnecessary constraints on the source pods. There could also be situations where multiple turbine servers need to be run in the same kubernetes cluster and there would be no way to specify which pods each turbine server should connect to. I'd like to propose the following improvements:
1. Match the source pods via a selector, defaulting to hystrix.enabled if no selector is specified. A selector can match on multiple labels (see the sketch below).
I should be able to provide 1 and 2 based on my previous project. 3 is not urgent for me and my current endeavor, but I think it would be useful in the long term. 4 goes together with 3, but I'm afraid there is no easy way to do it unless upgrading to turbine 2.x, which wasn't very mature the last time I looked at it.
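To illustrate the first point, a selector-based configuration could look something like this (the property name is made up purely to show the idea; it does not exist today):
- name: TURBINE_AGGREGATOR_SELECTOR        # hypothetical property, for illustration only
  value: "hystrix.enabled=true,hystrix.cluster=payments"   # a selector matching on multiple labels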