Closed dgrandemange closed 3 years ago
It wouldn't be hard to change that and/or make it an option.
ping @iocanel @salaboy and @geoand
Ok, depending on the time I can put into this, I'll consider giving it a try and come back to you.
@dgrandemange you can do the change, but in reality we are using the serviceId in order to do requests between services. In my demo, I used a configuration for Ribbon to not load-balance from the client side, so instead of using the prefix `lb://` I use `https://`. The Kubernetes Service will do the load balancing for you. You can still use the fallbacks, but client-side load balancing will be handled by Kubernetes.
Well, I am already using a serviceId and therefore rely on the OpenShift service to handle load balancing.
Consider the REST service interface below:

```java
public interface TpMinishiftService {

    @RequestMapping("/podinfos")
    PodInfo podinfos();
}
```
My service B provides an implementation of this service, whereas service A uses the Feign client below to invoke it, specifying a serviceId via the `@FeignClient` `name` annotation attribute (using a property placeholder here):
```java
@FeignClient(name = "${service.tp-minishift.name}", fallback = TpMinishiftServiceProxyFallback.class)
public interface TpMinishiftServiceProxy extends TpMinishiftService {

    @Component
    public static class TpMinishiftServiceProxyFallback implements TpMinishiftServiceProxy {

        private static final Logger LOGGER = LoggerFactory.getLogger(TpMinishiftServiceProxyFallback.class);

        @Override
        public PodInfo podinfos() {
            throw new UnsupportedOperationException("Invalid fallback method execution");
        }

        public PodInfo podinfos(Throwable t) {
            LOGGER.warn(t.getMessage(), t);
            return new PodInfo("<unavailable>");
        }
    }
}
```
The Feign client is then autowired into, and invoked from, service A's controller, like this:
```java
@Controller
public class MainController {

    @Autowired
    private TpMinishiftServiceProxy tpMinishiftCli;

    @RequestMapping("/get-podinfos")
    public String getPodinfos(Model model) {
        PodInfo podinfos = tpMinishiftCli.podinfos();
        model.addAttribute("podinfos", podinfos);
        // ...
        return "podinfos-view";
    }
}
```
@dgrandemange I faced a similar problem and it all boils down to `@FeignClient(name = "${service.tp-minishift.name}" ...)`. If you do `@FeignClient(name = "http://${service.tp-minishift.name}" ...)`, does that work? If not, we need to look into how the Feign client is created with Ribbon.
I see no reason why we can't provide a configuration option so that `KubernetesServiceInstance` will return the hostname instead of the IP when `getHost` is called.
@salaboy FYI: prefixing the serviceId with `http://` hasn't fixed it.
Just for the record: performing a test with host verification disabled (following this recipe), the SSL handshake succeeds and service A / service B inter-communication is OK.
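For context, disabling host verification in this kind of setup typically means supplying an accept-all `javax.net.ssl.HostnameVerifier` (for example, OpenFeign's `feign.Client.Default` constructor accepts one). A minimal JDK-only sketch of such a verifier, useful only to confirm that the handshake failure is caused by hostname verification; never ship this to production:

```java
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.SSLSession;

public class TrustAllHostnames {

    // Accept-all verifier: skips server-name validation entirely.
    // Only useful to isolate hostname verification as the cause of
    // the SSL handshake failure.
    public static final HostnameVerifier ACCEPT_ALL =
            (String hostname, SSLSession session) -> true;

    public static void main(String[] args) {
        // The verifier accepts any hostname, even a bare pod IP.
        System.out.println(ACCEPT_ALL.verify("172.17.0.4", null));
    }
}
```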
Ok, I have deployed my service A with a patched version of `org.springframework.cloud.kubernetes.ribbon.KubernetesServerList`, replacing `EndPoint.getIp()` with `EndPoint.getHostname()` when creating `Server` instances.
After invoking the exposed service A, when service A looks for service B, the logs show a server list with `null` for the servers' hostname.
Actually, according to the OpenShift documentation, endpoints in OpenShift are only described by IP address, not by hostname:

```
$ oc describe ep/tp-minishift
Name:         tp-minishift
Namespace:    poc2
Labels:       app=tp-minishift
Annotations:  <none>
Subsets:
  Addresses:          172.17.0.4,172.17.0.5
  NotReadyAddresses:  <none>
  Ports:
    Name      Port  Protocol
    ----      ----  --------
    8443-tcp  8443  TCP
Events:  <none>
```
So, this is not the way to go.
Now, looking at @geoand's comment, I realize I should have tried to patch `org.springframework.cloud.kubernetes.discovery.KubernetesServiceInstance` instead.
@dgrandemange please let us know how it goes if you make a test with a patched `KubernetesServiceInstance`.
@geoand I am not sure that we should do that (replace the IP with the hostname completely); if we do that, we lose the IP information of the service at registration time.
@salaboy I'm not sure I follow :)
Just so you know: I am trying another approach.
In my Feign client, instead of specifying a serviceId, I will try to specify the URL of the targeted OpenShift service. Actually, in OpenShift, a service is also addressable via the internal hostname assigned by OpenShift's internal DNS. For instance, my service `tp-minishift` in namespace/project `poc2` has been assigned the name `tp-minishift.poc2.svc.cluster.local` and should be reachable by any other service through this name. So, in my client service, I should be able to configure the Feign client like this:

`@FeignClient(url = "https://tp-minishift.poc2.svc.cluster.local")`

and should still benefit from the service's native load balancing between its backing pods. The cool thing is, using a hostname here (`tp-minishift.poc2.svc.cluster.local`) should now make the handshake's server name validation possible.
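The name above follows the standard Kubernetes service DNS pattern `<service>.<namespace>.svc.cluster.local`. A tiny self-contained sketch of building that name (the class and method names are illustrative, not part of any library):

```java
public class ServiceDns {

    // Builds the cluster-internal DNS name of a Kubernetes/OpenShift
    // service: <service>.<namespace>.svc.cluster.local
    static String serviceFqdn(String service, String namespace) {
        return service + "." + namespace + ".svc.cluster.local";
    }

    public static void main(String[] args) {
        // e.g. the tp-minishift service in project poc2
        System.out.println(serviceFqdn("tp-minishift", "poc2"));
        // -> tp-minishift.poc2.svc.cluster.local
    }
}
```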
@dgrandemange I found my configuration that looked like this to avoid Ribbon client-side load balancing:
```yaml
spring:
  application.name: gateway
  cloud:
    gateway:
      discovery:
        locator:
          enabled: true
          url-expression: "'http://'+serviceId"
      x-forwarded:
        prefixEnabled: true
    kubernetes:
      reload:
        enabled: true
        mode: polling
        period: 5000
```
@salaboy What I am referring to is to have the option to use the hostname (of the pod) instead of the IP (again, of the pod) in `KubernetesServiceInstance`.
That way, if the pod hostnames match the certificates, then https connections should be possible without trouble.
In this case Ribbon will use proper client-side load balancing instead of relying on the Kubernetes `Service` abstraction to do that.
@geoand thanks for the clarification, that makes more sense, but it is still a hack that might lead to problems in the future. The more I think about the problem, the more I think that, to be on the safe side, we should just add a single Endpoint with the Kubernetes Service name. If you need certificates, we should recommend Istio, which will also have problems if Ribbon tries to establish connections with pod hostnames.
@salaboy actually I think we could add that as an option, but I don't think it should be the default. It should be up to the user to choose whether or not they want real client side load balancing (in the sense that it's done by the library of the application), or pseudo client side load balancing which is what would happen if we simply use the (single) service name.
@geoand so maybe what we are talking here is to extend the Endpoint object to contain more information? In any case, I do agree with allowing people to set it up. So it sounds like a good first step to make the configuration of the Endpoints information configurable.
:+1:
I didn't mention it in the first place, but when I chose to use the service URL in the Feign client, it indeed meant giving up on Ribbon's load-balancing capability (configurable at the app level) and relying only on OpenShift's service load balancing (at the OpenShift service level).
@geoand @salaboy so, yes, I think you eventually managed to get to the real point here, thank you.
We are facing the same issue with TLS hostname verification as described in the first post.
It would be great if spring-cloud-kubernetes-discovery could optionally be configured so that `KubernetesServerList` and `KubernetesServiceInstance` use the pod hostname assigned by Kubernetes: `<pod-ip-dashed>.<namespace>.pod.cluster.local`. See also "DNS for Services and Pods".
We could successfully test the TLS hostname verification with a TLS certificate issued to `*.<namespace>.pod.cluster.local` and a patched `KubernetesServerList` which calculates the hostname with this code:

`address.getIp().replace('.', '-') + "." + namespace + ".pod.cluster.local";`
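Wrapped in a self-contained helper, that calculation looks like the sketch below (the class and method names are illustrative):

```java
public class PodDns {

    // Derives the pod's DNS name from its IP, as described above:
    // dots in the IP become dashes, then the pod DNS suffix is appended.
    static String podFqdn(String podIp, String namespace) {
        return podIp.replace('.', '-') + "." + namespace + ".pod.cluster.local";
    }

    public static void main(String[] args) {
        // One of the endpoint IPs from the `oc describe ep` output earlier
        System.out.println(podFqdn("172.17.0.4", "poc2"));
        // -> 172-17-0-4.poc2.pod.cluster.local
    }
}
```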
The behaviour should be configurable with a property, and the default should still use the IP address (similar to `preferIpAddress` in Eureka).
I work with a service that uses a flavor of Curator leader election, which works by associating (and persisting) job IDs to server IPs. We use a StatefulSet to ensure local state (persistent volume) is always associated with the same Pod. This works fine, except that Pod IPs change, and when this happens incoming requests to jobs (getStatus etc.) fail. If the DiscoveryClient could instead use the full hostname of a Pod, then the problem would be solved. Happy to submit a PR for this.
Before using Spring Cloud Kubernetes we were using Eureka + Netflix Ribbon for service discovery and client-side load balancing. The worst part of that stack was the ~30-second delay for Ribbon to update its reference for new or removed pods in our cluster. For example, when a pod is updated and gets to the ready status, Kubernetes automatically starts to terminate the old pod. In this situation our microservices will try to call the old pod for around 30 seconds before knowing there's a new pod in town, due to the Ribbon + Eureka refresh delay.
When we changed to Spring Cloud Kubernetes (at the time with Kubernetes Ribbon support), we were able to achieve zero downtime during pod updates, because we set `spring.cloud.kubernetes.ribbon.mode` to `SERVICE`, which allowed us to use Kubernetes-native service discovery.
After we updated to Hoxton.SR5 we lost Kubernetes-native service discovery because, as reported in this thread, `KubernetesDiscoveryClient` works with endpoints (pod IPs) and not with FQDNs anymore. The documentation explains how to use FQDNs, but I don't think that will work without a service mesh and/or traditional Feign clients with a serviceId.
In order to test how fast `KubernetesDiscoveryClient` was, we scaled a pod from 1 to 2, and pod 2 started to receive requests a little while (~5 seconds) after it was ready. This is acceptable in our application because there's no downtime. The problem occurred when we scaled back from 2 to 1: when pod 2 was terminated, one of our requests took around 6 seconds before failing because the pod was unreachable. This wouldn't happen if Feign used the FQDN to send requests to that pod.
@spencergibb is there a way to use Spring Cloud Kubernetes with FQDNs using traditional Feign clients with a serviceId, without relying on Istio or other service meshes?
Best regards, Rafael Pacheco.
I believe @OlgaMaciaszek is working on spring cloud loadbalancer support which should have a similar mode to the ribbon integration.
If you don't have an issue with building a custom `RouteDefinitionLocator` (which could be more efficient) and need a solution now, you could use something like this:
https://gist.github.com/AndreasKl/7d18ce6807c366af6c97c62d126170e1
Our config is:

`spring.cloud.gateway.discovery.locator.url-expression: "'http://'+serviceId+':'+port"`

As our public services are in the default namespace, we can use this simple expression. We also disabled (and rolled our own) `KubernetesCatalogWatch` as this was not sufficient for our use case.
I liked your solution and it could work for me too. But in my use case I have one more variable: the version of each service.
Each application can have multiple versions on the same environment. Each version has its own kubernetes service with a different name.
Example:
Example: environment `staging`, application `auth`.
The `auth` application on the staging environment will have a service named `auth-staging`. If I'm developing some new version of this application and don't want to change the staging state, I will deploy it with a specific name, for example `auth-itsm-323` (which is the Jira issue for this feature). It helps our team control the environment and test new features individually before sending them to staging as the default version.
Today I use Consul for service discovery, where each pod registers itself with a label specifying its Kubernetes service FQDN (we are not using Kubernetes service discovery right now).
With your solution I can move my service discovery to Kubernetes to automatically discover the FQDN, removing the need for a label on each pod, and throw Consul away. I will also need to create new labels because I have many namespaces and they should be used in the FQDN. It would work, nice!
The problem is with the Feign clients. How can I enable the same Feign client to use a different serviceId for the same microservice? Can I dynamically configure the serviceId based on a Spring property, for example `foo.bar.auth.version=itsm-323`?
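For what it's worth, Spring resolves `${key:default}` placeholders in `@FeignClient(name = ...)` from the environment before the client is built, so a per-application property can select the serviceId. A small JDK-only sketch of that resolution mechanic (this is an illustration of the placeholder behaviour, not Spring's actual implementation; the property name is taken from the question above):

```java
import java.util.Map;

public class PlaceholderSketch {

    // Resolves a single "${key:default}" placeholder the way Spring's
    // property placeholder resolution behaves: use the property value
    // if present, otherwise fall back to the default after the colon.
    static String resolve(String placeholder, Map<String, String> props) {
        String inner = placeholder.substring(2, placeholder.length() - 1);
        int colon = inner.indexOf(':');
        String key = colon < 0 ? inner : inner.substring(0, colon);
        String def = colon < 0 ? null : inner.substring(colon + 1);
        String value = props.get(key);
        return value != null ? value : def;
    }

    public static void main(String[] args) {
        String name = "${foo.bar.auth.service-id:auth-staging}";
        // Without the property set, the default staging service is used...
        System.out.println(resolve(name, Map.of()));
        // ...with it set, the feature-branch service is targeted instead.
        System.out.println(resolve(name, Map.of("foo.bar.auth.service-id", "auth-itsm-323")));
    }
}
```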
Sorry if I'm misunderstanding something here. I'm trying different solutions for my use case because I'm using the discovery service only in the gateway. All internal communication between the microservices (using Feign) is based on the Kubernetes service, which I declare manually using a Spring property in our config server.
Example Feign client:

```java
@FeignClient(
    contextId = "ConnectionApi",
    name = "connection-service",
    url = "${foo.bar.connection-service-host:\"\"}")
public interface ConnectionApi {
}
```
Config server:

```json
{
  "foo.bar.connection-service-host" : "http://connection-service-staging.namespace.svc:PORT"
}
```
If I need a specific application to use a different version of the connection-service, I just change it in its configuration. As we scale the number of services (today we already have more than 30), this is manually exhausting and doesn't make sense, because technically we have service discovery to do the job for us.
What can I do to use FeignClients using the FQDN and have dynamic configurations?
Thank you.
There appears to have been a mixup, Ribbon should have never been removed from the Hoxton Release Train. I will add it back and we will get a release out ASAP. Sorry for the mixup!
Both Ribbon (up to Hoxton) and SC LoadBalancer are integrated with SC Kubernetes now. Closing the issue.
In `org.springframework.cloud.kubernetes.ribbon.KubernetesServerList`, when populating the server list, is there a particular reason to create `Server` instances using the `EndpointAddress` IP address instead of the `EndpointAddress` hostname?

Why this question: well, in an OpenShift cluster, let's say I have two OpenShift services. Each service load-balances to one or more of its backing pods. Service A is the ingress service and is exposed to the outside world via a dedicated route. Service A internally communicates with service B (through an OpenFeign/Ribbon client along with Hystrix). Communication between the outside world and service A can be secured through a TLS termination configured at the OpenShift service level, but that's not the point here.

Now, let's say I would like to provide secure TLS communication between service A and service B (I am talking about cluster-internal communication here). To do so (and without speaking about solutions like Istio or other service meshes), I have updated service B's Spring Boot app config so that the REST service in service B is now served over TLS on port 8443 and presents a server certificate signed by some CA. Now, when my Spring Boot webapp in service A needs to communicate with the now-secured REST service in service B, an SSL handshake occurs in between. During this handshake, standard checks occur, namely server certificate name validation: the server certificate name (CN) should match the hostname used to connect to pod B. That's where I get into trouble: in order for my webapp (pod/service A) to validate the certificate received from the REST service (pod/service B), the webapp should first have connected to pod B using an FQDN, not its internal IP. I mean: in an OpenShift cluster, pod IP addresses are dynamically allocated, and certificate CN validation cannot be done against a dynamic IP. In my certificate signing request, I can specify a CN like `*.<namespace>.endpoints.cluster.local` so that any hostname matching the `<namespace>.endpoints.cluster.local` domain will be validated at handshake. But, in my case, it would require the Kubernetes Ribbon to populate its server list with hostnames, not IP addresses. With the server list populated with IP addresses, the SSL handshake error below occurs:
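The failure mode described above can be illustrated with a simplified sketch of the wildcard rule applied during hostname verification (real verifiers, per RFC 6125, are stricter; this is only for illustration, and the class name is made up):

```java
public class CnMatchSketch {

    // Simplified single-label wildcard match: "*.<domain>" matches
    // "<one-label>.<domain>" but never a bare IP address, which is why
    // connecting to pods by IP fails CN validation.
    static boolean matches(String pattern, String host) {
        if (!pattern.startsWith("*.")) {
            return pattern.equalsIgnoreCase(host);
        }
        String domain = pattern.substring(2);
        if (!host.toLowerCase().endsWith("." + domain.toLowerCase())) {
            return false;
        }
        // The part of the host before the domain must be exactly one label.
        String prefix = host.substring(0, host.length() - domain.length() - 1);
        return !prefix.isEmpty() && !prefix.contains(".");
    }

    public static void main(String[] args) {
        String cn = "*.poc2.endpoints.cluster.local";
        // A pod hostname inside the domain matches the wildcard CN...
        System.out.println(matches(cn, "172-17-0-4.poc2.endpoints.cluster.local"));
        // ...but a dynamically allocated pod IP never can.
        System.out.println(matches(cn, "172.17.0.4"));
    }
}
```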