Closed mash-graz closed 4 years ago
after spending another day on this particular issue, i think it's mostly related to the question of whether gloo is indeed able to use the traditional kubernetes ingress-proxy and its more advanced gateway-proxy at the same time, because cert-manager unfortunately utilizes only the first mentioned mechanism.
the gloo documentation states that both handlers can be installed at the same time (e.g. here: "If you want to take advantage of greater routing capabilities of Gloo, you should look at Gloo in gateway mode, which complements Gloo’s Ingress support, i.e., you can use both modes together in a single cluster."), but in practice this doesn't seem to work on my simple k3s setup, which is running in a docker-compose environment, using --network=host and bootstrapped only by utilizing simple helm charts and other static yaml fragments in the manifests folder.
in the real world one of the two proxies always stays in a <pending> condition, and i can only reach either the letsencrypt related secret handled by the ingress-proxy or the main web content published via the gateway-proxy from the outside.
maybe it's somehow caused by other troubles and insufficient setups on my side, which very likely concern loadbalancer related requirements.
because i disabled the default (traefik1 based) ingress solution of k3s and want to replace it by gloo, k3s doesn't deploy its loadbalancer, too. that's in fact an intended behavior, because i could otherwise hardly use the more advanced TCP proxying features of gloo, but maybe it could interfere with other expectations and requirements of gloo.
i would be really happy if you could give me some hints on how to overcome these troubles and find a working solution in the context of the described minimalist setup.
if you need any additional information to debug and understand the issue, don't hesitate to ask. i'm really happy if i'm able to cooperate and find a solution, which may be useful for others as well.
thanks!
Thanks @mash-graz for all the detail. Let me ask the team and get back.
i believe you want to install gloo once with the following helm values:
gateway:
  enabled: true
ingress:
  enabled: true
Which it sounds like you have already done. Now, to figure out why both aren't routeable at once, run glooctl check for me to track down any erroneous config.
Please share the results of glooctl check with each proxy pending. Additionally, which version are you on?
sorry for my late reply -- we are working really hard these days to maintain and expand video services for users in the local art scene, but sometimes i simply need a break and a little bit of sleep ;)
i believe you want to install gloo once with the following helm values:
gateway:
  enabled: true
ingress:
  enabled: true
yes -- in fact i use this yaml-fragment for the automated rollout, which only explicitly sets ingress, because gateway is already enabled by default:
---
apiVersion: v1
kind: Namespace
metadata:
  name: gloo-system
  labels:
    name: gloo-system
---
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: gloo
  namespace: kube-system
spec:
  chart: gloo
  targetNamespace: gloo-system
  repo: https://storage.googleapis.com/solo-public-helm
  set:
    ingress.enabled: "true"
Which it sounds like you have already done. Now, to figure out why both aren't routeable at once, run glooctl check for me to track down any erroneous config.
ms@kaffee:~/ms_kube$ glooctl check
Checking deployments... OK
Checking pods... Pod svclb-gateway-proxy-jmn6t in namespace gloo-system is not yet scheduled! Message: 0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports.
Problems detected!
this message from glooctl sounds rather plausible, but on the other hand it's a more or less inevitable conflict, because cert-manager has to claim the same port and IP as the main webserver, otherwise it wouldn't be accepted to answer the challenges resp. validate certificate requests for this particular machine. unfortunately this kind of common helper often requires traditional Ingress support and cannot be modified to utilize the more advanced gloo gateway mechanism.
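For context, cert-manager's HTTP01 solver creates a temporary Ingress for each challenge, which is why it depends on the ingress-proxy rather than the gateway-proxy. A minimal sketch of such an issuer follows, assuming the `cert-manager.io/v1` API and that Gloo's ingress controller handles Ingresses of class `gloo`. The issuer name, e-mail address, and secret name are hypothetical and not taken from this thread.

```yaml
# Sketch only: a cert-manager ClusterIssuer using the HTTP01 challenge
# through the traditional Ingress mechanism.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging            # hypothetical name
spec:
  acme:
    # Let's Encrypt staging endpoint, suited for testing
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: admin@example.com           # hypothetical address
    privateKeySecretRef:
      name: letsencrypt-staging-key    # hypothetical secret name
    solvers:
      - http01:
          ingress:
            # Assumption: the solver Ingress is picked up by Gloo's
            # ingress controller via the "gloo" ingress class.
            class: gloo
```

The challenge must be reachable on port 80 of the public address, which is the same port the gateway-proxy service requests in this setup.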
if you see any chance, how to e.g. chain both modules in a more compatible manner or some other way to work around this flaw, i would be really happy!
btw. i also had to disable the ssl-section on the gateway side, because gloo doesn't accept a requested but not yet validated letsencrypt certificate. that's in fact another very unpleasant behavior, because it's hard to work around this fundamental hindrance on the way to working letsencrypt based TLS handling in an automated rollout without manual intervention.
ms@kaffee:~/ms_kube$ glooctl get proxy
+---------------+-----------+---------------+----------+
| PROXY | LISTENERS | VIRTUAL HOSTS | STATUS |
+---------------+-----------+---------------+----------+
| gateway-proxy | :::1935 | 1 | Accepted |
| | :::8080 | | |
| | :::8443 | | |
| ingress-proxy | :::80 | 2 | Accepted |
+---------------+-----------+---------------+----------+
ms@kaffee:~/ms_kube$ glooctl get virtualservice
+-----------------+--------------+---------+------+----------+-----------------+-------------------------------------+
| VIRTUAL SERVICE | DISPLAY NAME | DOMAINS | SSL | STATUS | LISTENERPLUGINS | ROUTES |
+-----------------+--------------+---------+------+----------+-----------------+-------------------------------------+
| video-server | | * | none | Accepted | | /live -> |
| | | | | | | gloo-system.default-video-server-80 |
| | | | | | | (upstream) |
+-----------------+--------------+---------+------+----------+-----------------+-------------------------------------+
ms@kaffee:~/ms_kube$ kubectl get all -n gloo-system
NAME READY STATUS RESTARTS AGE
pod/svclb-gateway-proxy-jmn6t 0/2 Pending 0 2d
pod/svclb-ingress-proxy-zp5fl 2/2 Running 0 2d
pod/gloo-5957f474-6hqrg 1/1 Running 0 2d
pod/ingress-proxy-68b5c957f9-xxdkj 1/1 Running 0 2d
pod/gateway-proxy-5bb4c8f9b7-c2jpl 1/1 Running 0 2d
pod/gateway-5b48d7fc4d-jt2vf 1/1 Running 0 2d
pod/discovery-5bf9b4489f-zfsnf 1/1 Running 0 2d
pod/ingress-66496f667-pl2wn 1/1 Running 0 2d
pod/cm-acme-http-solver-29rlq 1/1 Running 0 2d
pod/cm-acme-http-solver-jvhdm 1/1 Running 0 2d
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/gateway ClusterIP 10.43.204.58 <none> 443/TCP 2d
service/gloo ClusterIP 10.43.24.249 <none> 9977/TCP,9988/TCP,9966/TCP,9979/TCP 2d
service/gateway-proxy LoadBalancer 10.43.214.135 <pending> 80:31375/TCP,443:32187/TCP 2d
service/ingress-proxy LoadBalancer 10.43.31.36 xx.xx.xx.xx 80:31554/TCP,443:30811/TCP 2d
service/cm-acme-http-solver-k4z79 NodePort 10.43.184.183 <none> 8089:31323/TCP 2d
service/cm-acme-http-solver-67m89 NodePort 10.43.55.184 <none> 8089:30480/TCP 2d
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/svclb-gateway-proxy 1 1 0 1 0 <none> 2d
daemonset.apps/svclb-ingress-proxy 1 1 1 1 1 <none> 2d
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/gloo 1/1 1 1 2d
deployment.apps/ingress-proxy 1/1 1 1 2d
deployment.apps/gateway-proxy 1/1 1 1 2d
deployment.apps/gateway 1/1 1 1 2d
deployment.apps/discovery 1/1 1 1 2d
deployment.apps/ingress 1/1 1 1 2d
NAME DESIRED CURRENT READY AGE
replicaset.apps/gloo-5957f474 1 1 1 2d
replicaset.apps/ingress-proxy-68b5c957f9 1 1 1 2d
replicaset.apps/gateway-proxy-5bb4c8f9b7 1 1 1 2d
replicaset.apps/gateway-5b48d7fc4d 1 1 1 2d
replicaset.apps/discovery-5bf9b4489f 1 1 1 2d
replicaset.apps/ingress-66496f667 1 1 1 2d
@mash-graz can you drop into our slack to get a quicker turnaround on feedback and debugging with our team?
you'll probably need to resolve the pod port conflict, not sure which one is giving you troubles but you can change the gateway proxy ports using helm values:
| helm value | type | default | description |
|---|---|---|---|
| gatewayProxies.NAME.podTemplate.httpPort | int | | HTTP port for the gateway service |
| gatewayProxies.NAME.podTemplate.httpsPort | int | | HTTPS port for the gateway service |
Full list of helm values here: https://docs.solo.io/gloo/latest/reference/helm_chart_values/
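As a rough illustration of the suggestion above, the two port values could be set alongside the gateway and ingress switches in one values file. This is a sketch under assumptions: `gatewayProxy` is assumed to be the proxy NAME in use, and the port numbers are hypothetical placeholders. Whether changing them clears the "didn't have free ports" message in this k3s setup is not confirmed in the thread.

```yaml
# Sketch only: helm values enabling both modes and overriding the
# gateway proxy ports named in the table above.
gateway:
  enabled: true
ingress:
  enabled: true
gatewayProxies:
  gatewayProxy:          # assumption: replace with your proxy NAME
    podTemplate:
      httpPort: 8081     # hypothetical value
      httpsPort: 8444    # hypothetical value
```

After applying the values, `glooctl check` and `kubectl get all -n gloo-system` from earlier in the thread show whether both svclb pods get scheduled.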
We just merged in a docs update to clarify the cert-manager docs, it should be live soon.
thanks -- that's really helpful!
in the meantime i had to switch my whole setup to traefik2.2, because i couldn't figure out a solution to this issue, but i will give it a try as soon as i find some spare time to revert this rather complex chain of necessary modifications again.
if your advice works, i would definitely prefer to utilize gloo, because it's IMHO the more efficient solution and significantly better integrated in the kubernetes ecosystem (e.g. in traefik2.2 you need some 'static' endpoint configurations for port forwarding, which cannot be changed in a more delegated manner by other service manifests and without a restart...), but on the other hand it's a really seductive kind of comfort, how effortlessly the integrated automatic letsencrypt handling and the more open authentication middleware capabilities work in this competing solution. it's really hard to decide which of the two should be seen as the more suitable solution for one's specific demands.
but thanks again for your help!
Reviewing the latest doc on cert_manager integration at https://docs.solo.io/gloo/latest/guides/integrations/cert_manager/, it still does not show ACME HTTP01 support.
Is this the right page, or did the merge go to some other doc space?
@linecolumn they've just added it https://github.com/solo-io/gloo/commit/cdd8daac83dfc4a9922169717d8d169c91f77c67
It shows that it is possible, but it doesn't look automatic at all. Wondering if there is a way to automate this (like cert-manager works with Nginx ingress for example) so that it can be called an "integration".
As mentioned, the docs that outline how to integrate with the HTTP01 challenge are live.
The process is admittedly a bit manual, we should have a clearer picture of the path forward based off https://github.com/solo-io/gloo/issues/2993.
After some discussion internally, we will use https://github.com/solo-io/gloo/issues/2993 to track adding an automatic integration with ACME HTTP01 and will close this issue for now.
could you please add a small hint for users in the integration section of the manual on how to realize HTTP01 letsencrypt validation with cert-manager and gloo?
it's a little bit disappointing if only DNS validation is documented, and honestly i could neither figure out a working solution for the HTTP01 variant nor find any useful gloo specific descriptions about this topic anywhere on the net.
is it at all compatible with gloo's gateway mode?