jetstack / kube-lego

DEPRECATED: Automatically request certificates for Kubernetes Ingress resources from Let's Encrypt
Apache License 2.0

GCE kube-lego-gce service created without selector #143

Open pavel-kurnosov opened 7 years ago

pavel-kurnosov commented 7 years ago

As far as I understand, the service that kube-lego creates should have the selector app: kube-lego, but instead it has an empty selector. I think that is why the ingress reports the backend as unhealthy, and why the reachability test cannot pass (it fails with 502). The ingress shows my routes correctly and points to the proper service, but since the service does not point to any pod, it is unhealthy, if I understand correctly. Do you know what the problem could be?

gianrubio commented 7 years ago

@pavel-kurnosov could you share the service created by kube-lego?

$ kubectl get svc kube-lego -n your-namespace -o yaml

pavel-kurnosov commented 7 years ago

@gianrubio Yes. I don't have a kube-lego service; instead I have kube-lego-gce:

$ kubectl get svc kube-lego-gce -n aispot -o yaml

apiVersion: v1
kind: Service
metadata:
  annotations:
    kubernetes.io/kube-lego-managed: "true"
  creationTimestamp: 2017-04-08T23:23:48Z
  name: kube-lego-gce
  namespace: aispot
  resourceVersion: "1161519"
  selfLink: /api/v1/namespaces/aispot/services/kube-lego-gce
  uid: 6b3b87b4-1cb2-11e7-bcb4-42010af00040
spec:
  clusterIP: 10.3.249.69
  ports:
  - nodePort: 30957
    port: 8080
    protocol: TCP
    targetPort: 8080
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}

gianrubio commented 7 years ago

That's weird: the selector is hardcoded, so it should always be created.

What happens if you delete this service and restart kube-lego (it'll recreate the service)? What version are you using?

pavel-kurnosov commented 7 years ago

First, I installed it using Helm. After that I removed everything and installed it manually, first with the latest version and then with the canary build. Yes, when I remove the service and restart kube-lego, it recreates the same service. The problem is that when I update the service manually, it is replaced again after some time with the wrong version ((

gianrubio commented 7 years ago

Sorry, I've only just realised that you're using GCE, so the svc is created with an empty selector.

Maybe this issue can help you. If not, please enable debug logging (LEGO_LOG_LEVEL=debug) and share the kube-lego logs.

pavel-kurnosov commented 7 years ago

Yes, I have seen that issue, and I already added /* to the root path. Here are the logs:

2017-04-10T10:24:48.214082666Z time="2017-04-10T10:24:48Z" level=info msg="kube-lego 0.1.4-dev-aa96e546 starting" context=kubelego 
2017-04-10T10:24:48.249273174Z time="2017-04-10T10:24:48Z" level=info msg="connected to kubernetes api v1.6.0" context=kubelego 
2017-04-10T10:24:48.250140182Z time="2017-04-10T10:24:48Z" level=info msg="server listening on http://:8080/" context=acme 
2017-04-10T10:24:48.373518556Z time="2017-04-10T10:24:48Z" level=info msg="disable provider no TLS hosts found" context=provider provider=nginx 
2017-04-10T10:24:48.373658478Z time="2017-04-10T10:24:48Z" level=info msg="process certificate requests for ingresses" context=kubelego 
2017-04-10T10:24:48.376671453Z time="2017-04-10T10:24:48Z" level=info msg="creating new secret" context=secret name=api-dev-secret namespace=aispot 
2017-04-10T10:24:48.376804533Z time="2017-04-10T10:24:48Z" level=info msg="no cert associated with ingress" context="ingress_tls" name=routes-ing namespace=aispot 
2017-04-10T10:24:48.376903030Z time="2017-04-10T10:24:48Z" level=info msg="requesting certificate for mydomain.no" context="ingress_tls" name=routes-ing namespace=aispot 
2017-04-10T10:25:58.903316029Z time="2017-04-10T10:25:58Z" level=warning msg="authorization failed after 1m0s: reachability test failed: wrong status code '502'" context=acme domain=mydomain.no 
2017-04-10T10:25:58.903576316Z time="2017-04-10T10:25:58Z" level=error msg="Error while processing certificate requests: no domain could be authorized successfully" context=kubelego 
2017-04-10T10:25:59.023897111Z time="2017-04-10T10:25:59Z" level=info msg="disable provider no TLS hosts found" context=provider provider=nginx 
2017-04-10T10:25:59.025193047Z time="2017-04-10T10:25:59Z" level=info msg="process certificate requests for ingresses" context=kubelego 
2017-04-10T10:25:59.029105897Z time="2017-04-10T10:25:59Z" level=info msg="creating new secret" context=secret name=api-dev-secret namespace=aispot 
2017-04-10T10:25:59.029261295Z time="2017-04-10T10:25:59Z" level=info msg="no cert associated with ingress" context="ingress_tls" name=routes-ing namespace=aispot 
2017-04-10T10:25:59.029387719Z time="2017-04-10T10:25:59Z" level=info msg="requesting certificate for mydomain.no" context="ingress_tls" name=routes-ing namespace=aispot 
2017-04-10T10:27:09.500504463Z time="2017-04-10T10:27:09Z" level=warning msg="authorization failed after 1m0s: reachability test failed: wrong status code '502'" context=acme domain=mydomain.no 
2017-04-10T10:27:09.500702265Z time="2017-04-10T10:27:09Z" level=error msg="Error while processing certificate requests: no domain could be authorized successfully" context=kubelego 
2017-04-10T10:27:09.531657749Z time="2017-04-10T10:27:09Z" level=info msg="disable provider no TLS hosts found" context=provider provider=nginx 
2017-04-10T10:27:09.531837820Z time="2017-04-10T10:27:09Z" level=info msg="process certificate requests for ingresses" context=kubelego 
2017-04-10T10:27:09.534477379Z time="2017-04-10T10:27:09Z" level=info msg="creating new secret" context=secret name=api-dev-secret namespace=aispot 
2017-04-10T10:27:09.534608592Z time="2017-04-10T10:27:09Z" level=info msg="no cert associated with ingress" context="ingress_tls" name=routes-ing namespace=aispot 
2017-04-10T10:27:09.534729052Z time="2017-04-10T10:27:09Z" level=info msg="requesting certificate for mydomain.no" context="ingress_tls" name=routes-ing namespace=aispot
2017-04-10T10:28:37.288533812Z time="2017-04-10T10:28:37Z" level=warning msg="authorization failed after 1m0s: reachability test failed: wrong status code '502'" context=acme domain=mydomain.no 
2017-04-10T10:28:37.288796524Z time="2017-04-10T10:28:37Z" level=error msg="Error while processing certificate requests: no domain could be authorized successfully" context=kubelego 
2017-04-10T10:28:37.318681250Z time="2017-04-10T10:28:37Z" level=info msg="disable provider no TLS hosts found" context=provider provider=nginx 
2017-04-10T10:28:37.318863316Z time="2017-04-10T10:28:37Z" level=info msg="process certificate requests for ingresses" context=kubelego 
2017-04-10T10:28:37.321200281Z time="2017-04-10T10:28:37Z" level=info msg="creating new secret" context=secret name=api-dev-secret namespace=aispot 
2017-04-10T10:28:37.321298796Z time="2017-04-10T10:28:37Z" level=info msg="no cert associated with ingress" context="ingress_tls" name=routes-ing namespace=aispot 
2017-04-10T10:28:37.321393129Z time="2017-04-10T10:28:37Z" level=info msg="requesting certificate for mydomain.no" context="ingress_tls" name=routes-ing namespace=aispot

gianrubio commented 7 years ago

@pavel-kurnosov these logs are not verbose. Are you using RBAC in k8s 1.6?

pavel-kurnosov commented 7 years ago

No, I'm not. As far as I can see, it is still in beta and has to be enabled manually, which I didn't do.

gianrubio commented 7 years ago

@pavel-kurnosov I'll bootstrap a GCE k8s 1.6.0 cluster this week to figure out what is happening. The gce provider was not detected by kube-lego: most of the log lines have provider=nginx, but it should be provider=gce.
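
To check which providers actually show up in a log dump like the one above, the provider= field can be tallied per line. This is a minimal sketch, assuming the logs are in the key=value format shown in this thread; parse_logfmt and provider_counts are hypothetical helper names, not part of kube-lego.

```python
import shlex
from collections import Counter


def parse_logfmt(line: str) -> dict:
    """Parse one key=value log line into a dict, honouring double quotes."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip bare tokens such as the leading container timestamp
            fields[key] = value
    return fields


def provider_counts(lines) -> Counter:
    """Count how often each provider= value appears in the given log lines."""
    counts = Counter()
    for line in lines:
        provider = parse_logfmt(line).get("provider")
        if provider is not None:
            counts[provider] += 1
    return counts


# Two lines copied from the log excerpt in this thread.
sample = [
    '2017-04-10T10:24:48.373518556Z time="2017-04-10T10:24:48Z" level=info '
    'msg="disable provider no TLS hosts found" context=provider provider=nginx',
    '2017-04-10T10:24:48.373658478Z time="2017-04-10T10:24:48Z" level=info '
    'msg="process certificate requests for ingresses" context=kubelego',
]
print(provider_counts(sample))  # Counter({'nginx': 1})
```

If gce never appears in the tally, that would be consistent with the provider not being picked up.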

pavel-kurnosov commented 7 years ago

Okay, I see. Can I use a configuration property to force kube-lego to use gce?

pavel-kurnosov commented 7 years ago

Okay, I found a way to make it work for now:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kube-lego
  namespace: kube-lego
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: kube-lego
    spec:
      containers:
      - name: kube-lego
        image: jetstack/kube-lego:canary
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
        env:
        - name: LEGO_EMAIL
          valueFrom:
            configMapKeyRef:
              name: kube-lego
              key: lego.email
        - name: LEGO_LOG_LEVEL
          valueFrom:
            configMapKeyRef:
              name: kube-lego
              key: lego.log_level
        - name: LEGO_DEFAULT_INGRESS_CLASS
          valueFrom:
            configMapKeyRef:
              name: kube-lego
              key: lego.default_class
        - name: LEGO_SUPPORTED_INGRESS_CLASS
          valueFrom:
            configMapKeyRef:
              name: kube-lego
              key: lego.default_support
        - name: LEGO_URL
          valueFrom:
            configMapKeyRef:
              name: kube-lego
              key: lego.url
        - name: LEGO_NAMESPACE
          valueFrom:
            configMapKeyRef:
              name: kube-lego
              key: lego.namespace
        - name: LEGO_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
          timeoutSeconds: 1

And the ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-lego
  namespace: kube-lego
data:
  # modify this to specify your address
  lego.email: "my@emai.com"

  # configure Let's Encrypt's staging API; for prod use https://acme-v01.api.letsencrypt.org/directory
  lego.url: "https://acme-staging.api.letsencrypt.org/directory"

  # specify the default ingress class explicitly
  lego.default_class: "gce"

  # specify the supported ingress class explicitly
  lego.default_support: "gce"

  lego.log_level: "debug"

  # specify the namespace explicitly
  lego.namespace: "myNamespace"

After specifying all this config and restarting, it took about 5 minutes for all the updates to propagate; then I got my first certificate and everything was fine. I hope this helps someone, and helps you, @gianrubio, to debug :)

pavel-kurnosov commented 7 years ago

Okay, today after updating the ingress (adding a new host), kube-lego didn't manage to start properly:

2017-04-11T21:24:29.677958774Z time="2017-04-11T21:24:29Z" level=debug msg="setting up svc endpoint" context=provider namespace=aispot pod_ip=10.0.0.51 provider=gce 
2017-04-11T21:24:29.716822971Z time="2017-04-11T21:24:29Z" level=debug msg=reset context=provider provider=nginx 
2017-04-11T21:24:29.716954340Z time="2017-04-11T21:24:29Z" level=debug msg=finalize context=provider provider=nginx 
2017-04-11T21:24:29.722592832Z time="2017-04-11T21:24:29Z" level=info msg="disable provider no TLS hosts found" context=provider provider=nginx 
2017-04-11T21:24:29.722689144Z time="2017-04-11T21:24:29Z" level=info msg="process certificate requests for ingresses" context=kubelego 
2017-04-11T21:24:29.726964968Z time="2017-04-11T21:24:29Z" level=info msg="cert does not cover all domains" context="ingress_tls" domains=[domain domain1] name=routes-ing namespace=aispot 
2017-04-11T21:24:29.727076126Z time="2017-04-11T21:24:29Z" level=info msg="requesting certificate for domain,domain1" context="ingress_tls" name=routes-ing namespace=aispot 
2017-04-11T21:24:29.970108028Z time="2017-04-11T21:24:29Z" level=debug msg="testing reachability of http://domain1/.well-known/acme-challenge/_selftest" context=acme domain=domain1 
2017-04-11T21:24:29.970879320Z time="2017-04-11T21:24:29Z" level=debug msg="testing reachability of http://domain/.well-known/acme-challenge/_selftest" context=acme domain=domain 
2017-04-11T21:24:29.988963579Z time="2017-04-11T21:24:29Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain 
2017-04-11T21:24:29.990280847Z time="2017-04-11T21:24:29Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain1 
2017-04-11T21:24:30.456037337Z time="2017-04-11T21:24:30Z" level=debug msg="testing reachability of http://domain/.well-known/acme-challenge/_selftest" context=acme domain=domain 
2017-04-11T21:24:30.489155406Z time="2017-04-11T21:24:30Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain 
2017-04-11T21:24:30.657360007Z time="2017-04-11T21:24:30Z" level=debug msg="testing reachability of http://domain1/.well-known/acme-challenge/_selftest" context=acme domain=domain1 
2017-04-11T21:24:30.667848864Z time="2017-04-11T21:24:30Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain1 
2017-04-11T21:24:31.349074046Z time="2017-04-11T21:24:31Z" level=debug msg="testing reachability of http://domain1/.well-known/acme-challenge/_selftest" context=acme domain=domain1 
2017-04-11T21:24:31.359231891Z time="2017-04-11T21:24:31Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain1 
2017-04-11T21:24:31.474874665Z time="2017-04-11T21:24:31Z" level=debug msg="testing reachability of http://domain/.well-known/acme-challenge/_selftest" context=acme domain=domain 
2017-04-11T21:24:31.484321840Z time="2017-04-11T21:24:31Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain 
2017-04-11T21:24:32.010579574Z time="2017-04-11T21:24:32Z" level=debug msg="testing reachability of http://domain1/.well-known/acme-challenge/_selftest" context=acme domain=domain1 
2017-04-11T21:24:32.021152806Z time="2017-04-11T21:24:32Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain1 
2017-04-11T21:24:32.886234324Z time="2017-04-11T21:24:32Z" level=debug msg="testing reachability of http://domain1/.well-known/acme-challenge/_selftest" context=acme domain=domain1 
2017-04-11T21:24:32.895511425Z time="2017-04-11T21:24:32Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain1 
2017-04-11T21:24:33.026831130Z time="2017-04-11T21:24:33Z" level=debug msg="testing reachability of http://domain/.well-known/acme-challenge/_selftest" context=acme domain=domain 
2017-04-11T21:24:33.037078202Z time="2017-04-11T21:24:33Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain 
2017-04-11T21:24:34.724697491Z time="2017-04-11T21:24:34Z" level=debug msg="testing reachability of http://domain1/.well-known/acme-challenge/_selftest" context=acme domain=domain1 
2017-04-11T21:24:34.735611358Z time="2017-04-11T21:24:34Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain1 
2017-04-11T21:24:34.962259067Z time="2017-04-11T21:24:34Z" level=debug msg="testing reachability of http://domain/.well-known/acme-challenge/_selftest" context=acme domain=domain 
2017-04-11T21:24:34.977988428Z time="2017-04-11T21:24:34Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain 
2017-04-11T21:24:36.773031143Z time="2017-04-11T21:24:36Z" level=debug msg="testing reachability of http://domain/.well-known/acme-challenge/_selftest" context=acme domain=domain 
2017-04-11T21:24:36.782294844Z time="2017-04-11T21:24:36Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain 
2017-04-11T21:24:37.694468243Z time="2017-04-11T21:24:37Z" level=debug msg="testing reachability of http://domain1/.well-known/acme-challenge/_selftest" context=acme domain=domain1 
2017-04-11T21:24:37.705378491Z time="2017-04-11T21:24:37Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain1 
2017-04-11T21:24:38.946813694Z time="2017-04-11T21:24:38Z" level=debug msg="testing reachability of http://domain/.well-known/acme-challenge/_selftest" context=acme domain=domain 
2017-04-11T21:24:38.976756629Z time="2017-04-11T21:24:38Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain 
2017-04-11T21:24:42.957553115Z time="2017-04-11T21:24:42Z" level=debug msg="testing reachability of http://domain1/.well-known/acme-challenge/_selftest" context=acme domain=domain1 
2017-04-11T21:24:42.974799906Z time="2017-04-11T21:24:42Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain1 
2017-04-11T21:24:47.283721915Z time="2017-04-11T21:24:47Z" level=debug msg="testing reachability of http://domain/.well-known/acme-challenge/_selftest" context=acme domain=domain 
2017-04-11T21:24:47.301028974Z time="2017-04-11T21:24:47Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain 
2017-04-11T21:24:47.386623540Z time="2017-04-11T21:24:47Z" level=debug msg="testing reachability of http://domain1/.well-known/acme-challenge/_selftest" context=acme domain=domain1 
2017-04-11T21:24:47.398377298Z time="2017-04-11T21:24:47Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain1 
2017-04-11T21:24:51.750171886Z time="2017-04-11T21:24:51Z" level=debug msg="testing reachability of http://domain/.well-known/acme-challenge/_selftest" context=acme domain=domain 
2017-04-11T21:24:51.761099114Z time="2017-04-11T21:24:51Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain 
2017-04-11T21:24:57.653118371Z time="2017-04-11T21:24:57Z" level=debug msg="testing reachability of http://domain1/.well-known/acme-challenge/_selftest" context=acme domain=domain1 
2017-04-11T21:24:57.671210390Z time="2017-04-11T21:24:57Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain1 
2017-04-11T21:24:59.314200868Z time="2017-04-11T21:24:59Z" level=debug msg="testing reachability of http://domain/.well-known/acme-challenge/_selftest" context=acme domain=domain 
2017-04-11T21:24:59.332454072Z time="2017-04-11T21:24:59Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain 
2017-04-11T21:25:15.555292971Z time="2017-04-11T21:25:15Z" level=debug msg="testing reachability of http://domain1/.well-known/acme-challenge/_selftest" context=acme domain=domain1 
2017-04-11T21:25:15.571910352Z time="2017-04-11T21:25:15Z" level=debug msg="error while authorizing: reachability test failed: wrong status code '502'" context=acme domain=domain1

ghost commented 7 years ago

I just added a PR to associate the app: kube-lego selector with the GCE service. This fixed the issue for me.

gianrubio commented 7 years ago

@simonswine could you explain why kube-lego creates the service with an empty selector when the provider is GCE?

Does #147 fix this issue?

simonswine commented 7 years ago

It is the only way to support kube-lego across namespaces with GCE ingress.

#147 probably only works if you run kube-lego in the same namespace as your app and the ingress object. It definitely breaks multi-namespace support for GCE ingress.

kube-lego manages the endpoints list for the kube-lego-gce service itself. To debug that behaviour:

kubectl describe svc kube-lego-gce and kubectl get endpoints kube-lego-gce -o yaml should provide more insight into what the problem is. I think the endpoint should be there (the logs show "setting up svc endpoint").

Are you running more than one instance of kube-lego in your cluster?

cf. https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/ (Empty selector)
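
The endpoint check suggested above can also be scripted. This is a minimal sketch, assuming the Endpoints object has been saved as JSON (for example with kubectl get endpoints kube-lego-gce -o json); endpoint_addresses is a hypothetical helper, and the sample object is illustrative, reusing the pod_ip value from the debug log in this thread.

```python
import json


def endpoint_addresses(endpoints: dict) -> list:
    """Return every ready IP listed in an Endpoints object (may be empty)."""
    ips = []
    for subset in endpoints.get("subsets") or []:
        for address in subset.get("addresses") or []:
            ips.append(address["ip"])
    return ips


# Illustrative input only: the shape follows the Endpoints API object,
# the values are not taken from a real cluster dump.
raw = """
{"kind": "Endpoints",
 "metadata": {"name": "kube-lego-gce", "namespace": "aispot"},
 "subsets": [{"addresses": [{"ip": "10.0.0.51"}],
              "ports": [{"port": 8080, "protocol": "TCP"}]}]}
"""
healthy = json.loads(raw)
empty = {"kind": "Endpoints", "metadata": {"name": "kube-lego-gce"}, "subsets": []}

print(endpoint_addresses(healthy))  # ['10.0.0.51']
print(endpoint_addresses(empty))    # []
```

An empty list means nothing is backing the service, whatever its selector says.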

ghost commented 7 years ago

The namespace question is interesting. I installed kube-lego using the yaml included in the project examples/gce/lego folder, which specifies the kube-lego namespace. I can see that the kube-lego deployment and pod are running in the kube-lego namespace. However the kube-lego-gce service is running in the default namespace. Our application services and ingress are all in the default namespace. We only have one instance of kube-lego running.

kubectl describe svc kube-lego-gce:

Name:              kube-lego-gce
Namespace:         default
Labels:            <none>
Annotations:       kubernetes.io/kube-lego-managed=true
Selector:          app=kube-lego
Type:              NodePort
IP:                10.7.246.163
Port:              <unset>  8080/TCP
NodePort:          <unset>  31677/TCP
Endpoints:         <none>
Session Affinity:  None
Events:            <none>

kubectl get endpoints kube-lego-gce -o yaml:

apiVersion: v1
kind: Endpoints
metadata:
  creationTimestamp: 2017-04-12T17:04:28Z
  name: kube-lego-gce
  namespace: default
  resourceVersion: "2597926"
  selfLink: /api/v1/namespaces/default/endpoints/kube-lego-gce
  uid: 16ea4244-1fa2-11e7-b4e1-42010a800035
subsets: []

With #147 the issue is corrected for this configuration.

munnerz commented 7 years ago

Hey @jessejohnston-isp,

The issue here is that service selectors do not work between namespaces. So if you add a selector to the kube-lego-gce service in the default namespace, it will match no pods, as kube-lego itself runs in the kube-lego namespace.

This is a known caveat with Ingress/services, and as far as I can tell not one they intend to 'fix'. You can see @thockin suggesting a workaround here, which is what kube-lego implements (manually managing Endpoints to point to the service/pod IP in another namespace).

#147 will fix the issue if and only if the kube-lego pod exists in the same namespace as the Ingress resource you have created. Otherwise, kube-controller-manager will get an empty list of pods when it queries the namespace of your service in order to create the Endpoints resource.

Hope I'm being clear enough here - it's not the easiest to explain! I'd imagine your PR is working for you because kube-lego is still manually creating the Endpoints resource for you, as it's intended to do (regardless of the presence of the selector). If you were to remove this line https://github.com/jessejohnston-isp/kube-lego/blob/039933e9108b6234a7aa1339654f6eb7a8c8ae61/pkg/provider/gce/gce.go#L96 (where the Endpoints resource is manually created), I think you'll find #147 no longer works cross namespace.
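
The namespace scoping described above can be modelled in a few lines. This is a simplified sketch of selector-based endpoint discovery, not Kubernetes or kube-lego code; Pod, selected_pod_ips, and the pod name are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    namespace: str
    ip: str
    labels: dict = field(default_factory=dict)


def selected_pod_ips(service_namespace: str, selector: dict, pods: list) -> list:
    """Model of selector-based endpoint discovery for one Service.

    Only pods in the Service's own namespace are candidates, and a Service
    without a selector gets no automatically managed endpoints at all.
    """
    if not selector:
        return []
    return [
        pod.ip
        for pod in pods
        if pod.namespace == service_namespace
        and all(pod.labels.get(k) == v for k, v in selector.items())
    ]


pods = [Pod("kube-lego-abc", "kube-lego", "10.0.0.51", {"app": "kube-lego"})]

# Service in "default", kube-lego pod in "kube-lego": the selector finds nothing.
print(selected_pod_ips("default", {"app": "kube-lego"}, pods))    # []
# Same selector, Service in the pod's own namespace: the pod is found.
print(selected_pod_ips("kube-lego", {"app": "kube-lego"}, pods))  # ['10.0.0.51']
```

In this model the only way to back a Service in another namespace is to write its Endpoints by hand, which matches the workaround described above.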

ghost commented 7 years ago

Hi @munnerz,

That absolutely makes sense...and in the kubernetes dashboard I see that the kube-lego-gce service still matches no pods. However, without #147 it did not work at all. We saw an endless stream of 502 errors from kube-lego because /.well-known/acme-challenge/_selftest was not reachable externally. My speculation, like that of @pavel-kurnosov, was that it was because kube-lego-gce was not selecting any pods, and therefore the request to acme could not be routed anywhere. After applying the PR the problem immediately resolved.
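
The self-test request that appears in kube-lego's debug log can be reproduced from outside the cluster. This is a minimal sketch, assuming the domain already resolves to the ingress; selftest_url and probe are hypothetical helper names, and the script only reports the HTTP status without judging the response body.

```python
import urllib.error
import urllib.request


def selftest_url(domain: str) -> str:
    """Build the self-test URL that appears in kube-lego's debug log."""
    return f"http://{domain}/.well-known/acme-challenge/_selftest"


def probe(domain: str, timeout: float = 5.0):
    """Return the HTTP status of the self-test URL, or None if unreachable."""
    try:
        with urllib.request.urlopen(selftest_url(domain), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 502 while the load balancer backend is unhealthy
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, or timeout


if __name__ == "__main__":
    import sys

    # Usage: python selftest.py mydomain.no
    for domain in sys.argv[1:]:
        print(domain, probe(domain))
```

A 502 from this probe corresponds to the "wrong status code '502'" failures in the logs.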

pavel-kurnosov commented 7 years ago

Personally, I think applying it to any namespace is a hack. To me, a namespace is a separate area, and no namespace should have access to another in any case. I don't mind having kube-lego in the same namespace to make it work, but because of the empty selector it will not work now.

As a workaround, could we have a property that adds the selector, so that we can control it?

pavel-kurnosov commented 7 years ago

@gianrubio Did you have a chance to check the new version on GCP?

tiktb8 commented 7 years ago

I'm also having the same issue. Using the deployment and config map noted earlier by @pavel-kurnosov I was able to get lego to deploy one cert. When I added my second service and ingress to GCE the backend was unreachable and thus no cert was obtained. I'm happy to help debug as I believe the kube-lego solution is more intuitive than the other option available.

sdbondi commented 7 years ago

Looking here: https://github.com/jetstack/kube-lego/blob/2d228d8a5779e4b231f5c5bc70c7ea8c0de1004a/pkg/provider/gce/gce.go#L90

I see that the gce component calls SetKubeLegoSpec, which sets the Selector, and then the code at L90 sets the Selector to an empty map.

UPDATE: I removed the above line and pushed the image to docker.io as fixate/kube-lego:1.0.6-dev-fixate, and the service seemed to work correctly.

PanJ commented 7 years ago

I'm also facing this problem and investigated it. It looks like it has to do with https://github.com/jetstack/kube-lego/issues/68 instead. The no-selector implementation is correct; the problem is that the endpoint is shown as UNHEALTHY.

hprotzek commented 7 years ago

Same problem here; I couldn't get it to work. I will try the workaround from #147.

jamatute commented 6 years ago

Same problem here, with GKE 1.7.8.

I've got kube-lego in one namespace, and I managed to deploy one service in another namespace. But when I tried to deploy a second service in a third namespace, it kept failing with 502 errors.

Thanks @PanJ, the "solution" of https://github.com/jetstack/kube-lego/issues/68 fixed it for me

anselms commented 6 years ago

@jamatute I seem to be having the same problem (the service in the second namespace I create uses the certificate created for the first service, which is issued to a different subdomain). Which "solution" fixed this problem for you? I could not work that out from issue #68.

jamatute commented 6 years ago

@anselms I manually created a health check for the desired endpoint and attached it to the kube-lego-gce backend.

After a couple of minutes the backend is marked as healthy and the certificate is issued.

You have to do it manually, so it's more of a workaround than a solution.

futuretec commented 6 years ago

Same issue with my GKE deployment with kube-lego in a separate 'kube-lego' namespace and my service/deployment in another namespace.

Is there any prospect of resolving this with a proper (and flexible) configuration, without manual workarounds?

mixth commented 6 years ago

Following @jamatute's suggestion, I managed to temporarily fix this by changing the path of kube-lego-gce's health check to '/healthz'. This lets the load balancer detect that kube-lego-gce is healthy and then route requests to it properly.

To find which health check belongs to kube-lego-gce, look for its port: the health check has the port number in its name.
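
Matching a health check to the service by port can be sketched as below. This assumes only what is stated above, that the name contains the service's nodePort; health_checks_for_port is a hypothetical helper, and the names in the example are made up for illustration.

```python
import re


def health_checks_for_port(names, node_port: int) -> list:
    """Return the names that contain node_port as a whole number.

    The digit boundaries stop port 3095 from matching inside 30957.
    """
    pattern = re.compile(rf"(?<!\d){node_port}(?!\d)")
    return [name for name in names if pattern.search(name)]


# Hypothetical names, shaped like "<prefix>-<nodePort>--<suffix>"; real names
# depend on the ingress controller and must be listed from the cloud project.
names = ["k8s-be-30957--abc123", "k8s-be-31677--abc123", "k8s-be-30095--abc123"]
print(health_checks_for_port(names, 30957))  # ['k8s-be-30957--abc123']
```

The nodePort itself can be read from the service, for example 30957 in the kube-lego-gce service shown earlier in this thread.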

meysholdt commented 6 years ago

I ran into this issue, too.

This is a known caveat with Ingress/services, and as far as I can tell not one they intend to 'fix'. You can see thockin suggesting a workaround here, which is what kube-lego implements (manually managing Endpoints to point to the service/pod IP in another namespace).

What's the best way to figure out if or why "manual Endpoint management" by kube-lego fails? In my case the service actually resolves to the wrong pod :(

adrienjoly commented 6 years ago

Just so you know, I just experienced that issue too, with version 1.6 of kube-lego.