gravitational / teleport

The easiest, and most secure way to access and protect all of your infrastructure.
https://goteleport.com
GNU Affero General Public License v3.0

goteleport doesn't work at ingress level #26130

Closed bittu664 closed 1 year ago

bittu664 commented 1 year ago

Hello team, I am using the Traefik ingress controller and want to expose Teleport through my Ingress, but when I apply my Ingress file I get a Bad Gateway error.

Screenshot 2023-05-12 at 7 07 32 PM

Here is my Ingress file:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: teleport-ui-ingress
  namespace: teleport-cluster
  annotations:
    #kubernetes.io/ingress.class: "traefik"
    kubernetes.io/tls-acme: "true"
    traefik.ingress.kubernetes.io/router.tls: "true"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    traefik.ingress.kubernetes.io/router.entrypoints: websecure

spec:
  ingressClassName: traefik
  tls:
    - hosts:
        - mydomain.com
      secretName: teleport-ui-tls
  rules:
    - host: mydomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: teleport-cluster
                port:
                  number: 443

Please provide a full guide.

webvictim commented 1 year ago

This generally happens because your ingress (Traefik) is sending an HTTP request to a backend service that speaks HTTPS.

I'm not a Traefik expert, but if you can make sure that it's sending HTTPS requests to the backend (and ignoring self-signed certificates) then it should work.

bittu664 commented 1 year ago

So which port number is correct for accessing the Teleport web UI? 443, right? Or something else? I can see that these are the only ports available.

Screenshot 2023-05-12 at 7 19 01 PM

Also, are there any docs on how to expose Teleport through an Ingress? The main requirement is that my web browser shows a valid SSL certificate with the green lock symbol. I need a proper solution.

webvictim commented 1 year ago

Yes, external port 443 is correct.

You should also set proxyListenerMode: multiplex in your chart values and do a helm upgrade to multiplex all traffic on port 443. That will remove the other ports from the service.

We are still working on documentation for setting up Teleport behind an ingress.

bittu664 commented 1 year ago

@webvictim I upgraded the Helm chart with --set proxyListenerMode=multiplex, but it didn't work. I am still facing the same issue.

webvictim commented 1 year ago

Did you figure out how to make Traefik send HTTPS requests to the backend?

bittu664 commented 1 year ago

Yes, by using this annotation: traefik.ingress.kubernetes.io/router.entrypoints: websecure

webvictim commented 1 year ago

I don't believe that's correct.

I went and read the Traefik documentation for you. Try adding this to your values and doing a helm upgrade:

annotations:
  service:
    traefik.ingress.kubernetes.io/service.serversscheme: https

MusicDin commented 1 year ago

Hi,

There are two ways to use Traefik in front of your Teleport cluster: TLS termination or TCP passthrough. I will try to explain both options in case someone needs this in the future.

TLS Termination

When using Traefik with TLS termination, you have to ensure that Traefik calls the backend over HTTPS. The usual problem is that even when Traefik is instructed to use the HTTPS scheme, it reports 500 Internal Server Error, because it cannot verify Teleport's self-signed certificate.

To make this work, you have to configure ServersTransport where you either disable TLS verification or provide Teleport's self-signed certificates.

apiVersion: traefik.io/v1alpha1
kind: ServersTransport
metadata:
  name: teleport-insecure-https
  namespace: teleport
spec:
  insecureSkipVerify: true

You can configure ingress with either a Kubernetes Ingress or a Traefik IngressRoute (custom resource).

IngressRoute

Using IngressRoute the configuration would look something like this:

apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: teleport
  namespace: teleport
spec:
  entryPoints:
    - websecure
  routes:
  - kind: Rule
    match: "HostRegexp(`example.com`, `{subdomain:[a-zA-Z0-9-]+}.example.com`)"
    services:
    - name: teleport
      port: 443
      nativeLB: true
      #
      # Instruct Traefik to use HTTPS when calling backend
      scheme: https  
      #
      # Reference ServersTransport that disables TLS verification
      serversTransport: teleport-insecure-https 

Kubernetes Ingress

Using Ingress the configuration would look something like this:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: teleport
  namespace: teleport
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: websecure
    traefik.ingress.kubernetes.io/router.tls: "true"
spec:
  ingressClassName: traefik
  rules:
  - host: "example.com"
    http:
      paths:
      - path: "/"
        pathType: Prefix
        backend:
          service:
            name: teleport
            port:
              number: 443
  - host: "*.example.com"
    http:
      paths:
      - path: "/"
        pathType: Prefix
        backend:
          service:
            name: teleport
            port:
              number: 443

Note that when using a native Kubernetes Ingress, you have to add the following annotations to the Teleport Service (not to the Ingress object):

apiVersion: v1
kind: Service
metadata:
  name: teleport
  namespace: teleport
  annotations:
    traefik.ingress.kubernetes.io/service.nativelb: "true"
    #
    # Instruct Traefik to use HTTPS when calling backend
    traefik.ingress.kubernetes.io/service.serversscheme: https
    #
    # Reference ServersTransport that disables TLS verification
    # ServersTransport reference: <servers-transport-namespace>-<servers-transport-name>@<provider>
    traefik.ingress.kubernetes.io/service.serverstransport: teleport-teleport-insecure-https@kubernetescrd
  ...
spec:
  ...

TCP Passthrough

With TCP passthrough, Traefik just passes all the traffic to the Teleport cluster. However, in such case, Teleport has to be configured with an appropriate certificate.

apiVersion: traefik.io/v1alpha1
kind: IngressRouteTCP
metadata:
  name: teleport
  namespace: teleport
spec:
  entryPoints:
    - websecure
  routes:
  - match: "HostSNIRegexp(`example.com`, `{subdomain:[a-zA-Z0-9-]+}.example.com`)"
    services:
    - name: teleport
      port: 443
      nativeLB: true
  tls:
    passthrough: true

Hope this helps.

bittu664 commented 1 year ago

Hello @MusicDin, I tried your method, but it doesn't work at all.

MusicDin commented 1 year ago

Could you elaborate on what specifically does not work?

What is the error? What is in the Teleport/Traefik logs? Which Teleport version are you using? As far as I know, only version 13 supports reverse proxies (with TLS termination).

bittu664 commented 1 year ago

The error is the same: Bad Gateway

Teleport logs:

2023-05-16T10:29:38Z WARN [ALPN:PROX] Failed to handle client connection. error:[
2023-05-16T10:29:38.300258812Z ERROR REPORT:
2023-05-16T10:29:38.300266846Z Original Error: *errors.errorString acme/autocert: missing server name
2023-05-16T10:29:38.300285882Z Stack Trace:
2023-05-16T10:29:38.300290651Z  github.com/gravitational/teleport/lib/srv/alpnproxy/proxy.go:392 github.com/gravitational/teleport/lib/srv/alpnproxy.(*Proxy).handleConn
2023-05-16T10:29:38.300294198Z  github.com/gravitational/teleport/lib/srv/alpnproxy/proxy.go:326 github.com/gravitational/teleport/lib/srv/alpnproxy.(*Proxy).Serve.func1
2023-05-16T10:29:38.300298425Z  runtime/asm_amd64.s:1598 runtime.goexit
User Message: acme/autocert: missing server name] alpnproxy/proxy.go:337
2023-05-16T10:29:38.461104435Z 2023-05-16T10:29:38Z WARN [ALPN:PROX] Failed to handle client connection. error:[
2023-05-16T10:29:38.461131235Z ERROR REPORT:
Original Error: *errors.errorString acme/autocert: missing server name
2023-05-16T10:29:38.461140172Z Stack Trace:
2023-05-16T10:29:38.461143679Z  github.com/gravitational/teleport/lib/srv/alpnproxy/proxy.go:392 github.com/gravitational/teleport/lib/srv/alpnproxy.(*Proxy).handleConn
2023-05-16T10:29:38.461146494Z  github.com/gravitational/teleport/lib/srv/alpnproxy/proxy.go:326 github.com/gravitational/teleport/lib/srv/alpnproxy.(*Proxy).Serve.func1
2023-05-16T10:29:38.461149639Z  runtime/asm_amd64.s:1598 runtime.goexit
2023-05-16T10:29:38.461152455Z User Message: acme/autocert: missing server name] alpnproxy/proxy.go:337
2023-05-16T10:29:38.810339361Z 2023-05-16T10:29:38Z WARN [ALPN:PROX] Failed to handle client connection. error:[
2023-05-16T10:29:38.810374967Z ERROR REPORT:
2023-05-16T10:29:38.810383563Z Original Error: *errors.errorString acme/autocert: missing server name
2023-05-16T10:29:38.810388672Z Stack Trace:
2023-05-16T10:29:38.810393461Z  github.com/gravitational/teleport/lib/srv/alpnproxy/proxy.go:392 github.com/gravitational/teleport/lib/srv/alpnproxy.(*Proxy).handleConn
2023-05-16T10:29:38.810397919Z  github.com/gravitational/teleport/lib/srv/alpnproxy/proxy.go:326 github.com/gravitational/teleport/lib/srv/alpnproxy.(*Proxy).Serve.func1
2023-05-16T10:29:38.810403099Z  runtime/asm_amd64.s:1598 runtime.goexit
2023-05-16T10:29:38.810407758Z User Message: acme/autocert: missing server name] alpnproxy/proxy.go:337
2023-05-16T10:29:38.972935586Z 2023-05-16T10:29:38Z WARN [ALPN:PROX] Failed to handle client connection. error:[
2023-05-16T10:29:38.972968007Z ERROR REPORT:
2023-05-16T10:29:38.972973256Z Original Error: *errors.errorString acme/autocert: missing server name
2023-05-16T10:29:38.972977073Z Stack Trace:
2023-05-16T10:29:38.972984117Z  github.com/gravitational/teleport/lib/srv/alpnproxy/proxy.go:392 github.com/gravitational/teleport/lib/srv/alpnproxy.(*Proxy).handleConn
2023-05-16T10:29:38.972987754Z  github.com/gravitational/teleport/lib/srv/alpnproxy/proxy.go:32 github.com/gravitational/teleport/lib/srv/alpnproxy.(*Proxy).Serve.func1
2023-05-16T10:29:38.972991931Z  runtime/asm_amd64.s:1598 runtime.goexit
2023-05-16T10:29:38.972995588Z User Message: acme/autocert: missing server name] alpnproxy/proxy.go:337
2023-05-16T10:29:39.017398715Z 2023-05-16T10:29:39Z WARN [ALPN:PROX] Failed to handle client connection. error:[
2023-05-16T10:29:39.017435535Z ERROR REPORT:
2023-05-16T10:29:39.017443530Z Original Error: *errors.errorString acme/autocert: missing server name
2023-05-16T10:29:39.017449601Z Stack Trace:
2023-05-16T10:29:39.017454681Z  github.com/gravitational/teleport/lib/srv/alpnproxy/proxy.go:392 github.com/gravitational/teleport/lib/srv/alpnproxy.(*Proxy).handleConn
2023-05-16T10:29:39.017459169Z  github.com/gravitational/teleport/lib/srv/alpnproxy/proxy.go:326 github.com/gravitational/teleport/lib/srv/alpnproxy.(*Proxy).Serve.func1
2023-05-16T10:29:39.017464209Z  runtime/asm_amd64.s:1598 runtime.goexit
2023-05-16T10:29:39.017484355Z User Message: acme/autocert: missing server name] alpnproxy/proxy.go:337
2023-05-16T10:29:41.419288856Z 2023-05-16T10:29:41Z WARN [ALPN:PROX] Failed to handle client connection. error:[
2023-05-16T10:29:41.419319523Z ERROR REPORT:
2023-05-16T10:29:41.419323992Z Original Error: *errors.errorString acme/autocert: missing server name
2023-05-16T10:29:41.419326857Z Stack Trace:
2023-05-16T10:29:41.419329381Z  github.com/gravitational/teleport/lib/srv/alpnproxy/proxy.go:392 github.com/gravitational/teleport/lib/srv/alpnproxy.(*Proxy).handleConn
2023-05-16T10:29:41.419331746Z  github.com/gravitational/teleport/lib/srv/alpnproxy/proxy.go:326 github.com/gravitational/teleport/lib/srv/alpnproxy.(*Proxy).Serve.func1
2023-05-16T10:29:41.419334431Z  runtime/asm_amd64.s:1598 runtime.goexit
2023-05-16T10:29:41.419336906Z User Message: acme/autocert: missing server name] alpnproxy/proxy.go:337
2023-05-16T10:29:41.737087312Z 2023-05-16T10:29:41Z WARN [ALPN:PROX] Failed to handle client connection. error:[
2023-05-16T10:29:41.737112639Z ERROR REPORT:
2023-05-16T10:29:41.737118049Z Original Error: *errors.errorString acme/autocert: missing server name
2023-05-16T10:29:41.737121345Z Stack Trace:
2023-05-16T10:29:41.737124390Z  github.com/gravitational/teleport/lib/srv/alpnproxy/proxy.go:392 github.com/gravitational/teleport/lib/srv/alpnproxy.(*Proxy).handleConn
2023-05-16T10:29:41.737127386Z  github.com/gravitational/teleport/lib/srv/alpnproxy/proxy.go:326 github.com/gravitational/teleport/lib/srv/alpnproxy.(*Proxy).Serve.func1
2023-05-16T10:29:41.737130823Z  runtime/asm_amd64.s:1598 runtime.goexit
2023-05-16T10:29:41.737133648Z User Message: acme/autocert: missing server name] alpnproxy/proxy.go:337

teleport version is 13

Screenshot 2023-05-16 at 4 17 48 PM

MusicDin commented 1 year ago

To me this seems related to the teleport configuration (specifically acme part of it) and not Traefik. As you can see, the error acme/autocert: missing server name gets repeated multiple times. Maybe this issue can help you solve the problem: https://github.com/gravitational/teleport/issues/10352.


However, I don't understand what exactly you are trying to achieve.

If you are using a reverse proxy with TLS termination, you should configure Traefik with a certificate of your choice (for example, using cert-manager or manually) and let Teleport generate self-signed certificates. The purpose of using reverse proxies is to centralize the management of SSL/TLS certificates.

In this case, the reverse proxy (Traefik) will terminate the SSL connection with the client, which uses a certificate obtained from a public certificate authority (CA) like Let's Encrypt. Then, it will establish a new SSL connection with Teleport, which uses self-signed certificates. This is why you need to skip TLS verification between Traefik and Teleport (because you internally trust the self-signed certificate).

If you want Teleport to terminate the SSL connection instead, you can use TCP passthrough. This instructs Traefik to simply pass the traffic to Teleport without decrypting it, routing on the target host. This is possible because the target host name ("example.com") is sent unencrypted in the TLS handshake (the SNI field).
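A sketch of the TLS-termination wiring described above, assuming cert-manager issues the public certificate and Traefik serves it. The host secure.mydomain.com, the ClusterIssuer letsencrypt-prod, the secret teleport-ui-tls, the service teleport-cluster, and the ServersTransport teleport-insecure-https are names borrowed from elsewhere in this thread; the Certificate name is hypothetical. Treat it as an illustration, not a tested configuration:

```yaml
# Public certificate requested by cert-manager and stored in a Secret.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: teleport-ui            # hypothetical name
  namespace: teleport-cluster
spec:
  secretName: teleport-ui-tls
  dnsNames:
    - secure.mydomain.com
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
---
# Traefik terminates TLS with the public certificate, then opens a new
# HTTPS connection to Teleport, which presents its self-signed certificate.
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: teleport
  namespace: teleport-cluster
spec:
  entryPoints:
    - websecure
  routes:
    - kind: Rule
      match: "Host(`secure.mydomain.com`)"
      services:
        - name: teleport-cluster
          port: 443
          # Call the backend over HTTPS and skip verification of its
          # self-signed certificate via the referenced ServersTransport.
          scheme: https
          serversTransport: teleport-insecure-https
  tls:
    secretName: teleport-ui-tls
```

With this split, the browser only ever sees the Let's Encrypt certificate, while the Traefik-to-Teleport hop stays encrypted but unverified.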

webvictim commented 1 year ago

@bittu664 Please also share the values file that you are using to install the Helm chart. As @MusicDin says, I suspect there is a misconfiguration.

bittu664 commented 1 year ago

cert-manager is installed in my k8s cluster, and my domain is secured with a Let's Encrypt certificate.

This is the command I am using to install Teleport:

helm  install teleport-cluster teleport/teleport-cluster  --namespace=teleport-cluster --set clusterName=secure.mydomain.com --set service.type=ClusterIP --set acme=true --set acmeEmail=cloud@gmail.com   --version 13.0.0

webvictim commented 1 year ago

Remove both the acme and acmeEmail values - you don't need them when using cert-manager. Then redeploy the chart.

bittu664 commented 1 year ago

@webvictim OK, so should I also add --set proxyListenerMode=multiplex, like this:

helm  install teleport-cluster teleport/teleport-cluster  --namespace=teleport-cluster --set clusterName=secure.mydomain.com --set service.type=ClusterIP --set proxyListenerMode=multiplex   --version 13.0.0

webvictim commented 1 year ago

Yes.

Also, if your clusterName is set to secure.mydomain.com then your Ingress should also set secure.mydomain.com as its host - not mydomain.com as in your original example.
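For reference, the --set flags discussed above could equally be kept in a values file. This is only a sketch: the file name values.yaml is an assumption, and the keys are the ones already used in this thread.

```yaml
# values.yaml (hypothetical file name) for the teleport-cluster chart.
# Must match the host configured on the Ingress.
clusterName: secure.mydomain.com
# Multiplex all traffic on port 443.
proxyListenerMode: multiplex
service:
  type: ClusterIP
# acme and acmeEmail are intentionally left out: cert-manager provides
# the public certificate on the Ingress side.
```

The file would then be passed to helm with -f values.yaml in place of the individual --set flags.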

bittu664 commented 1 year ago

Yes, I know, it is already set that way. At the time I just gave that as an example.

bittu664 commented 1 year ago

After installing this chart, I am now facing this: Screenshot 2023-05-16 at 6 42 26 PM

After that I applied this file:

apiVersion: traefik.io/v1alpha1
kind: ServersTransport
metadata:
  name: teleport-insecure-https
  namespace: teleport-cluster
spec:
  insecureSkipVerify: true

and added the annotations to the existing service, but I still get the same error.

Screenshot 2023-05-16 at 6 44 30 PM

Screenshot 2023-05-16 at 6 46 42 PM

Here is my Ingress file:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: teleport-ui-ingress
  namespace: teleport-cluster
  annotations:
    kubernetes.io/tls-acme: "true"
    traefik.ingress.kubernetes.io/router.tls: "true"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    traefik.ingress.kubernetes.io/router.entrypoints: websecure

spec:
  ingressClassName: traefik
  tls:
    - hosts:
        - secure.mydomain.com
      secretName: teleport-ui-tls
  rules:
    - host: secure.mydomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: teleport-cluster
                port:
                  number: 443

MusicDin commented 1 year ago

You also have to add an annotation on the service referencing your ServersTransport:

# Reference ServersTransport that disables TLS verification
# ServersTransport reference: <servers-transport-namespace>-<servers-transport-name>@<provider>
traefik.ingress.kubernetes.io/service.serverstransport: teleport-cluster-teleport-insecure-https@kubernetescrd

As a result, your service should have the following annotations:

apiVersion: v1
kind: Service
metadata:
  annotations:
    meta.helm.sh/release-name: teleport-cluster
    meta.helm.sh/release-namespace: teleport-cluster
    traefik.ingress.kubernetes.io/service.nativelb: "true"
    traefik.ingress.kubernetes.io/service.serversscheme: https
    traefik.ingress.kubernetes.io/service.serverstransport: teleport-cluster-teleport-insecure-https@kubernetescrd
...
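If the Service is managed by the Helm chart, the same annotations could presumably be set through the chart values rather than by editing the Service directly, using the annotations.service key shown earlier in this thread. A sketch, assuming the ServersTransport is named teleport-insecure-https and lives in the teleport-cluster namespace:

```yaml
# Chart values fragment: annotations applied to the Teleport Service.
annotations:
  service:
    traefik.ingress.kubernetes.io/service.nativelb: "true"
    # Instruct Traefik to use HTTPS when calling the backend
    traefik.ingress.kubernetes.io/service.serversscheme: https
    # <servers-transport-namespace>-<servers-transport-name>@<provider>
    traefik.ingress.kubernetes.io/service.serverstransport: teleport-cluster-teleport-insecure-https@kubernetescrd
```

Setting them in the values means a later helm upgrade re-applies them along with the rest of the chart configuration.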

bittu664 commented 1 year ago

Great, thank you @MusicDin, it's finally working now.

bittu664 commented 1 year ago

But one more thing: I cannot exec into pods. I am using Rancher, and it just shows "connecting...". I also tried kubectl exec and it didn't work.

Screenshot 2023-05-17 at 1 12 34 AM

bittu664 commented 1 year ago

@webvictim @MusicDin I think it is not working because of proxyListenerMode=multiplex. That is my guess; what are your thoughts?

dennisme commented 1 year ago

@bittu664 What command are you running when it hangs? Can you add the --debug flag?

@webvictim @MusicDin I've replicated the above setup and I am able to log in successfully and browse the Teleport UI (via the GitHub OAuth setup), but the command tsh login fails with the following.

 tsh login --proxy=teleport.example.com:443 staging --debug
2023-05-18T16:09:08-07:00 INFO [CLIENT]    No teleport login given. defaulting to bob client/api.go:996
2023-05-18T16:09:08-07:00 INFO [CLIENT]    no host login given. defaulting to bob client/api.go:1006
2023-05-18T16:09:08-07:00 INFO [CLIENT]    [KEY AGENT] Connected to the system agent: "/private/tmp/com.apple.launchd.bm9DsaXsVh/Listeners" client/api.go:4268
2023-05-18T16:09:08-07:00 DEBU [TSH]       Pinging the proxy to fetch listening addresses for non-web ports. tsh/tsh.go:3541
2023-05-18T16:09:08-07:00 DEBU [CLIENT]    not using loopback pool for remote proxy addr: teleport.example.com:443 client/api.go:4223
2023-05-18T16:09:08-07:00 DEBU             Attempting GET teleport.example.com:443/webapi/ping webclient/webclient.go:128
2023-05-18T16:09:09-07:00 DEBU             ALPN connection upgrade required for "teleport.example.com:443": true. No ALPN protocol is negotiated by the server. client/alpn_conn_upgrade.go:66
2023-05-18T16:09:09-07:00 DEBU [CLIENT]    Attempting to login with a new RSA private key. client/api.go:3608
2023-05-18T16:09:09-07:00 DEBU [CLIENT]    not using loopback pool for remote proxy addr: teleport.example.com:443 client/api.go:4223
2023-05-18T16:09:09-07:00 DEBU [CLIENT]    HTTPS client init(proxyAddr=teleport.example.com:443, insecure=false, extraHeaders=map[]) client/weblogin.go:308
2023-05-18T16:09:09-07:00 INFO [CLIENT]    Waiting for response at: http://127.0.0.1:52420. client/redirect.go:157
If browser window does not open automatically, open it by clicking on the link:
 http://127.0.0.1:52420/30a3f58c-835c-4acb-b09c-ca4650e70156
2023-05-18T16:09:12-07:00 DEBU [CLIENT]    Got response from browser. client/weblogin.go:396
2023-05-18T16:09:12-07:00 DEBU [KEYSTORE]  Teleport TLS certificate valid until "2023-05-19 11:05:09 +0000 UTC". client/client_store.go:91
2023-05-18T16:09:12-07:00 DEBU [KEYAGENT]  Deleting obsolete stored key with index {ProxyHost:teleport.example.com Username:bob ClusterName:staging}. client/keyagent.go:527
2023-05-18T16:09:12-07:00 DEBU [KEYSTORE]  Adding known host staging with proxy teleport.example.com client/trusted_certs_store.go:393
2023-05-18T16:09:12-07:00 INFO [KEYAGENT]  Loading SSH key for user "bob" and cluster "staging". client/keyagent.go:195
2023-05-18T16:09:12-07:00 DEBU [KEYSTORE]  Teleport TLS certificate valid until "2023-05-19 11:09:11 +0000 UTC". client/client_store.go:91
2023-05-18T16:09:12-07:00 DEBU [KEYSTORE]  Teleport TLS certificate valid until "2023-05-19 11:09:11 +0000 UTC". client/client_store.go:91
2023-05-18T16:09:12-07:00 INFO [CLIENT]    Connecting to proxy=teleport.example.com:443 login="bob" using TLS Routing client/api.go:3023
2023-05-18T16:09:12-07:00 DEBU [KEYSTORE]  Teleport TLS certificate valid until "2023-05-19 11:09:11 +0000 UTC". client/client_store.go:91
2023-05-18T16:09:12-07:00 DEBU [HTTP:PROX] No proxy set in environment, returning direct dialer. proxy/proxy.go:195
2023-05-18T16:09:12-07:00 DEBU             ALPN connection upgrade for teleport.example.com:443. client/alpn_conn_upgrade.go:164

ERROR REPORT:
Original Error: trace.aggregate failed to switch Protocols 500
Stack Trace:
    github.com/gravitational/teleport/api@v0.0.0/client/alpn_conn_upgrade.go:178 github.com/gravitational/teleport/api/client.(*alpnConnUpgradeDialer).DialContext
    github.com/gravitational/teleport/api@v0.0.0/client/contextdialer.go:192 github.com/gravitational/teleport/api/client.NewDialer.func1
    github.com/gravitational/teleport/api@v0.0.0/client/contextdialer.go:157 github.com/gravitational/teleport/api/client.tracedDialer.func1
    github.com/gravitational/teleport/api@v0.0.0/client/contextdialer.go:99 github.com/gravitational/teleport/api/client.ContextDialerFunc.DialContext
    github.com/gravitational/teleport/api@v0.0.0/client/alpn.go:141 github.com/gravitational/teleport/api/client.(*ALPNDialer).DialContext
    github.com/gravitational/teleport/lib/utils/proxy/proxy.go:71 github.com/gravitational/teleport/lib/utils/proxy.directDial.DialTimeout
    github.com/gravitational/teleport/lib/utils/proxy/proxy.go:58 github.com/gravitational/teleport/lib/utils/proxy.directDial.Dial
    github.com/gravitational/teleport/lib/client/api.go:3105 github.com/gravitational/teleport/lib/client.makeProxySSHClientWithTLSWrapper
    github.com/gravitational/teleport/lib/client/api.go:3024 github.com/gravitational/teleport/lib/client.makeProxySSHClient
    github.com/gravitational/teleport/lib/client/api.go:2971 github.com/gravitational/teleport/lib/client.(*TeleportClient).connectToProxy
    github.com/gravitational/teleport/lib/client/api.go:2949 github.com/gravitational/teleport/lib/client.(*TeleportClient).ConnectToProxy.func1
    runtime/asm_arm64.s:1172 runtime.goexit
User Message: Unable to connect to ssh proxy at teleport.example.com:443. Confirm connectivity and availability.
    failed to switch Protocols 500

Looking at the code, it's getting a 500, but I am able to connect. Additionally, the initial connection passes.

 nc -zv teleport.example.com 443
Connection to teleport.example.com port 443 [tcp/https] succeeded!

Are there any other things I can test? Happy to provide more debug info!

webvictim commented 1 year ago

@dennisme What reverse proxy/ingress are you using? It appears not to like websocket upgrades:

2023-05-18T16:09:09-07:00 DEBU             ALPN connection upgrade required for "teleport.example.com:443": true. No ALPN protocol is negotiated by the server. client/alpn_conn_upgrade.go:66

It's also replying with a 500 (Internal Server Error) when trying to switch protocols i.e. start a websocket connection.

Support for websockets is mandatory to be able to put Teleport 13 behind an HTTPS proxy like this. I'd suggest checking the configuration to make sure websockets are enabled and working correctly, and/or checking the logs of that server to see whether there's an underlying problem.

webvictim commented 1 year ago

@bittu664 Sorry, I'm not familiar with Rancher or how it works. It's very possible that a direct kubectl exec connection via Teleport won't work behind an HTTPS proxy, as it's mandatory to use a local listener (tsh proxy kube) for Kubernetes connections in this scenario.

dennisme commented 1 year ago

Thanks a ton for the lightning fast response. Traffic flow is as follows: client -> linode node balancer (tcp pass through) -> traefik (with the config mentioned above for tls term) -> backend teleport-cluster service.

dennisme commented 1 year ago

@webvictim I set up debug logging on Traefik, re-ran our test, and got this:

time="2023-05-19T00:14:56Z" level=debug msg="'500 Internal Server Error' caused by: backend tried to switch protocol \"\" when \"alpn-ping\" was requested"
<redacted>- - [19/May/2023:00:14:56 +0000] "GET /webapi/connectionupgrade HTTP/1.1" 500 21 "-" "-" 36 "teleport-cluster-<redacted>@kubernetes" "https://<redacted>:3080" 6ms

I'm running Traefik v2.9.9, FWIW.

webvictim commented 1 year ago

@dennisme Hmm, I think it might be a bug: https://github.com/traefik/traefik/issues/7465

You could try forcing Traefik to speak HTTP/1.1 instead of HTTP/2 to the backend and see if that works.
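One way this suggestion might be expressed is through the ServersTransport already used in this thread, since Traefik v2 documents a disableHTTP2 option on it. This is a hedged sketch only; nothing in this thread establishes that it resolves the 500. The apiVersion shown assumes Traefik v2.10 or later; older v2 releases such as v2.9.x use traefik.containo.us/v1alpha1 instead.

```yaml
# Assumption: CRD group for Traefik v2.10+; on v2.9.x use
# apiVersion: traefik.containo.us/v1alpha1
apiVersion: traefik.io/v1alpha1
kind: ServersTransport
metadata:
  name: teleport-insecure-https
  namespace: teleport-cluster
spec:
  # Skip verification of Teleport's self-signed certificate
  insecureSkipVerify: true
  # Do not use HTTP/2 for connections to the backend
  disableHTTP2: true
```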

bittu664 commented 1 year ago

@webvictim But when I was running without an Ingress I could easily exec into the pods; after this configuration I cannot. Is there a proper solution for that, or does it simply not work with this configuration?

bittu664 commented 1 year ago

@webvictim I tried just now, and even without an Ingress it's not working on 13.0.0 and 13.0.2. Did something change in this version that prevents exec into pods? Screenshot 2023-05-19 at 9 54 08 PM

The main reason is that I need to create an admin user, so how can I do that without exec into the pods?

webvictim commented 1 year ago

Your kubectl command is ordered wrongly. It should be kubectl -n teleport-cluster exec -it teleport-cluster-proxy-775d9df6cf-ccxn6 -- /bin/bash

bittu664 commented 1 year ago

@webvictim It doesn't make any difference; I still cannot exec into the pod. See this:

Screenshot 2023-05-20 at 1 48 11 AM

webvictim commented 1 year ago

The stock Teleport image is now distroless, which means it doesn't have /bin/bash any more.

See here: https://github.com/gravitational/teleport/blob/8e39814aaff154907c2029a8b081bfdecba6ec0f/examples/chart/teleport-cluster/values.yaml#L442-L446

For now, you can override the image in the chart values to use the old Ubuntu-based image and run helm upgrade:

image: public.ecr.aws/gravitational/teleport
enterpriseImage: public.ecr.aws/gravitational/teleport-ent

These will be going away as of Teleport 14, so there is also a debug version of the distroless images which contains busybox for debugging:

image: public.ecr.aws/gravitational/teleport-distroless-debug
enterpriseImage: public.ecr.aws/gravitational/teleport-ent-distroless-debug

bittu664 commented 1 year ago

Yes, after using the image public.ecr.aws/gravitational/teleport I can now exec into pods using kubectl. But why can't I exec through Rancher? Is there a specific reason?

Screenshot 2023-05-20 at 11 31 12 PM

dennisme commented 1 year ago

Thanks @webvictim for looking.

Yep. After trying this, it's still failing with 500 errors. Sounds like our options are to remove the LB or swap LBs. After reverting to the previous non-working config, I noticed these errors in the teleport-cluster logs.

We are going to remove the LB / ingress and try that setup.

2023-05-22T17:11:20Z WARN [MX:PROXY:] "\nERROR REPORT:\nOriginal Error: *trace.BadParameterError failed to detect connection protocol, first few bytes were: []byte{0x3, 0x0, 0x0, 0x13, 0xe, 0xe0, 0x0, 0x0}\nStack Trace:\n\tgithub.com/gravitational/teleport/lib/multiplexer/multiplexer.go:653 github.com/gravitational/teleport/lib/multiplexer.detectProto\n\tgithub.com/gravitational/teleport/lib/multiplexer/multiplexer.go:411 github.com/gravitational/teleport/lib/multiplexer.(*Mux).detect\n\tgithub.com/gravitational/teleport/lib/multiplexer/multiplexer.go:267 github.com/gravitational/teleport/lib/multiplexer.(*Mux).detectAndForward\n\truntime/asm_amd64.s:1598 runtime.goexit\nUser Message: failed to detect connection protocol, first few bytes were: []byte{0x3, 0x0, 0x0, 0x13, 0xe, 0xe0, 0x0, 0x0}" loglimit/loglimit.go:159

webvictim commented 1 year ago

@bittu664 I have no idea how Rancher is connecting to Kubernetes here or what path it uses to connect to containers. As you've shown, kubectl exec itself outside of Rancher works. You'd need to look in Rancher's logs to figure out what error it's getting and see where to go from here.

bittu664 commented 1 year ago

OK, one more thing: I want to store session recordings in our S3 bucket, which is hosted on Wasabi, but your docs don't mention where to edit this.

Screenshot 2023-05-25 at 12 55 19 AM

https://goteleport.com/docs/reference/backends/#configuring-the-s3-backend

webvictim commented 1 year ago

Add something like this to your values and helm upgrade:

auth:
  teleportConfig:
    teleport:
      storage:
        region: wasabi-region
        audit_sessions_uri: "s3://bucket-name/path?region=wasabi-region&endpoint=wasabi.example.com&insecure=true&disablesse=true"

You'll need to figure out what the exact values are to put here based on your Wasabi configuration.

You'll also need to provide some Wasabi credentials to the pod, most likely by creating a Kubernetes secret and mounting them into the pod as the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY: https://kubernetes.io/docs/tasks/inject-data-application/distribute-credentials-secure/#define-container-environment-variables-using-secret-data

Alternatively you could mount a /root/.aws/credentials file with aws_access_key_id and aws_secret_access_key settings into the pod.
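The environment-variable approach described above could be sketched as follows. The Secret name, its keys, and the placeholder values are hypothetical. The extraEnv key is an assumption about the chart's values and should be checked against the values.yaml of the chart version in use.

```yaml
# File 1 (Kubernetes manifest): Secret holding the Wasabi keys.
# Name, key names and values are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: wasabi-credentials
  namespace: teleport-cluster
type: Opaque
stringData:
  access-key-id: "REPLACE_ME"
  secret-access-key: "REPLACE_ME"
---
# File 2 (Helm chart values): expose the Secret as environment variables.
# Assumption: the chart accepts an extraEnv list of container env entries.
extraEnv:
  - name: AWS_ACCESS_KEY_ID
    valueFrom:
      secretKeyRef:
        name: wasabi-credentials
        key: access-key-id
  - name: AWS_SECRET_ACCESS_KEY
    valueFrom:
      secretKeyRef:
        name: wasabi-credentials
        key: secret-access-key
```

The two documents belong in separate files: the first is applied with kubectl, the second is merged into the chart values before running helm upgrade.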

bittu664 commented 1 year ago

@webvictim Can I also edit it directly inside the pods, like this:

Screenshot 2023-05-26 at 12 05 46 AM

I am inside the auth pod:

Screenshot 2023-05-26 at 12 14 30 AM

This is the main file, teleport.yaml, right? WhatsApp Image 2023-05-26 at 12 12 33 AM

webvictim commented 1 year ago

You can't edit the config file inside the pods as it's mounted read-only from a ConfigMap. You should use the method I described above to make persistent changes.

bittu664 commented 1 year ago

I tried your Helm values above but it doesn't work. Can you provide the full Helm chart values? This is my current values file:

teleport:
  storage:
      # The region setting sets the default AWS region for all AWS services
      # Teleport may consume (DynamoDB, S3)
      region: us-east-1

      # Path to S3 bucket to store the recorded sessions in.
      audit_sessions_uri: "s3://teleport-audit/recording?region=us-east-1&endpoint=s3.eu-central-2.wasabisys.com&insecure=true&disablesse=true"

After that I used this command to upgrade, but it didn't work:

helm upgrade teleport-cluster teleport/teleport-cluster --set clusterName=secure.mydomain --set persistence.volumeSize=30Gi --set acme=true --set acmeEmail=cloud@gmail.com   --version 13.0.2 -f ./teleport.yaml -n teleport-cluster

webvictim commented 1 year ago

Re-read the values file I linked in my comment, it looks different to yours:

https://github.com/gravitational/teleport/issues/26130#issuecomment-1561894957

If you use that format it should work.
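Applied to the values file posted above, the nesting from the earlier comment would presumably look like this. The bucket, region, and endpoint are copied unchanged from that file and are not verified here.

```yaml
# teleport.yaml (the values file passed to helm with -f)
auth:
  teleportConfig:
    teleport:
      storage:
        # Default AWS region for AWS services Teleport may consume
        region: us-east-1
        # S3-compatible bucket for recorded sessions
        audit_sessions_uri: "s3://teleport-audit/recording?region=us-east-1&endpoint=s3.eu-central-2.wasabisys.com&insecure=true&disablesse=true"
```

The difference from the earlier attempt is only the auth.teleportConfig wrapper, which is how the chart is told to merge these settings into the auth service's configuration.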

bittu664 commented 1 year ago

Hello @webvictim, thanks for this, it's working now. I can see my session recordings in my Wasabi bucket, but why can't I play them?

Screenshot 2023-05-26 at 11 03 56 PM

Screenshot 2023-05-26 at 11 12 20 PM

zmb3 commented 1 year ago

Sounds like you've made some progress here.

In the future, please direct support requests like this to a GitHub discussion or to our community Slack workspace.

GitHub issues are best for tracking bugs that require dev work and for which you have specific steps to reproduce, not general help and support.