giantswarm / roadmap

Giant Swarm Product Roadmap
https://github.com/orgs/giantswarm/projects/273
Apache License 2.0

Extend Teleport to OAuth2 apps (Grafana, Happa, Prometheus, etc.) #3035

Open gawertm opened 7 months ago

gawertm commented 7 months ago
### Tasks
- [ ] https://github.com/giantswarm/roadmap/issues/3219
- [ ] https://github.com/giantswarm/roadmap/issues/3220
### Blockers
- [ ] Teleport integration with Happa API for Kubernetes API access. See more details [here](https://gigantic.slack.com/archives/C053JHJC99Q/p1708517563819189) and [here](https://gigantic.slack.com/archives/C02GDJJ68Q1/p1708508263922009?thread_ts=1706692097.776539&cid=C02GDJJ68Q1)
- [ ] Grafana Loki DS
- [ ] Dex integration
tuladhar commented 6 months ago

Discussion topics:

tuladhar commented 6 months ago

Prometheus UI access via Teleport:

Access flow (without nginx ingress/dex in the mix):

User -> Teleport Cluster (login)
         -> Teleport App Agent (running on golem)
             -> Prometheus internal svc endpoint

Route53 entry (manual for now, until we have wildcard entry *.teleport.giantswarm.io):

prometheus-golem.teleport.giantswarm.io -- points --> Teleport Cluster Proxy ALB

Here's the teleport app config:

> cat prometheus-golem-app.yaml
kind: app
version: v3
metadata:
  name: prometheus-golem
  description: "Prometheus - golem"
  labels:
    env: test
spec:
  uri: http://prometheus-operated.golem-prometheus.svc.cluster.local:9090
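The same resource could be templated for other apps and installations. Below is a minimal sketch, assuming the `<app>-<installation>` naming seen in this thread (`prometheus-golem`, `grafana-golem`) and that the app hostname is the app name under the `teleport.giantswarm.io` proxy domain. The helper name `teleport_app` and its parameters are hypothetical and not part of any Teleport tooling.

```python
def teleport_app(app: str, installation: str, uri: str,
                 proxy_domain: str = "teleport.giantswarm.io",
                 env: str = "test") -> tuple[str, str]:
    """Return (manifest_yaml, hostname) for one internal web app.

    Assumption: resources are named "<app>-<installation>" and the
    Route53 record is "<name>.<proxy_domain>", as in the example above.
    """
    name = f"{app}-{installation}"
    manifest = "\n".join([
        "kind: app",
        "version: v3",
        "metadata:",
        f"  name: {name}",
        f'  description: "{app.capitalize()} - {installation}"',
        "  labels:",
        f"    env: {env}",
        "spec:",
        f"  uri: {uri}",
    ]) + "\n"
    # DNS name that has to point at the Teleport Cluster Proxy ALB
    hostname = f"{name}.{proxy_domain}"
    return manifest, hostname


if __name__ == "__main__":
    manifest, hostname = teleport_app(
        "prometheus", "golem",
        "http://prometheus-operated.golem-prometheus.svc.cluster.local:9090",
    )
    print(manifest)
    print(hostname)
```

Generating the hostname next to the manifest keeps the manual Route53 entry and the app name in sync until a wildcard entry exists.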
tuladhar commented 6 months ago

Document explaining Prometheus access setup:

tuladhar commented 6 months ago

Image

tuladhar commented 6 months ago

Plan B worked - grafana-golem.teleport.giantswarm.io

Plan A didn't work and is not backward compatible for customers, because setting up dex behind Teleport introduces breaking changes.

Why didn't Plan A work?

Image

Image

How did Plan B work?

anvddriesch commented 5 months ago

Since customers' dex access cannot be affected in either scenario, we are unable to change settings like the OIDC issuer flags on the Kubernetes API server or the OAuth provider flags for Grafana (both allow only one issuer). This is not an issue in scenario B, since we can use the JWT token directly and bypass dex completely. However, it is an issue in scenario A for the reasons described in Puru's comment. The limitation on issuers is the reason we were using dex in the first place. We may have a bit more freedom with the Kubernetes 1.29 structured auth config options, but we have not tested this yet. As of right now, running dex inside Teleport to make it available to us does not seem possible without also making dex unavailable to customers. We already have a limitation on reproducing dex issues due to identity provider configuration on the customer's side. Bypassing dex and removing the VPN will increase this limitation, since dex will become unavailable to us. However, since we already need to debug in tandem with customers, it may not be a huge change.

tuladhar commented 5 months ago

Honeybadger Feedback (Happa)

tuladhar commented 5 months ago

Atlas Feedback (Grafana/Loki)

tuladhar commented 5 months ago

Atlas - Challenge #1: Loki DS uses OAuth token forwarding

tuladhar commented 5 months ago

Honeybadger - Challenge #1 - Happa can't accept the Teleport JWT token

Honeybadger - Challenge #2 - Athena /graphql endpoint

Image

Image

tuladhar commented 5 months ago

Discussion with Honeybadger

Slides: https://docs.google.com/presentation/d/1OLtCijqp6VM1Xu2p0sxvyr1d2NJD6wHK0ypQ1mzRRCA/edit?usp=sharing

Action Items:

tuladhar commented 5 months ago

Discussion with Dmitry

Attendees: Puru, Spyros, Antonia, Pawel, Dmitry

Action Items:

tuladhar commented 4 months ago

SOCKS5 proxy as an alternative

@gawertm It seems we can leverage the SOCKS5 proxy feature of Teleport SSH to access internal web apps. I have documented the setup process in the intranet and posted the demo in our channel.

gawertm commented 4 months ago

@tuladhar thanks, I just saw that. Can you draw a network diagram of how the traffic flows with the SOCKS5 proxy? I want to fully understand :) Also, about this:

> Only use proxy for accessing internal web apps, because potentially, we can use it to browse other external websites.

tuladhar commented 4 months ago

@gawertm Here's the network diagram. Let me know if this doesn't make sense :)

Image

> Only use proxy for accessing internal web apps, because potentially, we can use it to browse other external websites.

Yes, think of the SOCKS5 proxy as a VPN over an SSH connection. Anything that is reachable from the node where the SOCKS5 proxy is established is reachable for us too. That means internet access, if the node has it, in addition to internal network access.
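To illustrate why the proxy reaches whatever the node can reach: in SOCKS5 (RFC 1928) the client sends the destination host name and port to the proxy, and the far end resolves and connects to it. The sketch below only builds that CONNECT request to show the wire format. The function name and the example host are hypothetical, and in practice the SSH client and the browser handle this exchange.

```python
import struct


def socks5_connect_request(host: str, port: int) -> bytes:
    """Build a SOCKS5 CONNECT request addressed by domain name (RFC 1928).

    Because the host name travels to the proxy unresolved, DNS lookup and
    the TCP connection both happen on the remote side of the tunnel.
    """
    name = host.encode("ascii")
    if not 1 <= len(name) <= 255:
        raise ValueError("host name must be 1-255 bytes")
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    # VER=5, CMD=1 (CONNECT), RSV=0, ATYP=3 (domain name),
    # followed by the length-prefixed name and the big-endian port.
    header = bytes([0x05, 0x01, 0x00, 0x03, len(name)])
    return header + name + struct.pack("!H", port)


if __name__ == "__main__":
    # Hypothetical internal endpoint, used only for illustration.
    print(socks5_connect_request("prometheus.internal.example", 9090).hex())
```

Nothing in the request distinguishes internal from external destinations, which is why any restriction to internal web apps has to come from convention or from network policy on the node.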

gawertm commented 4 months ago

OK, understood, thanks! Let's discuss it in the refinement today :)

gawertm commented 4 months ago

We came to the conclusion that this solution is not user friendly enough and needs too much config upfront. While it is a good emergency access solution, we will still keep the VPN for app access in private installations, to be confirmed with @alex-dabija. We will also revisit this with Kubernetes 1.29 and structured auth config.

alex-dabija commented 4 months ago

We agreed in SIG Architecture to keep the VPN for now.