department-of-veterans-affairs / va.gov-team

Public resources for building on and in support of VA.gov. Visit complete Knowledge Hub:
https://depo-platform-documentation.scrollhelp.site/index.html

[Discovery] Evaluate Traefik Enterprise for ingress to EKS #43357

Open jhouse-solvd opened 2 years ago

jhouse-solvd commented 2 years ago

Description

The Integration Experience Team (IET) would like to explore the possibility of using Traefik Enterprise (TEE) for ingress to Kubernetes clusters (EKS). There is functionality that TEE provides that would be useful for a number of reasons.

Background/context

"For summary, the complexity of OIDC is high enough to offload it from the application and microservices development teams and choose an enterprise-grade implementation on the network entry point, i.e. TraefikEE on Kubernetes ingress."

Technical notes


Tasks

Acceptance Criteria

jhouse-solvd commented 2 years ago

@ewilson-adhoc & @considerable - Please share relevant info that explains your use case as well as any research that you've already done into Traefik Enterprise.

considerable commented 2 years ago

Hello @jhouse-solvd - we have been tasked with integrating the microservices with Keycloak for authN and authZ. The most modern vendor-independent standard for both is OpenID Connect (OIDC). The full reasoning for using the TraefikEE OIDC middleware with Keycloak is in this RFC - https://github.com/department-of-veterans-affairs/va.gov-platform-architecture/blob/IET_002/rfc/2022/2022-06-15_IET_002_TraefikEE-Keycloak-SSO-on-Kubernetes-Ingress.md This TraefikEE blog post shows how to set up TraefikEE with Keycloak: Bench Testing OpenID Connect Authentication in Traefik Enterprise. We'd like to try such a setup in the EKS Dev environment.

considerable commented 2 years ago

The target application that would most clearly demonstrate how Traefik/TraefikEE labels can bring an old-style, unprotected, third-party container up to modern security standards is PgHero, which might need only one manifest file changed - https://github.com/department-of-veterans-affairs/vsp-infra-application-manifests/blob/main/apps/vsp-tools-backend/vets-api-pghero/dev/basic-auth-middleware.yaml

considerable commented 2 years ago

I attempted to upload a file with steps that worked for me testing traefik-enterprise on my Mac with Docker swarm, but the upload won't work. So here is the content of the step-by-step, if it helps:

Install Traefik Enterprise on Mac with Docker Desktop

per https://doc.traefik.io/traefik-enterprise/getting-started/

chmod +x ~/Downloads/teectl_v2.6.4_darwin_amd64/teectl 
sudo mv ~/Downloads/teectl_v2.6.4_darwin_amd64/teectl /usr/local/bin
teectl --help

per https://docs.docker.com/engine/swarm/swarm-tutorial/create-swarm/

If you are using Docker Desktop for Mac or Docker Desktop for Windows to test single-node swarm, simply run docker swarm init with no arguments

docker swarm leave --force
sleep 5
docker swarm init

Swarm initialized: current node (48r2gb3nfw3gbbr3ljxneaujv) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-34brfnpp98wxdeb4gavtw99odk3uaos7ylpmd89vaibht5zk27-4d0r01sh9hn2an0353icf9g7u 192.168.65.3:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

docker info | grep -i swarm

Swarm: active

docker node ls

ID                            HOSTNAME         STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
48r2gb3nfw3gbbr3ljxneaujv *   docker-desktop   Ready     Active         Leader           20.10.14

Generate the bundle (creates ./bundle.zip)

export TRAEFIKEE_LICENSE=<trial-license>
teectl setup \
  --swarm \
  --swarm.hosts \
  `ifconfig en0|grep -m1 -oE '([0-9]+)\.([0-9]+)\.([0-9]+)\.([0-9]+)'|head -1` \
  --force
grep -A2 hosts ~/.config/traefikee/default.yaml

The bundle file "./bundle.zip" has been created. It contains the certificates needed to start a Traefik Enterprise controller.
Platform specific YAML files can be generated using the "teectl setup gen" command.

Deploy controllers

teectl setup gen \
--license="${TRAEFIKEE_LICENSE}" \
--controllers=1 | docker stack deploy -c - traefikee

Additional information can be found at: https://doc.traefik.io/traefik-enterprise/v2.6/operations/introduction/
Creating network traefikee_control
Creating secret traefikee_bundle
Creating service traefikee_controller-0
Creating service traefikee_api-proxies

Verify that your TraefikEE installation is ready

teectl get nodes

ID                         NAME          STATUS  ROLE
fzwqgsab6xzy2dxl7hdca5ya6  controller-0  Ready   Controller (Leader)

Wait for controllers initialization

# Poll until Swarm lists a config named default-proxy (-q prints IDs, so test for non-empty output)
until docker config ls -qf name=default-proxy | grep -q .; do sleep 1; done
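
A polling loop like the one above runs forever if the controller never creates the config. A minimal sketch of a bounded wait, assuming a one-second poll interval is acceptable; `wait_for` and `proxy_config_exists` are hypothetical helper names, not part of teectl or Docker:

```shell
# wait_for TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND once per second until it succeeds or TIMEOUT_SECONDS elapse.
wait_for() {
  timeout="$1"; shift
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s: $*" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}

# Hypothetical check: succeeds once Swarm lists a config named default-proxy.
proxy_config_exists() {
  docker config ls -qf name=default-proxy | grep -q .
}

# Usage (give the controller up to two minutes):
#   wait_for 120 proxy_config_exists
```

The timeout value is a guess; adjust it to however long the controller usually takes on your machine.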

Deploy proxies

teectl setup gen \
--license="${TRAEFIKEE_LICENSE}" \
--proxies=1 | docker stack deploy -c - traefikee

Additional information can be found at: https://doc.traefik.io/traefik-enterprise/v2.6/operations/introduction/
Creating network traefikee_ingress
Creating service traefikee_registry
Creating service traefikee_proxies

teectl get nodes

ID                         NAME          STATUS  ROLE
43q8vss7iwcbu9g0jgrxz8vv6                Ready   Plugin Registry
p8to53aq5jct7oc4jv4z8u2b7  b131c38ab00c  Ready   Proxy / Ingress
wmwaok1q0o3x5sja3jttlkyou  controller-0  Ready   Controller (Leader)
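
Reading the table by eye works for three nodes; for scripting, a small filter may help. This is a sketch that assumes the output keeps the layout shown above (one header row, and the literal word "Ready" as a standalone field for healthy nodes). `not_ready_nodes` is a hypothetical helper, and any other status value is treated as not ready:

```shell
# not_ready_nodes: reads `teectl get nodes` output on stdin and prints every
# data row that has no field equal to "Ready". The NAME column can be empty
# (see the Plugin Registry row), so fields are scanned instead of indexed.
not_ready_nodes() {
  awk 'NR > 1 && NF > 0 {
         ready = 0
         for (i = 1; i <= NF; i++) if ($i == "Ready") ready = 1
         if (!ready) print
       }'
}

# Usage: prints nothing when every node is Ready.
#   teectl get nodes | not_ready_nodes
```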

Configuration

First, apply the static configuration:

cat > traefikee-static.yaml << EOF
--- 
entryPoints:
  web:
    address: ":80"
  #websecure:
  #  address: ":443"
  #traefik:
  #  address: ":8080"

api:
  insecure: true
  dashboard: true
  debug: true

log:
  level: info

accessLog:
  format: json

providers:
  docker:
    swarmMode: true

# https://traefik.io/blog/testing-oidc-authentication-traefik-enterprise/
sessionStorages:
  redisStore:
    redis:
      endpoints:
        - redis-test.vsp.local:6379

# https://doc.traefik.io/traefik-enterprise/v2.6/middlewares/oidc/#openid-connect-authentication-middleware
authSources:
  oidcSource:
    oidc:
      issuer: http://host.docker.internal:2080/auth/realms/master   # grep internal /etc/hosts
      clientID: traefikee-test-client
      clientSecret: 94085118-ff5a-426f-bd35-bc49e762dcc5
EOF

teectl apply --file=traefikee-static.yaml
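
Before relying on the oidcSource above, it can be worth confirming that the issuer actually answers. Under the OpenID Connect Discovery convention, the provider metadata lives at the issuer URL plus `/.well-known/openid-configuration`. A minimal sketch; `discovery_url` is a hypothetical helper, and whether `host.docker.internal` resolves from your shell depends on your /etc/hosts:

```shell
# discovery_url ISSUER: prints the OIDC discovery document URL for ISSUER,
# dropping one trailing slash so the path is not doubled.
discovery_url() {
  printf '%s/.well-known/openid-configuration\n' "${1%/}"
}

# Usage (issuer copied from traefikee-static.yaml):
#   curl -s "$(discovery_url http://host.docker.internal:2080/auth/realms/master)"
```

If the request returns a JSON document whose `issuer` field matches the configured value, the static configuration is at least pointing at a live realm.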

Then, apply the following dynamic configuration:

# Quote the delimiter so the shell leaves the backticks and $-signs below untouched
cat > traefikee-dynamic.yaml << 'EOF'
---
http:
  routers:
    api:
      rule: "Host(`traefikee-dashboard.vsp.local`)"
      service: api@internal
      entryPoints:
        - traefik
      middlewares:
        - basic-auth
  middlewares:
    # https://doc.traefik.io/traefik/v2.6/middlewares/http/basicauth/
    basic-auth:
      basicAuth:
        users:
          - "test:$apr1$yYDqsBl1$lg5EJXEgVPluhrS1nJ8mj/"    # echo $(htpasswd -nb test test) 
    # https://traefik.io/blog/testing-oidc-authentication-traefik-enterprise/
    # https://doc.traefik.io/traefik-enterprise/providers/traefikee/#applying-a-traefik-enterprise-provider-configuration
    # https://doc.traefik.io/traefik-enterprise/v2.6/middlewares/oidc/#openid-connect-authentication-middleware
    oidc-auth:
      plugin:
        oidcAuth:
          source: oidcSource
          # https://medium.com/@panda1100/keycloak-as-oidc-provider-for-harbor-c25906481619
          # https://traefik.io/glossary/openid-connect-everything-you-need-to-know/
          # https://www.janua.fr/oauth2-openid-scope-usage-with-keycloak/
          scopes: 
            - openid
            - microprofile-jwt
          redirectUrl: "/callback"
          session:
            secret: BjwWlkTrBAsX0u6hkTp7cw==  # openssl rand -base64 16
            #store: redisStore
            expiry: 3600
          forwardHeaders:
            X-Forwarded-User: email
            X-Forwarded-Group: groups

EOF

teectl apply --file=traefikee-dynamic.yaml

Inspect

docker service ls                
ID             NAME                     MODE         REPLICAS   IMAGE                      PORTS
zw8b26f551s9   traefikee_api-proxies    replicated   2/2        traefik/traefikee:v2.6.4   *:55055->55055/tcp
m9b01008m05v   traefikee_controller-0   replicated   1/1        traefik/traefikee:v2.6.4   
7547dngkg6ac   traefikee_proxies        replicated   1/1        traefik/traefikee:v2.6.4   *:80->80/tcp, *:443->443/tcp
ofg7elnb5hah   traefikee_registry       replicated   1/1        traefik/traefikee:v2.6.4     

teectl get nodes
ID                         NAME          STATUS  ROLE
43q8vss7iwcbu9g0jgrxz8vv6                Ready   Plugin Registry
p8to53aq5jct7oc4jv4z8u2b7  b131c38ab00c  Ready   Proxy / Ingress
wmwaok1q0o3x5sja3jttlkyou  controller-0  Ready   Controller (Leader) 

# Quote the delimiter so the backticks in the Host() and Contains() rules are written literally
cat > traefikee-whoami.yaml << 'EOF'
---
# https://doc.traefik.io/traefik-enterprise/installing/swarm/#deploying-a-test-service
#
version: '3.4'
networks:
  traefikee_ingress:
    external: true

services:
  whoami:
    image: traefik/whoami:v1.6.1
    deploy:
      mode: replicated
      replicas: 1
      labels:
        - "traefik.http.services.whoami.loadbalancer.server.port=80"
        - "traefik.http.routers.whoami.rule=Host(`traefikee-test.vsp.local`)"
        #- "traefik.http.routers.whoami.middlewares=basic-auth@traefikee"
        - "traefik.http.routers.whoami.middlewares=oidc-auth@traefikee"
        # the middleware can check the claims returned from the IdP ID token to make sure users are authorized
        - "traefik.http.middlewares.oidc-auth.plugin.oidcAuth.claims=Contains(`groups`, `uma_authorization`)"
    networks:
      - traefikee_ingress

EOF

docker stack deploy -c ./traefikee-whoami.yaml traefikee

Next, run a smoke test with curl:

curl -sI http://traefikee-test.vsp.local/whoami
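
To make the smoke test scriptable, the status line can be classified. This is only a sketch: it assumes that an unauthenticated request to a route behind the OIDC middleware is answered with a redirect toward the identity provider, which was not verified in this thread. `status_code` and `classify` are hypothetical helpers:

```shell
# status_code: reads HTTP response headers on stdin and prints the numeric
# status from the first line (e.g. "HTTP/1.1 302 Found" -> 302).
status_code() {
  awk 'NR == 1 { print $2; exit }'
}

# classify CODE: maps a status code to a coarse outcome.
classify() {
  case "$1" in
    2??) echo "served" ;;        # route answered without a login step
    3??) echo "redirect" ;;      # assumed: sent to the IdP login page
    401|403) echo "denied" ;;    # middleware rejected the request
    *) echo "unexpected" ;;
  esac
}

# Usage:
#   classify "$(curl -sI http://traefikee-test.vsp.local/whoami | status_code)"
```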

Next, open the URL in your default browser (macOS):

open http://traefikee-test.vsp.local/whoami

docker stack deploy -c ./traefikee-redis.yaml traefikee
teectl apply --file=traefikee-static.yaml
teectl apply --file=traefikee-dynamic.yaml
docker stack deploy -c ./traefikee-whoami.yaml traefikee
curl -s http://traefikee-test.vsp.local/whoami
open http://traefikee-test.vsp.local/whoami

Keycloak

https://cloud.redhat.com/blog/adding-authentication-to-your-kubernetes-web-applications-with-keycloak

Starting over

Uninstall Traefik Enterprise on Docker Swarm

For a cluster installed on Docker Swarm, Traefik Enterprise can be uninstalled using the following commands:

docker stack ls 
docker config ls
docker stack rm traefikee
docker config rm default-controller default-proxy
sleep 5
docker stack ls 
docker config ls

Then, remove Traefik Enterprise volumes on each node:

docker volume rm $(docker volume ls --filter="label=com.docker.stack.namespace=traefikee" -q)

considerable commented 2 years ago

These are the emails I received from Traefik Labs regarding the trial license.

I did not reply to any of the messages, but they might be a good starting point for discussing the technical details and the pricing:

From: Zeina Elmokadem zeina.elmokadem@traefik.io
Date: June 24, 2022 at 1:40:07 PM MDT
To: Hide My Email
Subject: Traefik // Ad Hoc

Hi Igor,

I wanted to bump this back to the top of your inbox and make sure we connect to take care of your experience with the trial.

Are you interested in discussing your networking/ routing project with Traefik Labs in order to help out?

Please take a look at my schedule and find a time that works to discuss.

Looking forward to hearing from you!

Best,

Zeina Elmokadem zeina.elmokadem@traefik.io traefik.io

June 21, 2022 at 00:48:08 -0700, Zeina Elmokadem zeina.elmokadem@traefik.io:

Hi Igor,

I hope all is well.

I am reaching out following your trial subscription to discuss and make sure you are getting the most out of Traefik trial. I'd love to know your feedback and hope that you found Traefik useful to your project.

Do you have a project and you are interested in improving your experience with Traefik Labs?

Please select the time frame that best fits your availability here.

Looking forward to hearing from you!

Zeina Elmokadem zeina.elmokadem@traefik.io traefik.io

jhouse-solvd commented 2 years ago

@considerable - Thanks for the detailed information. I am going to propose that the team take a look at this during the next sprint. After we've had a chance to discuss it as a team, we'll follow up with the next steps.

considerable commented 1 year ago

@jhouse-solvd, please take a look. This is one of the comments posted on the TraefikEE RFC. It raises the TRM requirement, which we have not discussed much:

@jchapin Neither Traefik nor TraefikEE show up in the current publicly accessible version of the VA TRM.

https://www.oit.va.gov/Services/TRM/TRMHomePage.aspx

The TRM presents a list of assessed technologies and standards used to develop, operate, and maintain enterprise applications. Entries on this list have undergone a strategic assessment based upon the nature of the technology. The TRM entry contains guidance, along with any known applicable constraints, on the permissible range of technologies or standards that a VA user, Office of Information and Technology (OIT) administration support team or Project Development Team may select or shall use. The TRM is not intended to direct procurements, although each entry contains available VA licensing information, if known. Requests for an assessment of a technology or standard can be submitted through the TRM tool and will be assessed by subject matter experts (SME's) of the TRM Management Group.

Would it be appropriate to submit a request for an assessment now or after the RFC's acceptance?

jhouse-solvd commented 1 year ago

Comment from Kyle during discussion w/ the team on 7/28/22: Pricing may be based on traffic volume. There may be information in Grafana that is useful to that exploration.

ewilson-adhoc commented 1 year ago

@jhouse-solvd my team built out a dashboard that breaks down traffic volume by controller -- not sure if this would be helpful https://grafana.vfs.va.gov/d/iPlfSzkVk/va-pst-iet?orgId=1&from=now-24h&to=now

jhouse-solvd commented 1 year ago

Moving this to the backlog for re-prioritization.