StatCan / aaw

Documentation for the Advanced Analytics Workspace Platform
https://statcan.github.io/aaw/

Create prot-b trino instance #1339

Closed rohank07 closed 2 years ago

rohank07 commented 2 years ago

To ensure a prot-b data source can only be accessed by a prot-b notebook, a new Trino instance needs to be created with separate coordinators and workers in a "trino-pro-b" namespace. That namespace must be network-isolated so that it can communicate only with prot-b notebooks/pods.

http://service-name.namespace-name.svc.cluster.local:8080/
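The URL above follows the usual Kubernetes in-cluster DNS pattern for a Service. As a minimal sketch, the helper below builds such a URL from a service and namespace name. The function name `service_url` is hypothetical, and the `cluster.local` cluster domain and port 8080 defaults are taken from the pattern above rather than from any confirmed configuration.

```python
def service_url(service: str, namespace: str, port: int = 8080,
                cluster_domain: str = "cluster.local") -> str:
    """Build the in-cluster URL of a Kubernetes Service.

    Follows the pattern <service>.<namespace>.svc.<cluster-domain>:<port>.
    """
    return f"http://{service}.{namespace}.svc.{cluster_domain}:{port}/"
```

For example, `service_url("trino", "trino-protb-system")` would give the address of a Service named `trino` in the prot-b namespace, assuming the Service is actually named that way.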

rohank07 commented 2 years ago

Created the trino-protb-system namespace via Terraform. Encountering issues when trying to curl the FQDN/internal IP address of the Trino service:

curl -v -H "x-forwarded-proto: https" http://trino.trino-system.svc:8080

Curling from a notebook returns a 503 Service Unavailable.

Turns out, it was a missing egress network policy from the notebook to the trino-system namespace.
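For checking connectivity from inside a notebook without a shell, the curl command above can be approximated with the Python standard library. This is only a sketch: the function names `build_request` and `check_trino` are hypothetical, and the URL and header are the ones from the curl command in this thread.

```python
import urllib.error
import urllib.request

# URL taken from the curl command in this thread.
TRINO_URL = "http://trino.trino-system.svc:8080"


def build_request(url: str = TRINO_URL) -> urllib.request.Request:
    """Build the same GET request as the curl command, including the header."""
    return urllib.request.Request(url, headers={"x-forwarded-proto": "https"})


def check_trino(url: str = TRINO_URL, timeout: float = 5.0) -> str:
    """Send the request and return a short description of the result."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
            return f"reachable: HTTP {resp.status}"
    except urllib.error.HTTPError as err:
        # An HTTP-level failure, such as the 503 reported above.
        return f"HTTP error: {err.code} {err.reason}"
    except (urllib.error.URLError, OSError) as err:
        # DNS failure, refused connection, or timeout.
        return f"unreachable: {err}"
```

Calling `print(check_trino())` from a notebook cell before and after applying the egress policy gives a quick way to confirm whether the policy was the cause.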

rohank07 commented 2 years ago

Added namespace labels:

trino-namespace: protected-b (prot-b instance)
trino-namespace: unclassified (unclassified instance)

Network policies to add:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-protb-notebook-to-trino-protb-system
  namespace: user-namespace
spec:
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          trino-namespace: protected-b
      podSelector:
        matchLabels:
          component: coordinator
  podSelector:
    matchLabels:
      data.statcan.gc.ca/classification: protected-b
  policyTypes:
  - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-trino-protb-system-from-protb-notebook
  namespace: trino-protb-system
spec:
  ingress:
    - from:
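        # An empty namespaceSelector matches all namespaces; a bare podSelector
        # would only match pods inside trino-protb-system itself.
        - namespaceSelector: {}
          podSelector: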
            matchLabels:
              data.statcan.gc.ca/classification: protected-b
  podSelector:
    matchLabels:
      release: trino-protb
  policyTypes:
  - Ingress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-trino-system-from-notebook
  namespace: trino-system
spec:
  ingress:
    - from:
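      # An empty namespaceSelector matches all namespaces; a bare podSelector
      # would only match pods inside trino-system itself.
      - namespaceSelector: {}
        podSelector: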
          matchExpressions:
            - key: notebook-name
              operator: Exists
  podSelector:
    matchLabels:
      release: trino
  policyTypes:
  - Ingress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-notebook-to-trino-system
  namespace: user-namespace
spec:
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          trino-namespace: unclassified
  podSelector:
    matchExpressions:
      - key: notebook-name
        operator: Exists
  policyTypes:
  - Egress

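The policies above rely on two kinds of pod selectors: `matchLabels` on the classification label and a `matchExpressions` entry with the `Exists` operator. As a rough illustration of how those selectors decide which notebook pods a policy applies to, here is a small sketch of label-selector matching. The function names and the sample pod labels are hypothetical; only the two selectors are copied from the policies.

```python
from typing import Dict, List

Labels = Dict[str, str]


def expression_matches(expr: dict, labels: Labels) -> bool:
    """Evaluate one matchExpressions entry against a pod's labels."""
    key = expr["key"]
    operator = expr["operator"]
    values: List[str] = expr.get("values", [])
    if operator == "Exists":
        return key in labels
    if operator == "DoesNotExist":
        return key not in labels
    if operator == "In":
        return key in labels and labels[key] in values
    if operator == "NotIn":
        return key not in labels or labels[key] not in values
    raise ValueError(f"unsupported operator: {operator}")


def selector_matches(selector: dict, labels: Labels) -> bool:
    """Return True when every matchLabels and matchExpressions requirement holds."""
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    return all(expression_matches(expr, labels)
               for expr in selector.get("matchExpressions", []))


# Pod selectors copied from the policies in this thread.
PROTB_NOTEBOOKS = {"matchLabels": {"data.statcan.gc.ca/classification": "protected-b"}}
ALL_NOTEBOOKS = {"matchExpressions": [{"key": "notebook-name", "operator": "Exists"}]}

# Hypothetical pod labels, for illustration only.
protb_pod = {"notebook-name": "nb-protb",
             "data.statcan.gc.ca/classification": "protected-b"}
unclassified_pod = {"notebook-name": "nb-unclassified"}
```

With these assumed labels, `selector_matches(PROTB_NOTEBOOKS, protb_pod)` is true and `selector_matches(PROTB_NOTEBOOKS, unclassified_pod)` is false, so only the prot-b notebook is selected by the prot-b egress policy. The `notebook-name` `Exists` selector matches both pods, so if prot-b notebooks also carry a `notebook-name` label, both egress policies would apply to them.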
rohank07 commented 2 years ago

Refactored controllers: https://github.com/StatCan/aaw-kubeflow-profiles-controller/pull/71

Added notebook netpols: https://github.com/StatCan/aaw-kubeflow-profiles-controller/pull/72

rohank07 commented 2 years ago

Since the Trino REST API requires HTTPS, switched to exposing the protb instance using a virtual service. However, network policies are not able to deny traffic from the gateway. Decided to temporarily remove the ingress gateway and the automated creation of protb schemas.