StatCan / aaw

Documentation for the Advanced Analytics Workspace Platform
https://statcan.github.io/aaw/

PETlab project in AAW #585

Open blairdrummond opened 3 years ago

blairdrummond commented 3 years ago

From @ben-santos

Hi all, we recently got approval to onboard the PETlab project in AAW... basically we will act as a node (domain node) hosting some public data. I was expecting to deploy this in a VM with some TCP/UDP ports exposed to the internet. We expect to receive deployment instructions like this:

apiVersion: v1
kind: Service
metadata:
  name: grid-domain
  labels:
    component: grid-domain
spec:
  selector:
    component: grid-domain
  ports:
  - port: 5000
    targetPort: 5000
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grid-domain
  # annotations:
  #   linkerd.io/inject: enabled
  labels:
    component: grid-domain
spec:
  replicas: 1
  selector:
    matchLabels:
      component: grid-domain
  template:
    metadata:
      labels:
        component: grid-domain
    spec:
      containers:
      - name: grid-domain
        imagePullPolicy: Always
        image: openmined/grid-domain:latest
        envFrom:
        - configMapRef:
            name: common-config
        - secretRef:
            name: common-secret
        ports:
        - containerPort: 5000
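
Note that the Service above sets no `type`, so it defaults to `ClusterIP` and is reachable only from inside the cluster. A minimal sketch of one way to expose port 5000 externally follows. It assumes the cluster supports `LoadBalancer` Services; the name `grid-domain-public` is hypothetical, and AAW may instead route external traffic through an ingress gateway, which this thread does not describe.

```yaml
# Hypothetical external Service for the grid-domain pods; not part of the
# manifests supplied by the UN / OpenMined.
apiVersion: v1
kind: Service
metadata:
  name: grid-domain-public   # assumed name, chosen for illustration
  labels:
    component: grid-domain
spec:
  type: LoadBalancer         # asks the cloud provider for an external IP
  selector:
    component: grid-domain   # same pod label as the original Service
  ports:
  - port: 5000
    targetPort: 5000
    protocol: TCP
```

Whether a public load balancer is acceptable at all is a policy decision for the platform team, so treat this only as an illustration of the gap between the manifest and the "ports exposed to the internet" requirement.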

@Ben Santos that image seems to have 5 critical CVEs at the moment:

➜  ~ trivy openmined/grid-domain
2021-07-08T15:33:04.136-0400    INFO    Need to update DB
2021-07-08T15:33:04.136-0400    INFO    Downloading DB...
22.32 MiB / 22.32 MiB [-------------------------------] 100.00% 36.77 MiB p/s 1s
2021-07-08T15:33:29.520-0400    INFO    Detected OS: debian
2021-07-08T15:33:29.520-0400    INFO    Detecting Debian vulnerabilities...
2021-07-08T15:33:29.544-0400    INFO    Number of PL dependency files: 1
2021-07-08T15:33:29.544-0400    INFO    Detecting poetry vulnerabilities...
openmined/grid-domain (debian 10.10)
====================================
Total: 683 (UNKNOWN: 0, LOW: 576, MEDIUM: 54, HIGH: 48, CRITICAL: 5)

One is a pretty recent glibc vulnerability, which will hopefully get patched soon. Also, the Docker image runs as root.

Project here https://github.com/OpenMined/PyGrid/blob/dev/apps/domain/Dockerfile
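
To illustrate the run-as-root concern, here is a rough sketch of a pod-template fragment that would make Kubernetes refuse to start the container as root. It is not from OpenMined. The UID `1000` is an assumption, and the image has not been checked to confirm it can start as a non-root user, so it may need changes upstream before this works.

```yaml
# Hypothetical hardening patch for the grid-domain Deployment (fragment only).
spec:
  template:
    spec:
      securityContext:
        runAsNonRoot: true     # kubelet rejects containers that would run as UID 0
        runAsUser: 1000        # assumed UID; the image may not support it as-is
      containers:
      - name: grid-domain
        image: openmined/grid-domain:latest
        securityContext:
          allowPrivilegeEscalation: false   # block setuid-style escalation
          capabilities:
            drop: ["ALL"]                   # drop all Linux capabilities
```

Pinning the image to a specific tag or digest rather than `latest` would also make scan results reproducible, since the scanned image and the deployed image would be guaranteed to match.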

They were also given a Postgres manifest:

apiVersion: v1
kind: Service
metadata:
  name: postgres
  labels:
    app: postgres
spec:
  ports:
  - port: 5432
    name: postgres
  clusterIP: None
  selector:
    component: postgres
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  # annotations:
  #   linkerd.io/inject: enabled
  labels:
    component: postgres
spec:
  serviceName: "postgres"
  replicas: 1
  selector:
    matchLabels:
      component: postgres
  template:
    metadata:
      labels:
        component: postgres
    spec:
      containers:
      - name: postgres
        imagePullPolicy: Always
        image: postgres:13.3
        ports: 
        - containerPort: 5432
        envFrom:
        - configMapRef:
            name: common-config
        - secretRef:
            name: common-secret
        env:
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
        volumeMounts:
        - mountPath: /var/lib/postgresql/data
          name: postgres-volume
  volumeClaimTemplates:
  - metadata:
      name: postgres-volume
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: standard
      resources:
        requests: 
          storage: 5Gi
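
Both manifests load their environment from `common-config` and `common-secret`, which are not shown in this thread. The sketch below shows what they might look like for the Postgres side only. All keys and values are assumptions for illustration; the one firm requirement is that the official `postgres` image needs `POSTGRES_PASSWORD` to be set (or an explicit host auth method) in order to initialize a new database.

```yaml
# Hypothetical contents; the real keys expected by grid-domain are not known here.
apiVersion: v1
kind: ConfigMap
metadata:
  name: common-config
data:
  POSTGRES_DB: grid        # assumed database name
  POSTGRES_USER: grid      # assumed database user
---
apiVersion: v1
kind: Secret
metadata:
  name: common-secret
type: Opaque
stringData:
  POSTGRES_PASSWORD: change-me   # placeholder; never commit a real password
```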

blairdrummond commented 3 years ago

CC @brendangadd, this is from a newish use case; I think @ben-santos wants to make this service accessible to governments/institutions overseas as a proof of concept.

ben-santos commented 3 years ago

Thanks @blairdrummond, here is an extract from the proposal describing one of the goals we want to accomplish:

"Each member of the task team who works at/with an NSO will secure a server which can be attached to a public network (can be a cloud machine). The UN Global Platform will also procure a machine to facilitate network services between NSO machines where relevant. We will then, on an ongoing basis, coordinate experiments on public data using a variety of privacy enhancing technology software stacks. It is our goal to use these 0-risk (public data, single-machine, separate from any secure networks) experiments to increase awareness and certainty around what current PET technologies are capable of in the context of NSO-relevant use cases. We want to go through the exercise of working with private data without needing to actually work with private data so that we can all learn more about the constraints of such systems relevant to statistical use cases NSOs care about."

The YAMLs above are the first attempts to deploy such a service. They are provided by the UN (network node) and OpenMined. As a domain node we will host some public data. This is a WIP; they are trying to test this with another NSO (I think ONS).

ben-santos commented 3 years ago

I communicated our concerns about the vulnerabilities on this image. I'll keep you posted.

ben-santos commented 3 years ago

@blairdrummond UN-OpenMined changed the images and the services. I was told that they have now split the services into 7 containers wrapped in a VM. They are working on a one-command deployment... I do not have the details yet. They are willing to remediate the issues.

Thanks Blair, I think neither of us knew about trivy.

They are pushing to start the deployment next week. I told them that your team has the final say on approving the images and instructions.

ben-santos commented 3 years ago

I requested invitations to the repo and Slack channel for @sylus, @brendangadd and @blairdrummond. OpenMined's IT expert is in Brisbane, so scheduling a meeting will be difficult. It has to be around 5pm (7am there) because there is another collaborator in the UK. They suggest as early as possible in the morning... @sylus @brendangadd @blairdrummond, please let me know what you think.