wiktorn / Overpass-API

Overpass API docker image
MIT License

Deploy to Kubernetes? #8

nathanwaters closed this issue 5 years ago

nathanwaters commented 5 years ago

Hi

I'm trying to deploy this to (Google) Kubernetes but it's not working. Any ideas on what I need to change in this YAML?

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: overpass
spec:
  template:
    metadata:
      labels:
        name: overpass
    spec:
      containers:
        - name: overpass
          image: wiktorn/overpass-api
          volumeMounts:
          - mountPath: /db
            name: overpass-volume
          readinessProbe:
            httpGet:
              path: /interpreter
              port: 80
          ports:
            - name: overpass
              containerPort: 80
          env:
          - name: OVERPASS_META
            value: "no"
          - name: OVERPASS_MODE
            value: "init"
          - name: OVERPASS_PLANET_URL
            value: "http://download.geofabrik.de/australia-oceania/australia-latest.osm.bz2"
          - name: OVERPASS_DIFF_URL
            value: "http://download.openstreetmap.fr/replication/oceania/australia/minute/"
          - name: OVERPASS_PLANET_SEQUENCE_ID
            value: "3325745"
          - name: OVERPASS_RULES_LOAD
            value: "10"
      volumes:
      - name: overpass-volume
        emptyDir: {}
wiktorn commented 5 years ago

Can you share the error that you get? I don't have any k8s installation at hand right now.

nathanwaters commented 5 years ago

Here are the current errors. I've bumped the resource limits up to 20 GB RAM and 8 CPUs, and also added --flush-size=1 to the init_osm3s.sh command.

I've been having issues with every Overpass API Docker variant I can find. They all build fine locally and on gcloud, but they all fail when deployed to Kubernetes :(

E  Reading XML file ...
E  bunzip2: I/O or other error, bailing out.  Possible reason follows.
E  bunzip2: Cannot allocate memory
E  Input file = (stdin), output file = (stdout)
E  Parse error at line 8215853:
E  unclosed token
E  Reading XML file .../app/bin/init_osm3s.sh: line 44: 10 Broken pipe  bunzip2 < $PLANET_FILE
E  11 Killed | $EXEC_DIR/bin/update_database --db-dir=$DB_DIR/ $META $COMPRESSION

E  Error: Format string '/app/bin/fetch_osc.sh auto %(ENV_OVERPASS_DIFF_URL)s /db/diffs' for 'program:fetch_diff.command' contains names ('ENV_OVERPASS_DIFF_URL') which cannot be expanded. Available names: ENV_GEOROUTE_PORT, ENV_GEOROUTE_PORT_80_TCP, ENV_GEOROUTE_PORT_80_TCP_ADDR, ENV_GEOROUTE_PORT_80_TCP_PORT, ENV_GEOROUTE_PORT_80_TCP_PROTO, ENV_GEOROUTE_SERVICE_HOST, ENV_GEOROUTE_SERVICE_PORT, ENV_HOME, ENV_HOSTNAME, ENV_KUBERNETES_PORT, ENV_KUBERNETES_PORT_443_TCP, ENV_KUBERNETES_PORT_443_TCP_ADDR, ENV_KUBERNETES_PORT_443_TCP_PORT, ENV_KUBERNETES_PORT_443_TCP_PROTO, ENV_KUBERNETES_SERVICE_HOST, ENV_KUBERNETES_SERVICE_PORT, ENV_KUBERNETES_SERVICE_PORT_HTTPS, ENV_NGINX_VERSION, ENV_NJS_VERSION, ENV_OVERPASS_RULES_LOAD, ENV_PATH, ENV_PWD, ENV_SHLVL, ENV_WIKTORN_SERVICE_PORT, ENV_WIKTORN_SERVICE_PORT_80_TCP, ENV_WIKTORN_SERVICE_PORT_80_TCP_ADDR, ENV_WIKTORN_SERVICE_PORT_80_TCP_PORT, ENV_WIKTORN_SERVICE_PORT_80_TCP_PROTO, ENV_WIKTORN_SERVICE_SERVICE_HOST, ENV_WIKTORN_SERVICE_SERVICE_PORT, group_name, here, host_node_name, process_num, program_name in section 'program:fetch_diff' (file: '/etc/supervisor/conf.d/supervisord.conf')
wiktorn commented 5 years ago

This is probably connected more to the size of the data you're trying to load than to the fact that you're running on K8s. Have you tried loading a small extract such as Monaco?
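
For reference, a minimal sketch of the env changes for a Monaco test; these are the same URLs that appear in the manifests later in this thread:

env:
- name: OVERPASS_MODE
  value: "init"
- name: OVERPASS_PLANET_URL
  value: "http://download.geofabrik.de/europe/monaco-latest.osm.bz2"
- name: OVERPASS_DIFF_URL
  value: "http://download.openstreetmap.fr/replication/europe/monaco/minute/"
# If OVERPASS_PLANET_SEQUENCE_ID is set, it presumably needs to be adjusted
# (or dropped) to match the replication state of the new extract.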

MichielDeMey commented 5 years ago

I managed to run the entire extract of Belgium on our Kubernetes cluster with only 2 GB of memory allocated to the pod (takes about one to two hours).

resources:
  limits:
    memory: "2Gi"

I also added a small startup script that adds the --flush-size flag to the init script automatically. This way there's no need for any Dockerfile customization.

kind: ConfigMap
apiVersion: v1
metadata:
  name: overpass-flush-patch
data:
  flush_patch.sh: |-
    #!/bin/sh

    # Add custom flush to reduce initial memory usage
    echo "Patching /app/bin/init_osm3s.sh with custom --flush-size"
    sed -i.bck '$s/$/ --flush-size=1/' /app/bin/init_osm3s.sh
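
(The sed address $ targets the last line of init_osm3s.sh, presumably the update_database invocation, and s/$/ --flush-size=1/ appends the flag to it; -i.bck edits the file in place while keeping a .bck backup of the original script.)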

Then mount it in your deployment file under /docker-entrypoint-initdb.d:

volumeMounts:
- name: flush-patch
  mountPath: /docker-entrypoint-initdb.d

And make sure it's executable (defaultMode: 0744):

volumes:
  - name: flush-patch
    configMap:
      name: overpass-flush-patch
      defaultMode: 0744
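
(Aside: 0744 is octal; when the API server echoes the manifest back it appears as the decimal 484, as in the pod dumps later in this thread.)
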
wiktorn commented 5 years ago

@MichielDeMey great news. I guess I'll pull --flush-size up as an environment variable.

I have experience deploying an extract of Poland on Google Cloud Run. Without --flush-size I was using 7.5 GB to initially load the database, but for running queries even 1 GB was enough, though some more expensive ones failed to run even with 2 GB of memory (which is the maximum for GCR).

wiktorn commented 5 years ago

Added OVERPASS_FLUSH_SIZE in 89ada3f
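
With that variable available, the ConfigMap patch above should no longer be necessary; a minimal sketch, assuming the image consumes the variable during init:

env:
- name: OVERPASS_FLUSH_SIZE
  value: "1"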

joerg-walter-de commented 4 years ago

I managed to replicate a working system from the manifests above. I then tried to replace the volume with a PersistentVolume and a claim. Now it seems that after the overpass pod has run its initialisation course and is restarted, the initialisation starts over, and this repeats forever. I am new to PersistentVolume objects; could the reason be a configuration mistake in that area?

wiktorn commented 4 years ago

I can't find any errors in the PersistentVolume objects you describe.

What you observe (container starting, initializing and then starting over) suggests that on each start the container gets a brand-new volume. Can you change the policy to not restart automatically and inspect the contents of the PV? For example, check whether the file /db/init_done is present.
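
A minimal sketch of such a one-shot debug pod; it has to be a standalone Pod rather than a Deployment, since Deployment pod templates only allow restartPolicy: Always (the claim name and namespace follow the manifests in this thread):

apiVersion: v1
kind: Pod
metadata:
  name: overpass-debug
  namespace: overpass
spec:
  restartPolicy: Never  # do not restart once the init run exits
  containers:
  - name: overpass
    image: wiktorn/overpass-api
    # env omitted for brevity; use the same OVERPASS_* variables as in the Deployment
    volumeMounts:
    - mountPath: /db
      name: db
  volumes:
  - name: db
    persistentVolumeClaim:
      claimName: db

Once it exits, the same PVC can be mounted from another pod to check whether /db/init_done exists.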

You may also consider baking the data into the image itself and updating it only by deploying a new version of the container. It hurts startup time (as K8s needs to download a big image), but it greatly reduces the number of moving parts and gives you full control over when to update. It also provides an easy scale-out strategy.
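
One hedged way to bake the data in, without any Dockerfile work, is to run the init once and snapshot the stopped container (the registry name and tag below are just examples):

# Run the init once; the container exits after printing "Overpass ready".
docker run --name overpass_init \
  -e OVERPASS_MODE=init \
  -e OVERPASS_PLANET_URL=http://download.geofabrik.de/europe/monaco-latest.osm.bz2 \
  -e OVERPASS_DIFF_URL=http://download.openstreetmap.fr/replication/europe/monaco/minute/ \
  wiktorn/overpass-api
# Snapshot the stopped container, initialized /db included, into a new image.
# Caveat: docker commit does not capture paths declared as VOLUME in the image,
# so this assumes /db is a plain directory in the container filesystem.
docker commit overpass_init registry.example.com/overpass-monaco:2020-10
docker push registry.example.com/overpass-monaco:2020-10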

joerg-walter-de commented 4 years ago

I have:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: overpass-db
spec:
  storageClassName: 'overpass-storageclass'
  capacity:
    storage: 200Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: '/overpass_db/europa/'
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overpass
  namespace: overpass
spec:
  replicas: 1
  selector:
    matchLabels:
      app: overpass
  template:
    metadata:
      labels:
        app: overpass
    spec:
      containers:
      - name: overpass
        image: wiktorn/overpass-api
        resources:
          limits:
            memory: 500Mi
            cpu: 100m
        volumeMounts:
        # - mountPath: /db
        #   name: overpass-volume
        - mountPath: /overpass_db/europa/
          name: db
        - name: flush-patch
          mountPath: /docker-entrypoint-initdb.d
        readinessProbe:
          httpGet:
            path: /api/status
            port: 80
        ports:
          - name: overpass
            containerPort: 80
        env:
        - name: OVERPASS_META
          value: "no"
        - name: OVERPASS_MODE
          value: "init"
        - name: OVERPASS_PLANET_URL
          value: "http://download.geofabrik.de/europe/monaco-latest.osm.bz2"
        - name: OVERPASS_DIFF_URL
          value: "http://download.openstreetmap.fr/replication/europe/monaco/minute/"
        - name: OVERPASS_PLANET_SEQUENCE_ID
          value: "3325745"
        - name: OVERPASS_RULES_LOAD
          value: "10"
      volumes:
      # - name: overpass-volume
      #   emptyDir: {}
      - name: db
        persistentVolumeClaim:
          claimName: db
      - name: flush-patch
        configMap:
          name: overpass-flush-patch
          defaultMode: 0744
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: overpass
  name: db
spec:
  storageClassName: 'overpass-storageclass'
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi

---
kind: ConfigMap
apiVersion: v1
metadata:
  name: overpass-flush-patch
  namespace: overpass
data:
  flush_patch.sh: |-
    #!/bin/sh

    # Add custom flush to reduce initial memory usage
    echo "Patching /app/bin/init_osm3s.sh with custom --flush-size"
    sed -i.bck '$s/$/ --flush-size=1/' /app/bin/init_osm3s.sh
---
apiVersion: v1
kind: Service
metadata:
  name: overpass
  namespace: overpass
spec:
  type: ClusterIP
  ports:
  - port: 80
  selector:
    app: overpass

The volume doesn't seem to be recreated.

Can you change the policy to not restart automatically and inspect contents of the PV? Like if there is the file /db/init_done present?

Will do!

joerg-walter-de commented 4 years ago

I ran into this:

https://github.com/kubernetes/kubernetes/issues/24725

joerg-walter-de commented 4 years ago

I get [screenshot: console output ending with "Overpass ready, you can start your container with docker start"], and on the machine [screenshot: a listing of /db with planet.osm.bz2 still present and no init_done].

wiktorn commented 4 years ago

Can you tell me which version of the overpass image you're running? It looks really strange, as there is the following snippet:

                && touch /db/init_done \
                && rm /db/planet.osm.bz2 \
                && chown -R overpass:overpass /db \
                && echo "Overpass ready, you can start your container with docker start" \
                && exit

And you have "Overpass ready, you can start your container with docker start" on the console, which means that touch /db/init_done executed successfully. On the other hand, the screenshot of the directory contents shows this file missing from /db. Also, /db/planet.osm.bz2 is still there, though according to the above it should have been removed.

The /db/init_done sentinel file was added ~1 year ago.

What kind of storage are you using for PersistentVolumes? And is it possible that it loses "last minute" changes made before the container shuts down?

joerg-walter-de commented 4 years ago

Mind that the content of the dirs shown is from after the unfortunate automatic restart of the container. So if the file is deleted at restart for some reason, it might well have existed before the restart.

joerg-walter-de commented 4 years ago

For the persistent storage, see the manifests above.

joerg-walter-de commented 4 years ago

This is working:

apiVersion: v1
kind: Namespace
metadata:
  name: overpass
---
apiVersion: v1
data:
  flush_patch.sh: |-
    #!/bin/sh

    # Add custom flush to reduce initial memory usage
    echo "Patching /app/bin/init_osm3s.sh with custom --flush-size"
    sed -i.bck '$s/$/ --flush-size=1/' /app/bin/init_osm3s.sh
kind: ConfigMap
metadata:
  name: overpass-flush-patch
  namespace: overpass
---
apiVersion: v1
kind: Service
metadata:
  name: overpass
  namespace: overpass
spec:
  ports:
  - port: 80
  selector:
    app: overpass
  type: ClusterIP
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: overpass
  name: db
  namespace: overpass
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi
  storageClassName: overpass-storageclass
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overpass
  namespace: overpass
spec:
  replicas: 1
  selector:
    matchLabels:
      app: overpass
  template:
    metadata:
      labels:
        app: overpass
    spec:
      containers:
      - env:
        - name: OVERPASS_META
          value: "no"
        - name: OVERPASS_MODE
          value: init
        - name: OVERPASS_PLANET_URL
          value: http://download.geofabrik.de/europe/monaco-latest.osm.bz2
        - name: OVERPASS_DIFF_URL
          value: http://download.openstreetmap.fr/replication/europe/monaco/minute/
        - name: OVERPASS_PLANET_SEQUENCE_ID
          value: "3325745"
        - name: OVERPASS_RULES_LOAD
          value: "10"
        image: wiktorn/overpass-api
        name: overpass
        ports:
        - containerPort: 80
          name: overpass
        readinessProbe:
          httpGet:
            path: /api/status
            port: 80
        resources:
          limits:
            cpu: 50m
            memory: 500Mi
        volumeMounts:
        - mountPath: /db
          name: overpass-volume
        - mountPath: /docker-entrypoint-initdb.d
          name: flush-patch
      volumes:
      - emptyDir: {}
        name: overpass-volume
      - configMap:
          defaultMode: 484
          name: overpass-flush-patch
        name: flush-patch

This is not:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: overpass-db
spec:
  storageClassName: 'overpass-storageclass'
  capacity:
    storage: 200Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: '/overpass_db/europa/'
---
apiVersion: v1
kind: Namespace
metadata:
  name: overpass
---
apiVersion: v1
data:
  flush_patch.sh: |-
    #!/bin/sh

    # Add custom flush to reduce initial memory usage
    echo "Patching /app/bin/init_osm3s.sh with custom --flush-size"
    sed -i.bck '$s/$/ --flush-size=1/' /app/bin/init_osm3s.sh
kind: ConfigMap
metadata:
  name: overpass-flush-patch
  namespace: overpass
---
apiVersion: v1
kind: Service
metadata:
  name: overpass
  namespace: overpass
spec:
  ports:
  - port: 80
  selector:
    app: overpass
  type: ClusterIP
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: overpass
  name: db
  namespace: overpass
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi
  storageClassName: overpass-storageclass
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overpass
  namespace: overpass
spec:
  replicas: 1
  selector:
    matchLabels:
      app: overpass
  template:
    metadata:
      labels:
        app: overpass
    spec:
      containers:
      - env:
        - name: OVERPASS_META
          value: "no"
        - name: OVERPASS_MODE
          value: init
        - name: OVERPASS_PLANET_URL
          value: http://download.geofabrik.de/europe/monaco-latest.osm.bz2
        - name: OVERPASS_DIFF_URL
          value: http://download.openstreetmap.fr/replication/europe/monaco/minute/
        - name: OVERPASS_PLANET_SEQUENCE_ID
          value: "3325745"
        - name: OVERPASS_RULES_LOAD
          value: "10"
        image: wiktorn/overpass-api
        name: overpass
        ports:
        - containerPort: 80
          name: overpass
        readinessProbe:
          httpGet:
            path: /api/status
            port: 80
        resources:
          limits:
            cpu: 100m
            memory: 500Mi
        volumeMounts:
        - mountPath: /overpass_db/europa/
          name: db
        - mountPath: /docker-entrypoint-initdb.d
          name: flush-patch
      restartPolicy: Always
      volumes:
      - name: db
        persistentVolumeClaim:
          claimName: db
      - configMap:
          defaultMode: 484
          name: overpass-flush-patch
        name: flush-patch

As for the image version, here is the pod:

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2020-10-22T08:29:40Z"
  generateName: overpass-7c97557db5-
  labels:
    app: overpass
    pod-template-hash: 7c97557db5
  name: overpass-7c97557db5-vv6n2
  namespace: overpass
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: overpass-7c97557db5
    uid: 7202ccec-bced-479c-96e0-3a0867b145b5
  resourceVersion: "494676"
  selfLink: /api/v1/namespaces/overpass/pods/overpass-7c97557db5-vv6n2
  uid: d9afec5e-da2f-4a8b-80d5-d611e7d5ea87
spec:
  containers:
  - env:
    - name: OVERPASS_META
      value: "no"
    - name: OVERPASS_MODE
      value: init
    - name: OVERPASS_PLANET_URL
      value: http://download.geofabrik.de/europe/monaco-latest.osm.bz2
    - name: OVERPASS_DIFF_URL
      value: http://download.openstreetmap.fr/replication/europe/monaco/minute/
    - name: OVERPASS_PLANET_SEQUENCE_ID
      value: "3325745"
    - name: OVERPASS_RULES_LOAD
      value: "10"
    image: wiktorn/overpass-api
    imagePullPolicy: Always
    name: overpass
    ports:
    - containerPort: 80
      name: overpass
      protocol: TCP
    readinessProbe:
      failureThreshold: 3
      httpGet:
        path: /api/status
        port: 80
        scheme: HTTP
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    resources:
      limits:
        cpu: 100m
        memory: 500Mi
      requests:
        cpu: 100m
        memory: 500Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /overpass_db/europa/
      name: db
    - mountPath: /docker-entrypoint-initdb.d
      name: flush-patch
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-ntjbm
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: aks-nodepool1-16538452-vmss000000
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: db
    persistentVolumeClaim:
      claimName: db
  - configMap:
      defaultMode: 484
      name: overpass-flush-patch
    name: flush-patch
  - name: default-token-ntjbm
    secret:
      defaultMode: 420
      secretName: default-token-ntjbm
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2020-10-22T08:29:40Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2020-10-22T08:29:40Z"
    message: 'containers with unready status: [overpass]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2020-10-22T08:29:40Z"
    message: 'containers with unready status: [overpass]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2020-10-22T08:29:40Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://94a8b88ff0f14de1e3a7ce330212ff7f80e2d5471a5ef55491199cc357e75b51
    image: wiktorn/overpass-api:latest
    imageID: docker-pullable://wiktorn/overpass-api@sha256:9f6d12d18ddde892a8d8116223c076bd5503b113450732d5aeb2d4e8894c89f3
    lastState:
      terminated:
        containerID: docker://21a9e544440e6e29b814097a6f07b844cd931144e0a150d726f7d276abae0a36
        exitCode: 0
        finishedAt: "2020-10-22T09:10:29Z"
        reason: Completed
        startedAt: "2020-10-22T09:00:24Z"
    name: overpass
    ready: false
    restartCount: 4
    started: true
    state:
      running:
        startedAt: "2020-10-22T09:10:31Z"
  hostIP: 192.168.101.4
  phase: Running
  podIP: 10.233.1.27
  podIPs:
  - ip: 10.233.1.27
  qosClass: Guaranteed
  startTime: "2020-10-22T08:29:40Z"
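
(Note the containerStatuses above: the container completes with exit code 0 and is immediately restarted under restartPolicy: Always; unless /db/init_done survives on the volume the container actually reads, every such restart re-runs the whole init.)
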
joerg-walter-de commented 4 years ago

Current state of affairs:

I have a persistent volume:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: overpass-db
spec:
  storageClassName: 'overpass-storageclass'
  capacity:
    storage: 200Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: '/db'

A claim

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: overpass
  name: overpass-volume
spec:
  storageClassName: 'overpass-storageclass'
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi

And a deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: overpass
  namespace: overpass
spec:
  replicas: 1
  selector:
    matchLabels:
      app: overpass
  template:
    metadata:
      labels:
        app: overpass
    spec:
      restartPolicy: Always
      containers:
      - name: overpass
        image: wiktorn/overpass-api
        resources:
          limits:
            memory: 500Mi
            cpu: 100m
        volumeMounts:
        - mountPath: /db
          name: overpass-volume
        - name: flush-patch
          mountPath: /docker-entrypoint-initdb.d
        readinessProbe:
          httpGet:
            path: /api/status
            port: 80
        ports:
          - name: overpass
            containerPort: 80
        env:
        - name: OVERPASS_META
          value: "yes"
        - name: OVERPASS_MODE
          value: "init"
        - name: OVERPASS_PLANET_URL
          value: "http://download.geofabrik.de/europe/monaco-latest.osm.bz2"
        - name: OVERPASS_DIFF_URL
          value: "http://download.openstreetmap.fr/replication/europe/monaco/minute/"
        - name: OVERPASS_RULES_LOAD
          value: "10"
      volumes:
      - name: overpass-volume
        persistentVolumeClaim:
          claimName: overpass-volume
      - name: flush-patch
        configMap:
          name: overpass-flush-patch
          defaultMode: 0744

This is working insofar as the pod is running correctly after one restart and the API works (note that the claim is now mounted at /db, which is where the database actually lives).

Note: I am using the Flux mechanism.

But when I delete the pod with

kubectl delete pod -n overpass podname

another pod is created, and the download and initialisation process starts again instead of using the existing data on the volume.

defyjoy commented 3 years ago

and another pod is created and the download and initialisation process starts again instead of using the existing data on the volume.

Good point. Is there any solution to this? The use of a persistent volume was meant to avoid exactly this kind of issue to begin with: the pod should not re-download the entire database but instead use the existing volume and database files.