nathanwaters closed this issue 5 years ago.
Can you share the error that you get? I don't have any k8s installation at hand right now.
Here are the current errors. I've bumped the resource limits up to 20GB RAM and 8 CPUs, and also added --flush-size=1 to the init_osm3s.sh command.
I've been having issues with every Overpass API Docker variant I can find. They all build fine locally and build fine on gcloud, but deploying to Kubernetes they all fail :(
E Reading XML file ...
E bunzip2: I/O or other error, bailing out. Possible reason follows.
E bunzip2: Cannot allocate memory
E Input file = (stdin), output file = (stdout)
E Parse error at line 8215853:
E unclosed token
E Reading XML file .../app/bin/init_osm3s.sh: line 44: 10 Broken pipe bunzip2 < $PLANET_FILE
E 11 Killed | $EXEC_DIR/bin/update_database --db-dir=$DB_DIR/ $META $COMPRESSION
E Error: Format string '/app/bin/fetch_osc.sh auto %(ENV_OVERPASS_DIFF_URL)s /db/diffs' for 'program:fetch_diff.command' contains names ('ENV_OVERPASS_DIFF_URL') which cannot be expanded. Available names: ENV_GEOROUTE_PORT, ENV_GEOROUTE_PORT_80_TCP, ENV_GEOROUTE_PORT_80_TCP_ADDR, ENV_GEOROUTE_PORT_80_TCP_PORT, ENV_GEOROUTE_PORT_80_TCP_PROTO, ENV_GEOROUTE_SERVICE_HOST, ENV_GEOROUTE_SERVICE_PORT, ENV_HOME, ENV_HOSTNAME, ENV_KUBERNETES_PORT, ENV_KUBERNETES_PORT_443_TCP, ENV_KUBERNETES_PORT_443_TCP_ADDR, ENV_KUBERNETES_PORT_443_TCP_PORT, ENV_KUBERNETES_PORT_443_TCP_PROTO, ENV_KUBERNETES_SERVICE_HOST, ENV_KUBERNETES_SERVICE_PORT, ENV_KUBERNETES_SERVICE_PORT_HTTPS, ENV_NGINX_VERSION, ENV_NJS_VERSION, ENV_OVERPASS_RULES_LOAD, ENV_PATH, ENV_PWD, ENV_SHLVL, ENV_WIKTORN_SERVICE_PORT, ENV_WIKTORN_SERVICE_PORT_80_TCP, ENV_WIKTORN_SERVICE_PORT_80_TCP_ADDR, ENV_WIKTORN_SERVICE_PORT_80_TCP_PORT, ENV_WIKTORN_SERVICE_PORT_80_TCP_PROTO, ENV_WIKTORN_SERVICE_SERVICE_HOST, ENV_WIKTORN_SERVICE_SERVICE_PORT, group_name, here, host_node_name, process_num, program_name in section 'program:fetch_diff' (file: '/etc/supervisor/conf.d/supervisord.conf')
Probably this is connected more to size of the data that you're trying to load and less - to the fact that you're running on K8S. Have you tried loading small extracts such as Monaco?
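For a quick sanity check outside of K8S, something along these lines should exercise the same init path with the Monaco extract (the host port and the named volume are just examples, not anything the image requires):
docker run --rm \
    -p 8080:80 \
    -v overpass_db:/db \
    -e OVERPASS_MODE=init \
    -e OVERPASS_META=no \
    -e OVERPASS_PLANET_URL=http://download.geofabrik.de/europe/monaco-latest.osm.bz2 \
    -e OVERPASS_DIFF_URL=http://download.openstreetmap.fr/replication/europe/monaco/minute/ \
    wiktorn/overpass-api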
I managed to run the entire extract of Belgium on our Kubernetes cluster with only 2GB of memory allocated to the pod (it takes about one to two hours).
resources:
  limits:
    memory: "2Gi"
I also added a small startup script that adds the --flush-size flag to the init script automatically, so there's no need for any Dockerfile customization.
kind: ConfigMap
apiVersion: v1
metadata:
  name: overpass-flush-patch
data:
  flush_patch.sh: |-
    #!/bin/sh
    # Add custom flush to reduce initial memory usage
    echo "Patching /app/bin/init_osm3s.sh with custom --flush-size"
    sed -i.bck '$s/$/ --flush-size=1/' /app/bin/init_osm3s.sh
Then mount it in your deployment file under /docker-entrypoint-initdb.d:
volumeMounts:
  - name: flush-patch
    mountPath: /docker-entrypoint-initdb.d
And make sure it's executable (defaultMode: 0744):
volumes:
  - name: flush-patch
    configMap:
      name: overpass-flush-patch
      defaultMode: 0744
@MichielDeMey great news. I guess I will pull --flush-size up as an environment variable.
I have experience deploying the extract of Poland on Google Cloud Run. Without --flush-size I was using 7.5GB to initially load the database, but for running queries even 1GB was enough, though some more expensive ones failed to run even with 2GB of memory (which is the maximum for GCR).
Added OVERPASS_FLUSH_SIZE in 89ada3f
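With that change it should be enough to set the variable in the pod spec; a minimal sketch mirroring the --flush-size=1 patch from above:
env:
  - name: OVERPASS_FLUSH_SIZE
    value: "1"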
I managed to replicate a working system from the manifests above. I then tried to replace the volume with a PersistentVolume and a claim. Now it seems that after the overpass pod has run its initialisation and is restarted, the initialisation starts over, and this seems to go on forever. I am new to PersistentVolume objects. Could the reason be a configuration mistake in that area?
I can't find any errors in what you describe about the PersistentVolume objects.
What you observe (the container starting, initializing and then starting over) suggests that on each start the container gets a brand new volume. Can you change the policy to not restart automatically and inspect the contents of the PV? For example, is the file /db/init_done present?
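For example, something like this (pod name and namespace are just placeholders) would show whether the sentinel and the database files survive a restart:
kubectl get pods -n <namespace>
kubectl exec -n <namespace> <overpass-pod-name> -- ls -la /db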
You may also consider baking the data into the image itself and updating it only by deploying a new version of the container. It hurts startup time (as K8S needs to download a big image), but it greatly reduces the number of moving parts and gives you full control over when to update. It also provides an easy scale-out strategy.
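A rough sketch of that approach (the local overpass_db/ directory and image tag are only examples, not something this repo ships):
# Start from the stock image and ship an already-initialized database in it
FROM wiktorn/overpass-api
# overpass_db/ is assumed to hold the /db contents from a finished init run
COPY overpass_db/ /db/
RUN chown -R overpass:overpass /db
You would then reference the resulting image from the Deployment instead of the stock wiktorn/overpass-api and roll data updates by building and deploying a new tag.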
I have:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: overpass-db
spec:
  storageClassName: 'overpass-storageclass'
  capacity:
    storage: 200Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: '/overpass_db/europa/'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overpass
  namespace: overpass
spec:
  replicas: 1
  selector:
    matchLabels:
      app: overpass
  template:
    metadata:
      labels:
        app: overpass
    spec:
      containers:
        - name: overpass
          image: wiktorn/overpass-api
          resources:
            limits:
              memory: 500Mi
              cpu: 100m
          volumeMounts:
            # - mountPath: /db
            #   name: overpass-volume
            - mountPath: /overpass_db/europa/
              name: db
            - name: flush-patch
              mountPath: /docker-entrypoint-initdb.d
          readinessProbe:
            httpGet:
              path: /api/status
              port: 80
          ports:
            - name: overpass
              containerPort: 80
          env:
            - name: OVERPASS_META
              value: "no"
            - name: OVERPASS_MODE
              value: "init"
            - name: OVERPASS_PLANET_URL
              value: "http://download.geofabrik.de/europe/monaco-latest.osm.bz2"
            - name: OVERPASS_DIFF_URL
              value: "http://download.openstreetmap.fr/replication/europe/monaco/minute/"
            - name: OVERPASS_PLANET_SEQUENCE_ID
              value: "3325745"
            - name: OVERPASS_RULES_LOAD
              value: "10"
      volumes:
        # - name: overpass-volume
        #   emptyDir: {}
        - name: db
          persistentVolumeClaim:
            claimName: db
        - name: flush-patch
          configMap:
            name: overpass-flush-patch
            defaultMode: 0744
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: overpass
  name: db
spec:
  storageClassName: 'overpass-storageclass'
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi
kind: ConfigMap
apiVersion: v1
metadata:
  name: overpass-flush-patch
  namespace: overpass
data:
  flush_patch.sh: |-
    #!/bin/sh
    # Add custom flush to reduce initial memory usage
    echo "Patching /app/bin/init_osm3s.sh with custom --flush-size"
    sed -i.bck '$s/$/ --flush-size=1/' /app/bin/init_osm3s.sh
apiVersion: v1
kind: Service
metadata:
  name: overpass
  namespace: overpass
spec:
  type: ClusterIP
  ports:
    - port: 80
  selector:
    app: overpass
The volume doesn't seem to be recreated.
Can you change the policy to not restart automatically and inspect contents of the PV? Like if there is the file /db/init_done present?
Will do!
I run into this: I get [screenshot of the container console output], and on the machine [screenshot of the directory contents].
Can you tell me which version of the overpass image you're running? It looks really strange, as there is the following snippet:
&& touch /db/init_done \
&& rm /db/planet.osm.bz2 \
&& chown -R overpass:overpass /db \
&& echo "Overpass ready, you can start your container with docker start" \
&& exit
And you have Overpass ready, you can start your container with docker start on the console, which means that touch /db/init_done executed successfully. On the other hand, in the contents you show, this file is missing from the /db directory. Also, /db/planet.osm.bz2 is still there, although according to the above it should have been removed.
The /db/init_done sentinel file was added ~1 year ago.
What kind of storage are you using for the Persistent Volumes? And is it possible that it loses "last minute" changes before the container shuts down?
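For example (using the names from your manifests above), the binding and the backing store can be checked with:
kubectl get pv overpass-db -o yaml
kubectl -n overpass get pvc db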
Mind that the directory contents shown are from after the unfortunate automatic restart of the container, so if something is deleted at restart for some reason, it might still have existed before the restart.
For the persistent storage, see the manifests above.
This is working:
apiVersion: v1
kind: Namespace
metadata:
  name: overpass
---
apiVersion: v1
data:
  flush_patch.sh: |-
    #!/bin/sh
    # Add custom flush to reduce initial memory usage
    echo "Patching /app/bin/init_osm3s.sh with custom --flush-size"
    sed -i.bck '$s/$/ --flush-size=1/' /app/bin/init_osm3s.sh
kind: ConfigMap
metadata:
  name: overpass-flush-patch
  namespace: overpass
---
apiVersion: v1
kind: Service
metadata:
  name: overpass
  namespace: overpass
spec:
  ports:
    - port: 80
  selector:
    app: overpass
  type: ClusterIP
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: overpass
  name: db
  namespace: overpass
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi
  storageClassName: overpass-storageclass
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overpass
  namespace: overpass
spec:
  replicas: 1
  selector:
    matchLabels:
      app: overpass
  template:
    metadata:
      labels:
        app: overpass
    spec:
      containers:
        - env:
            - name: OVERPASS_META
              value: "no"
            - name: OVERPASS_MODE
              value: init
            - name: OVERPASS_PLANET_URL
              value: http://download.geofabrik.de/europe/monaco-latest.osm.bz2
            - name: OVERPASS_DIFF_URL
              value: http://download.openstreetmap.fr/replication/europe/monaco/minute/
            - name: OVERPASS_PLANET_SEQUENCE_ID
              value: "3325745"
            - name: OVERPASS_RULES_LOAD
              value: "10"
          image: wiktorn/overpass-api
          name: overpass
          ports:
            - containerPort: 80
              name: overpass
          readinessProbe:
            httpGet:
              path: /api/status
              port: 80
          resources:
            limits:
              cpu: 50m
              memory: 500Mi
          volumeMounts:
            - mountPath: /db
              name: overpass-volume
            - mountPath: /docker-entrypoint-initdb.d
              name: flush-patch
      volumes:
        - emptyDir: {}
          name: overpass-volume
        - configMap:
            defaultMode: 484
            name: overpass-flush-patch
          name: flush-patch
This is not:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: overpass-db
spec:
  storageClassName: 'overpass-storageclass'
  capacity:
    storage: 200Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: '/overpass_db/europa/'
---
apiVersion: v1
kind: Namespace
metadata:
  name: overpass
---
apiVersion: v1
data:
  flush_patch.sh: |-
    #!/bin/sh
    # Add custom flush to reduce initial memory usage
    echo "Patching /app/bin/init_osm3s.sh with custom --flush-size"
    sed -i.bck '$s/$/ --flush-size=1/' /app/bin/init_osm3s.sh
kind: ConfigMap
metadata:
  name: overpass-flush-patch
  namespace: overpass
---
apiVersion: v1
kind: Service
metadata:
  name: overpass
  namespace: overpass
spec:
  ports:
    - port: 80
  selector:
    app: overpass
  type: ClusterIP
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: overpass
  name: db
  namespace: overpass
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi
  storageClassName: overpass-storageclass
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overpass
  namespace: overpass
spec:
  replicas: 1
  selector:
    matchLabels:
      app: overpass
  template:
    metadata:
      labels:
        app: overpass
    spec:
      containers:
        - env:
            - name: OVERPASS_META
              value: "no"
            - name: OVERPASS_MODE
              value: init
            - name: OVERPASS_PLANET_URL
              value: http://download.geofabrik.de/europe/monaco-latest.osm.bz2
            - name: OVERPASS_DIFF_URL
              value: http://download.openstreetmap.fr/replication/europe/monaco/minute/
            - name: OVERPASS_PLANET_SEQUENCE_ID
              value: "3325745"
            - name: OVERPASS_RULES_LOAD
              value: "10"
          image: wiktorn/overpass-api
          name: overpass
          ports:
            - containerPort: 80
              name: overpass
          readinessProbe:
            httpGet:
              path: /api/status
              port: 80
          resources:
            limits:
              cpu: 100m
              memory: 500Mi
          volumeMounts:
            - mountPath: /overpass_db/europa/
              name: db
            - mountPath: /docker-entrypoint-initdb.d
              name: flush-patch
      restartPolicy: Always
      volumes:
        - name: db
          persistentVolumeClaim:
            claimName: db
        - configMap:
            defaultMode: 484
            name: overpass-flush-patch
          name: flush-patch
For the image version, here is the pod:
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2020-10-22T08:29:40Z"
  generateName: overpass-7c97557db5-
  labels:
    app: overpass
    pod-template-hash: 7c97557db5
  name: overpass-7c97557db5-vv6n2
  namespace: overpass
  ownerReferences:
    - apiVersion: apps/v1
      blockOwnerDeletion: true
      controller: true
      kind: ReplicaSet
      name: overpass-7c97557db5
      uid: 7202ccec-bced-479c-96e0-3a0867b145b5
  resourceVersion: "494676"
  selfLink: /api/v1/namespaces/overpass/pods/overpass-7c97557db5-vv6n2
  uid: d9afec5e-da2f-4a8b-80d5-d611e7d5ea87
spec:
  containers:
    - env:
        - name: OVERPASS_META
          value: "no"
        - name: OVERPASS_MODE
          value: init
        - name: OVERPASS_PLANET_URL
          value: http://download.geofabrik.de/europe/monaco-latest.osm.bz2
        - name: OVERPASS_DIFF_URL
          value: http://download.openstreetmap.fr/replication/europe/monaco/minute/
        - name: OVERPASS_PLANET_SEQUENCE_ID
          value: "3325745"
        - name: OVERPASS_RULES_LOAD
          value: "10"
      image: wiktorn/overpass-api
      imagePullPolicy: Always
      name: overpass
      ports:
        - containerPort: 80
          name: overpass
          protocol: TCP
      readinessProbe:
        failureThreshold: 3
        httpGet:
          path: /api/status
          port: 80
          scheme: HTTP
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 1
      resources:
        limits:
          cpu: 100m
          memory: 500Mi
        requests:
          cpu: 100m
          memory: 500Mi
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      volumeMounts:
        - mountPath: /overpass_db/europa/
          name: db
        - mountPath: /docker-entrypoint-initdb.d
          name: flush-patch
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: default-token-ntjbm
          readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: aks-nodepool1-16538452-vmss000000
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
    - effect: NoExecute
      key: node.kubernetes.io/not-ready
      operator: Exists
      tolerationSeconds: 300
    - effect: NoExecute
      key: node.kubernetes.io/unreachable
      operator: Exists
      tolerationSeconds: 300
  volumes:
    - name: db
      persistentVolumeClaim:
        claimName: db
    - configMap:
        defaultMode: 484
        name: overpass-flush-patch
      name: flush-patch
    - name: default-token-ntjbm
      secret:
        defaultMode: 420
        secretName: default-token-ntjbm
status:
  conditions:
    - lastProbeTime: null
      lastTransitionTime: "2020-10-22T08:29:40Z"
      status: "True"
      type: Initialized
    - lastProbeTime: null
      lastTransitionTime: "2020-10-22T08:29:40Z"
      message: 'containers with unready status: [overpass]'
      reason: ContainersNotReady
      status: "False"
      type: Ready
    - lastProbeTime: null
      lastTransitionTime: "2020-10-22T08:29:40Z"
      message: 'containers with unready status: [overpass]'
      reason: ContainersNotReady
      status: "False"
      type: ContainersReady
    - lastProbeTime: null
      lastTransitionTime: "2020-10-22T08:29:40Z"
      status: "True"
      type: PodScheduled
  containerStatuses:
    - containerID: docker://94a8b88ff0f14de1e3a7ce330212ff7f80e2d5471a5ef55491199cc357e75b51
      image: wiktorn/overpass-api:latest
      imageID: docker-pullable://wiktorn/overpass-api@sha256:9f6d12d18ddde892a8d8116223c076bd5503b113450732d5aeb2d4e8894c89f3
      lastState:
        terminated:
          containerID: docker://21a9e544440e6e29b814097a6f07b844cd931144e0a150d726f7d276abae0a36
          exitCode: 0
          finishedAt: "2020-10-22T09:10:29Z"
          reason: Completed
          startedAt: "2020-10-22T09:00:24Z"
      name: overpass
      ready: false
      restartCount: 4
      started: true
      state:
        running:
          startedAt: "2020-10-22T09:10:31Z"
  hostIP: 192.168.101.4
  phase: Running
  podIP: 10.233.1.27
  podIPs:
    - ip: 10.233.1.27
  qosClass: Guaranteed
  startTime: "2020-10-22T08:29:40Z"
Current state of affairs:
I have a persistent volume:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: overpass-db
spec:
  storageClassName: 'overpass-storageclass'
  capacity:
    storage: 200Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: '/db'
A claim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: overpass
  name: overpass-volume
spec:
  storageClassName: 'overpass-storageclass'
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi
And a deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overpass
  namespace: overpass
spec:
  replicas: 1
  selector:
    matchLabels:
      app: overpass
  template:
    metadata:
      labels:
        app: overpass
    spec:
      restartPolicy: Always
      containers:
        - name: overpass
          image: wiktorn/overpass-api
          resources:
            limits:
              memory: 500Mi
              cpu: 100m
          volumeMounts:
            - mountPath: /db
              name: overpass-volume
            - name: flush-patch
              mountPath: /docker-entrypoint-initdb.d
          readinessProbe:
            httpGet:
              path: /api/status
              port: 80
          ports:
            - name: overpass
              containerPort: 80
          env:
            - name: OVERPASS_META
              value: "yes"
            - name: OVERPASS_MODE
              value: "init"
            - name: OVERPASS_PLANET_URL
              value: "http://download.geofabrik.de/europe/monaco-latest.osm.bz2"
            - name: OVERPASS_DIFF_URL
              value: "http://download.openstreetmap.fr/replication/europe/monaco/minute/"
            - name: OVERPASS_RULES_LOAD
              value: "10"
      volumes:
        - name: overpass-volume
          persistentVolumeClaim:
            claimName: overpass-volume
        - name: flush-patch
          configMap:
            name: overpass-flush-patch
            defaultMode: 0744
This is working insofar as the pod is running correctly after one restart and the API works.
Note: I am using the flux mechanism.
But when I delete the pod with
kubectl delete pod -n overpass podname
another pod is created and the download and initialisation process starts again instead of using the existing data on the volume.
another pod is created and the download and initialisation process starts again instead of using the existing data on the volume.
Good point. Is there any solution to this? The use of a persistent volume was to avoid this kind of issue to begin with: the pod should not re-download the entire database and should instead use the existing volume and database files.
Hi,
I'm trying to deploy this to (Google) Kubernetes but it's not working. Any ideas on what I need to change in this YAML?