dbpedia / databus

A digital factory platform for managing files online with stable IDs, high-quality metadata, powerful API and tools for building on data: find, access, make interoperable, re-use
Apache License 2.0

Databus on Kubernetes #95

Open manonthegithub opened 1 year ago

manonthegithub commented 1 year ago

Running multiple instances with mods on Kubernetes

manonthegithub commented 1 year ago

About the storage: I think we could use Ceph (https://ceph.io/en/) or HDFS and mount them into Kubernetes as persistent volumes via CephFS or NFS. That would make our data storage reliable and distributed, not dependent on a particular physical server. This needs testing, though. For Virtuoso it might still be better to keep plain physical storage (for performance reasons). I would start experimenting with Ceph.
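To make the persistent-volume idea concrete, here is a minimal sketch that builds a Kubernetes PersistentVolume and a matching PersistentVolumeClaim for an NFS export and prints them as JSON, which `kubectl apply -f -` accepts. It assumes the NFS variant mentioned above. The names `databus-data-pv` and `databus-data-pvc`, the server `nfs.example.internal`, the path `/exports/databus`, and the size `100Gi` are all hypothetical placeholders, not values from this repository.

```python
import json


def nfs_persistent_volume(name, server, path, size):
    """Build a PersistentVolume manifest backed by an NFS export."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            # ReadWriteMany lets several Databus pods mount the same volume.
            "accessModes": ["ReadWriteMany"],
            # Keep the data if the claim is deleted.
            "persistentVolumeReclaimPolicy": "Retain",
            "nfs": {"server": server, "path": path},
        },
    }


def claim_for(pv_name, claim_name, size):
    """Build a PersistentVolumeClaim bound to one specific PersistentVolume."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": claim_name},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            # An empty storage class disables dynamic provisioning.
            "storageClassName": "",
            "volumeName": pv_name,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    pv = nfs_persistent_volume(
        "databus-data-pv", "nfs.example.internal", "/exports/databus", "100Gi"
    )
    pvc = claim_for("databus-data-pv", "databus-data-pvc", "100Gi")
    # A v1 List lets both objects be applied in one go.
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [pv, pvc]}, indent=2))
```

The Databus pods would then reference the claim by name in their `volumes` section. A CephFS-backed volume would need a different volume source or a CSI driver, which this sketch does not cover.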

kurzum commented 1 year ago

Ceph.io seems cool. I found this: https://medium.com/@keecheril.jobin/integrating-docker-with-ceph-56ce6c447d1b

They disable object-map, fast-diff, deep-flatten for Docker. I mean Ceph can be mounted as a normal file system, so it might work. However, it gets much slower: https://yourcmc.ru/wiki/Ceph_performance

manonthegithub commented 1 year ago

I think its performance should be enough for writing metadata and data files. For the SPARQL endpoint it may be too slow. This needs thorough testing, which we are unlikely to do at the moment, but it should be good enough to start with.
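As a rough starting point for that testing, the following sketch measures sequential write throughput in a given directory, so the same function can be pointed at a local disk and at a mounted network volume for comparison. It is only a crude check under stated assumptions: the function name `measure_write_throughput`, the file sizes, and the file counts are hypothetical, and it says nothing about the random-read pattern a SPARQL endpoint would produce.

```python
import os
import tempfile
import time


def measure_write_throughput(directory, file_size_bytes=1024 * 1024, file_count=8):
    """Write file_count files of file_size_bytes each; return bytes per second."""
    payload = os.urandom(file_size_bytes)
    paths = []
    start = time.perf_counter()
    try:
        for i in range(file_count):
            path = os.path.join(directory, f"bench_{i}.bin")
            paths.append(path)
            with open(path, "wb") as fh:
                fh.write(payload)
                fh.flush()
                # Force the data to the storage backend, not just the page cache.
                os.fsync(fh.fileno())
        elapsed = time.perf_counter() - start
    finally:
        # Remove the benchmark files so the target directory is left clean.
        for path in paths:
            if os.path.exists(path):
                os.remove(path)
    # Guard against a zero interval on very fast storage.
    return (file_size_bytes * file_count) / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Uses a local temporary directory; replace it with the mount point
    # of the network volume to compare the two.
    with tempfile.TemporaryDirectory() as local_dir:
        rate = measure_write_throughput(local_dir)
        print(f"local: {rate / 1e6:.1f} MB/s")
```

Running it once against local storage and once against the mounted volume gives a first impression of the write overhead for metadata and data files.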