Open manonthegithub opened 1 year ago
About the storage: I think we could use Ceph (https://ceph.io/en/) or HDFS and expose them to Kubernetes as persistent volumes, mounted via CephFS or NFS. That would make our data storage reliable and distributed, not dependent on a particular physical server. This needs testing though... For Virtuoso it might still be better to keep plain physical storage (for performance reasons). I would start experimenting with Ceph.
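To make the persistent-volume idea a bit more concrete, here is a minimal sketch of what the NFS variant could look like. It only builds a standard Kubernetes `PersistentVolume` manifest and prints it as JSON (which `kubectl apply -f` accepts). The volume name, server address, export path and size are made-up placeholders, not values from our setup, and a CephFS-backed volume would need a different volume source.

```python
import json


def nfs_persistent_volume(name, server, path, size="10Gi"):
    """Build a PersistentVolume manifest backed by an NFS export.

    All arguments are placeholders for illustration; nothing here is
    taken from an existing cluster configuration.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            # ReadWriteMany lets several pods mount the same volume.
            "accessModes": ["ReadWriteMany"],
            # Keep the data when the claim is deleted.
            "persistentVolumeReclaimPolicy": "Retain",
            "nfs": {"server": server, "path": path},
        },
    }


if __name__ == "__main__":
    # Hypothetical names; replace with the real NFS server and export.
    manifest = nfs_persistent_volume(
        "metadata-store", "nfs.example.internal", "/exports/metadata"
    )
    print(json.dumps(manifest, indent=2))
```

The output could be saved to a file and applied with `kubectl apply -f`, after which pods would claim it through a `PersistentVolumeClaim`.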
Ceph.io seems cool. I found this: https://medium.com/@keecheril.jobin/integrating-docker-with-ceph-56ce6c447d1b
They disable the object-map, fast-diff, and deep-flatten image features for Docker. Ceph can be mounted as a normal file system, so it might work. However, it gets much slower: https://yourcmc.ru/wiki/Ceph_performance
I think its performance should be enough for writing metadata and data files... For the SPARQL endpoint it may be too slow. This needs thorough testing, which we are unlikely to do at the moment, but it should be good enough to start with.
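For a first rough comparison, something like the following could be run once against a directory on the Ceph mount and once against local disk. It is only a sketch: it measures sequential write throughput with a final fsync, so it says something about writing data files but nothing about the random-read pattern a SPARQL endpoint would produce. The function name and default sizes are my own choices.

```python
import os
import tempfile
import time


def write_throughput_mb_s(directory, total_mb=64, block_kb=1024):
    """Write about total_mb of random data into `directory` and return MB/s.

    The data is flushed and fsynced before the clock stops, so the page
    cache does not hide the cost of the underlying storage.
    """
    block = os.urandom(block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    finally:
        # Always clean up the temporary file.
        os.remove(path)
    written_mb = blocks * block_kb / 1024
    return written_mb / elapsed


if __name__ == "__main__":
    # Small demo run against the system temp directory; point this at the
    # mounted volume and at a local-disk directory to compare the two.
    rate = write_throughput_mb_s(tempfile.gettempdir(), total_mb=8)
    print(f"sequential write: {rate:.1f} MB/s")
```

A dedicated tool such as fio would give more reliable numbers; this is just enough to see whether the difference is small or dramatic.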
Running multiple instances with mods on Kubernetes