Set up `vfb-pipeline-dumps` pipeline

matentzn commented 4 years ago

We want a pipeline that writes selected subsets of the triplestore into OWL files that can be loaded into owlery or prod.

matentzn commented 4 years ago

Ok so I drafted the pipeline:

The next step is to add it to Rancher. @Robbie1977 I think this container should be run individually (isolated pipeline step, not mixed with other container runs) and can be triggered by vfb-triplestore.

matentzn commented 4 years ago

The only variable that needs to be set is `SPARQL_ENDPOINT`, which should point to (the symlink of) http://ts.p2.virtualflybrain.org/rdf4j-server/repositories/vfb. You also need to bind a volume to `/out`.
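To illustrate the contract described above (endpoint in from the environment, files out to `/out`), here is a minimal sketch in Python. It is not the actual pipeline code: the query, the output file name `example.owl`, and the function names are hypothetical, and it assumes the endpoint accepts standard SPARQL 1.1 Protocol POST requests.

```python
import os
import urllib.parse
import urllib.request
from pathlib import Path

# Hypothetical query for illustration; the real pipeline's queries are not shown here.
EXAMPLE_QUERY = "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o } LIMIT 10"


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a SPARQL 1.1 Protocol POST request that asks for RDF/XML."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/rdf+xml",
        },
        method="POST",
    )


def dump(endpoint: str, query: str, out_file: Path) -> Path:
    """Run the query against the endpoint and write the response to out_file."""
    out_file.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(build_request(endpoint, query)) as resp:
        out_file.write_bytes(resp.read())
    return out_file


def main() -> None:
    # SPARQL_ENDPOINT and /out are the two settings named in this thread.
    endpoint = os.environ["SPARQL_ENDPOINT"]
    dump(endpoint, EXAMPLE_QUERY, Path("/out") / "example.owl")


# Only run when executed directly with the variable set.
if __name__ == "__main__" and "SPARQL_ENDPOINT" in os.environ:
    main()
```

Keeping `build_request` separate from `dump` means the request can be inspected without touching the network.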

matentzn commented 4 years ago

For me, what remains to be done (once this is deployed):

[x] Let owlery point to the owlery dump
[x] Let pdb point to pdb dump instead of sparql

@Robbie1977 what is the best way to do this?

1) Mount a volume and load from there. Disadvantage: unless we copy files around, the containers will briefly have access to files they don't need (the other files in the volume). Advantage: less network traffic.
2) Provide a stable symlink/URL to the files (is that possible?) and pass it into the containers. This saves about 2 hrs of implementation effort because we don't need to change any code, but it may be slower over the network.

With a bit of luck, compressed files work out of the box for both owlery and neo4j2owl.

Thanks for the input

matentzn commented 4 years ago

After a Slack chat with Robbie, we decided to go with option 2.
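Option 2 amounts to each consumer reading its dump from a stable URL. As a rough sketch of what that looks like on the consuming side, assuming the dump may be gzip-compressed and that this is signalled by a `.gz` suffix (both are assumptions, and `read_dump` is a hypothetical helper, not part of owlery or neo4j2owl):

```python
import gzip
import urllib.request


def read_dump(url: str) -> bytes:
    """Fetch a dump from a URL, decompressing it if the name ends in .gz."""
    with urllib.request.urlopen(url) as resp:
        data = resp.read()
    # Assumption: compression is indicated by the file suffix alone.
    return gzip.decompress(data) if url.endswith(".gz") else data
```

If owlery and neo4j2owl do handle compressed files out of the box, nothing like this is needed and the URL can be passed straight through.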

matentzn commented 4 years ago

I will close this in favour of more specific tickets. The pipeline works now in general.

VirtualFlyBrain / vfb-pipeline-config #15