ThomasThelen commented 3 years ago

This PR addresses a few issues: #42 #34 #35 which are centered around applying a new triple pattern across the data and updating virtuoso.

Changes

Image Management

Previously, the docker/ folder acted as the place to build the scheduler and worker images. Because the workers and scheduler depend on the d1lod library, and because Dockerfiles can only copy same-directory or lower files, the d1lod folder had to be copied into each folder. Additionally, there were separate images for the workers and scheduler. This was error prone and tedious.

The changes in 16c9632 and ae271ea install the d1lod package into the d1lod image (which I have at thomasthelen/slinky). This image is used in the workers and scheduler, getting rid of the need for their separate images.

Worker Changes

The worker deployment has changed by deploying separate workers, one for the default queue and the other for the dataset queue. These changes are in 4b78c5c.

Scheduler Changes

In 3051b99dbf5020f5983f057249ddc597981ffeaa the scheduler deployment was updated to use the slinky cli.

Virtuoso Changes

Instead of having a second container interact with Virtuoso to enable updating, I combined the openlink virtuoso image with the shell script to enable updates. The virtuoso server an the update-enabler are run in separate processes, shown below. This was done in 913791c

Networking Changes

When pods in separate deployments need to communicate, they have to go through the associated service. Services can be referenced the same way in docker (ie 'redis' or 'virtuoso'). 50aa722 has a few changes to accommodate this requirement.

Misc

832c7a2 changes the name of the redis deployment to just 'redis'. We don't have a need to distinguish redis deployments, and this makes it more aligned with current names.

8e03b92 Fixes a bug where, without init.py pip won't include the contents of the folder in the install package path.

Testing

Kubernetes

A good kubbernetes test is to tear down the stack and then bring it back up and check that Virtuoso has triples.

kubectl config use-context slinky
kubectl delete deployment scheduler redis virtuoso worker-default worker-dataset
kubectl apply -f virtuoso-deployment.yaml
kubectl apply -f redis-deployment.yaml
kubectl apply -f virtuoso-deployment.yaml
kubectl apply -f worker-default-deployment.yaml
kubectl apply -f virtuoso-dataset-deployment.yaml
kubectl apply -f scheduler-deployment.yaml
kubectl proxy
Visit http://127.0.0.1:8001/api/v1/namespaces/slinky/services/virtuoso:virtuoso/proxy/sparql/ and check for triples

An optional extra step after deployment is viewing the logs for the workers to see if any jobs have been processed.

Unit Tests

The unit tests should be run the usual way. Without the RDF module locally I run into issues running the tests locally; the requirement for running virtuoso or blazegraph alongside them complicated things (docker in docker territory).

I was able to run the non-networking tests which seemed to pass okay, including the local triple store tests. @amoeba is there any way you can run these on your local machine to test out?

ThomasThelen commented 2 years ago

This PR should be ready for review (see my note about testing). There's a followup PR, #48 that's forked off of this branch. It has some nice features added to the deployment and documentation; after reading through this PR I suggest taking a look at the other changes for a bigger picture view.

amoeba commented 2 years ago

Thanks @ThomasThelen! I appreciate you coming up to speed on how this service works and getting everything deploying correctly.

This PR got kinda too big to give an effective review but since we ran through things together last week, I'm going to merge and we can go from there.

I did do a scan and I two things did pop up:

Use of a custom Virtuoso image (thomasthelen/virtuoso) referenced in various places. It'd be better to either use the stock image or build one under DataONE org. No need to do that right now.
I think it'd be good to do a pass over how we're doing service discovery and configuration. In my rewrite, I hard-coded service names, some of which are still in the code and should be changed. You added some flags, like --prod which help get around that but I think the best overall solution is to have all connections to default to localhost with default ports and override any hostnames and ports with environmental variables (ala 12F). We can also work on that post-merge.