lblod / harvesting-singleton-job-service

Service for the harvesting stack that makes sure only one job runs at a time for a specific subject.
MIT License

harvesting-singleton-jobs-service

This service reacts to messages from the delta-notifier about tasks in the harvester stack. It should be configured in the jobs-controller as the first task in a pipeline, where it debounces tasks for subjects that are already being processed by a different Job. To do this, it searches the triplestore for another Job on the same subject as the newly started Job. If such a Job is found, the newly created task fails, and the new Job fails with it. If not, the new Job continues to run.

How it works

This is one of the services that can be configured for use in the harvesting application. It listens for scheduled tasks with the http://lblod.data.gift/id/jobs/concept/TaskOperation/singleton-job operation and queries the triplestore for other Jobs with the same subject as the current Job.
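
The lookup for other Jobs on the same subject could be expressed as a SPARQL ASK query. The following is a minimal sketch, not the service's actual query: the README does not name the predicate that links a Job to its subject, so SUBJECT_PREDICATE is a hypothetical placeholder, and the "busy" status URI is an assumption that follows the pattern of the "scheduled" status used elsewhere in this document. The function names are illustrative as well.

```javascript
// Hypothetical placeholder: the README does not name the predicate that
// links a Job to its subject.
const SUBJECT_PREDICATE = 'http://example.org/placeholder/jobSubject';

// Assumption: a Job counts as "active" when it has this status.
const ACTIVE_STATUSES = [
  'http://redpencil.data.gift/id/concept/JobStatus/busy',
];

// Wrap a URI in angle brackets, rejecting characters that are not allowed
// inside a SPARQL IRI reference (guards against query injection).
function sparqlUri(uri) {
  if (/[\x00-\x20<>"{}|^`\\]/.test(uri)) {
    throw new Error(`Invalid URI: ${uri}`);
  }
  return `<${uri}>`;
}

// Build an ASK query that is true when a Job other than `jobUri` is active
// on the same subject.
function buildOtherJobQuery(jobUri, subjectUri) {
  const statuses = ACTIVE_STATUSES.map(sparqlUri).join(' ');
  return [
    'PREFIX adms: <http://www.w3.org/ns/adms#>',
    'ASK {',
    `  ?otherJob ${sparqlUri(SUBJECT_PREDICATE)} ${sparqlUri(subjectUri)} ;`,
    '            adms:status ?status .',
    `  VALUES ?status { ${statuses} }`,
    `  FILTER (?otherJob != ${sparqlUri(jobUri)})`,
    '}',
  ].join('\n');
}
```

If the query returns true, the new task would be marked as failed; otherwise the Job would be allowed to continue.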

Adding to a stack

To add the service to a mu-semtech stack (probably something like a harvester), add the following snippet to the docker-compose.yml file as a service:

harvesting-singleton-job:
  image: lblod/harvesting-singleton-job-service:1.0.0

To make sure the delta-notifier sends the needed messages, add the following snippet to the rules.js file:

{
  match: {
    predicate: {
      type: 'uri',
      value: 'http://www.w3.org/ns/adms#status',
    },
    object: {
      type: 'uri',
      value: 'http://redpencil.data.gift/id/concept/JobStatus/scheduled',
    },
  },
  callback: {
    method: 'POST',
    url: 'http://harvesting-singleton-job/delta',
  },
},
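
On the receiving side, the service has to pick the relevant resources out of each delta message. The sketch below assumes the request body is an array of changesets, each with an `inserts` array of triples whose terms are shaped like `{ type, value }`, mirroring the match pattern in the rule above. The function name `scheduledSubjects` is illustrative and not taken from the service's code.

```javascript
const STATUS_PREDICATE = 'http://www.w3.org/ns/adms#status';
const STATUS_SCHEDULED = 'http://redpencil.data.gift/id/concept/JobStatus/scheduled';

// Collect the unique subjects that received the "scheduled" status.
// Assumes `changesets` is an array of { inserts: [...], deletes: [...] }.
function scheduledSubjects(changesets) {
  const subjects = new Set();
  for (const changeset of changesets) {
    for (const triple of changeset.inserts || []) {
      if (
        triple.predicate.value === STATUS_PREDICATE &&
        triple.object.value === STATUS_SCHEDULED
      ) {
        subjects.add(triple.subject.value);
      }
    }
  }
  return [...subjects];
}
```

Filtering again inside the service is a defensive choice: a single delta message can carry several triples, and not all of them need to match the rule.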

As an example, the following snippet from the jobs-controller's config.json shows how the jobs-controller can be configured to incorporate this service:

{
  "currentOperation": null,
  "nextOperation": "http://lblod.data.gift/id/jobs/concept/TaskOperation/singleton-job",
  "nextIndex": "0"
},
{
  "currentOperation": "http://lblod.data.gift/id/jobs/concept/TaskOperation/singleton-job",
  "nextOperation": "http://lblod.data.gift/id/jobs/concept/TaskOperation/collecting",
  "nextIndex": "1"
},
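
To illustrate how entries like these chain operations together, here is a small sketch of a lookup over them. It assumes the entries are available as a flat array; the real config.json may nest them differently, and `nextStep` is a hypothetical helper, not part of the jobs-controller.

```javascript
const SINGLETON_JOB = 'http://lblod.data.gift/id/jobs/concept/TaskOperation/singleton-job';
const COLLECTING = 'http://lblod.data.gift/id/jobs/concept/TaskOperation/collecting';

// The two entries from the snippet above, as a flat array (an assumption).
const steps = [
  { currentOperation: null, nextOperation: SINGLETON_JOB, nextIndex: '0' },
  { currentOperation: SINGLETON_JOB, nextOperation: COLLECTING, nextIndex: '1' },
];

// Find which operation follows `currentOperation`; `null` means "no task yet".
// Returns null when no entry matches.
function nextStep(entries, currentOperation) {
  const entry = entries.find((e) => e.currentOperation === currentOperation);
  return entry ? { operation: entry.nextOperation, index: entry.nextIndex } : null;
}
```

With this configuration, a new Job starts with the singleton-job task at index 0, and collecting only follows at index 1 once that task has finished successfully.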

API

POST /delta

Main entry point for this service, where delta messages arrive. Returns 200 OK as soon as handling of the request has started, without waiting for the processing to finish.
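
The respond-first behaviour can be sketched independently of any web framework. In this sketch, `handleDelta`, `respond` and `processDelta` are hypothetical names: `respond` stands in for sending the HTTP status, and `processDelta` for the actual work on the delta message.

```javascript
// Acknowledge the request first, then process the delta in the background.
// Errors during processing are logged, because the response is already sent.
function handleDelta(body, respond, processDelta) {
  respond(200);
  Promise.resolve()
    .then(() => processDelta(body))
    .catch((error) => console.error('Delta processing failed:', error));
}
```

Because the response is sent before processing starts, the delta-notifier is not kept waiting while the triplestore is queried. The trade-off is that a failure during processing cannot be reported through the HTTP status.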

Configuration

These are environment variables that can be used to configure this service. Supply a value for them using the environment keyword in the docker-compose.yml file.

Environment variables