lblod / app-lblod-harvester

Harvesting Self Service
MIT License

Singleton Jobs #34

Closed benjay10 closed 1 year ago

benjay10 commented 1 year ago

DL-5036, DL-5086

All changes needed for supporting singleton-jobs.

A job for a specific subject can only run once at a time. For the harvester, this means only one harvesting job should run at a time for the same URL. This is done by introducing a new service in the pipeline that looks for another running job with the same subject and, if it finds one, fails the new job. If none is found, the singleton task succeeds and the remaining tasks can run.
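The check described above can be illustrated with a minimal in-memory sketch. This is not the service's actual code: the `Job` class, the status strings, and `run_singleton_task` are hypothetical names chosen for illustration, and the real service works against jobs stored in the database rather than a Python list.

```python
from dataclasses import dataclass

# Hypothetical status values; the real services use their own status resources.
BUSY = "busy"
SUCCESS = "success"
FAILED = "failed"


@dataclass
class Job:
    id: str
    subject: str  # for the harvester, the URL being harvested
    status: str = BUSY


def run_singleton_task(job, all_jobs):
    """Return the outcome of the singleton task for `job`.

    The task fails (and the job with it) when another job for the same
    subject is still busy; otherwise it succeeds and the pipeline continues.
    """
    for other in all_jobs:
        same_subject = other.subject == job.subject
        if other.id != job.id and same_subject and other.status == BUSY:
            job.status = FAILED
            return FAILED
    return SUCCESS
```

With this sketch, a second job for a URL that is still being harvested fails at its first task, while a job for a different URL, or a job started after the first one has finished, passes.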

How to test

The changes are based on beta builds of the affected services. Test them out and make sure that all harvesting Jobs still work as expected. Make sure that all harvesting Jobs start with a "singleton-job" task.

aatauil commented 1 year ago

Seems to work as described. I have two remarks. First, when you create a job and then create a second job with the same URL, both will fail if the first job is still in the singleton-job status when the second one is created.

Second, I think the error message doesn't get saved to the database by default? So when a job fails in the singleton state, the frontend won't display an error message to the user.

I would classify these remarks as future improvements. I will log them as issues in the singleton service repo.

aatauil commented 1 year ago

Opened issues: https://github.com/lblod/harvesting-singleton-job-service/issues/2 https://github.com/lblod/harvesting-singleton-job-service/issues/1

benjay10 commented 1 year ago

> When you create a job and you then create a second job (both have same url) they will both fail IF the first job is still in the singleton-job status while the second one is created.

I don't think that will be the case due to the locking used in the singleton-job-service. I'll try to explain a bit better in the issue page above.
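As a rough illustration of why locking can prevent both jobs from failing, here is a minimal sketch in which the check and the claim on a subject happen under one lock. It is an assumption about the general technique, not the singleton-job-service's actual implementation; `SubjectRegistry`, `try_claim`, `release`, and `race` are hypothetical names.

```python
import threading


class SubjectRegistry:
    """Tracks which subject (e.g. a URL) is currently claimed by a job."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = {}  # subject -> id of the job holding it

    def try_claim(self, job_id, subject):
        # Check and claim are done under the same lock, so two jobs
        # arriving at the same moment cannot both see the subject as free.
        with self._lock:
            if subject in self._active:
                return False
            self._active[subject] = job_id
            return True

    def release(self, job_id, subject):
        # Only the job that holds the subject may release it.
        with self._lock:
            if self._active.get(subject) == job_id:
                del self._active[subject]


def race(registry, subject, n_jobs=2):
    """Start n_jobs threads that all try to claim the same subject."""
    results = {}

    def worker(job_id):
        results[job_id] = registry.try_claim(job_id, subject)

    threads = [
        threading.Thread(target=worker, args=(f"job-{i}",))
        for i in range(n_jobs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In this sketch, however the threads interleave, exactly one of the competing jobs claims the URL and the others are rejected, rather than all of them failing.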

> Second remark is that the error message doesn't get saved to the database by default I think? So when a job fails in the singleton state, it won't display an error message in the frontend to the user.

The singleton-job-service references potential errors in the tasks that are being processed. The jobs-dashboard just does not show the error messages. We'll have to look into this, because I have never seen the error reporting actually work in the jobs-dashboard.