elixir-cloud-aai / tesk-api

GA4GH TES API Service that translates tasks into Kubernetes Batch API calls
Apache License 2.0

Task jobs cleaning strategy #21

Open Brico87 opened 3 years ago

Brico87 commented 3 years ago

Hi! First of all, thanks for your amazing work implementing the TES API on top of Kubernetes! We would like to use your project, and even contribute, if it is still active and contributions are welcome.

We were wondering what your cleaning strategy for task jobs is. Do you use a custom CronJob to clean them up once they complete, or do you have another mechanism?

We plan to use the Kubernetes TTL controller with the ttlSecondsAfterFinished spec property of a Job. For that, the Kubernetes Java client needs to be upgraded to the latest version. Is that something you would be interested in? We are ready to open a PR with that update.

Cheers!
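
To make the TTL semantics concrete, here is a minimal sketch of the eligibility rule the TTL controller applies: a finished Job becomes eligible for deletion once ttlSecondsAfterFinished seconds have passed since it finished, and a Job without the field is never cleaned up automatically. The class and method names are hypothetical and purely illustrative. This is neither TESK code nor the controller's implementation.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper illustrating the TTL-after-finished rule; not part of TESK.
final class JobTtlSketch {

    private JobTtlSketch() {}

    /**
     * Returns true if a Job that finished at finishedAt would be eligible
     * for cleanup at time now, given its ttlSecondsAfterFinished value.
     * A null finish time (still running) or a null TTL (field unset)
     * means the Job is never eligible.
     */
    static boolean isExpired(Instant finishedAt, Integer ttlSecondsAfterFinished, Instant now) {
        if (finishedAt == null || ttlSecondsAfterFinished == null) {
            return false;
        }
        Instant expiry = finishedAt.plus(Duration.ofSeconds(ttlSecondsAfterFinished));
        // Eligible at the expiry instant itself and any time after it.
        return !now.isBefore(expiry);
    }
}
```

With a TTL of 0 the Job is eligible as soon as it finishes. With the field unset, completed Jobs stay around until something else deletes them.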

aniewielska commented 3 years ago

Hi! The TESK project is still active, although the code itself might move to a different place in the near future (this should be fairly transparent to users and contributors). We do accept contributions.

Updating the K8s client is something I was about to do myself soon. I believe the new client is not fully backwards compatible, so some code changes beyond bumping the version will likely be necessary. You can wait for me (it should take a couple of days), but I will be more than happy to accept a PR with the client update as well, if you are ready.

There is no job deletion strategy at the moment. Keeping the jobs forever as K8s objects is not a viable long-term option, so one solution is to delete them after some time. A CronJob was considered at one point, but it never became an official part of TESK. Adding an optional TTL for TESK Jobs is a nice solution, and I would welcome a PR for it as well. Another option that was considered is adding a task delete endpoint, for use cases where tasks are no longer needed once their state has been retrieved, and letting the clients decide.

Finally, the lack of external storage for task metadata is one of TESK's features (almost no dependencies apart from K8s), which is handy for prototyping and demo installations. Adding a regular DB for task metadata, so that finished tasks can be stored long term or forever without impairing performance, has also been considered but never implemented.
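
As an illustration of the "optional TTL" idea, the sketch below sets ttlSecondsAfterFinished on a Job spec only when a TTL is configured, so the default behaviour (jobs kept indefinitely) is unchanged. It models the spec as a plain map to stay dependency-free. The class name, method name and map layout are assumptions for illustration only. An actual change in TESK would presumably set the equivalent field through the Kubernetes Java client's Job spec model instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical sketch of an optional TTL on a Job spec; not TESK's actual code.
final class OptionalTtlSketch {

    private OptionalTtlSketch() {}

    /**
     * Builds a Job spec fragment as a map. The ttlSecondsAfterFinished key
     * is added only when a TTL is present; Kubernetes requires it to be
     * non-negative, so negative values are rejected here.
     */
    static Map<String, Object> jobSpec(Map<String, Object> podTemplate, OptionalInt ttlSeconds) {
        if (ttlSeconds.isPresent() && ttlSeconds.getAsInt() < 0) {
            throw new IllegalArgumentException("ttlSecondsAfterFinished must be >= 0");
        }
        Map<String, Object> spec = new LinkedHashMap<>();
        spec.put("template", podTemplate);
        // Leave the field out entirely when no TTL is configured.
        ttlSeconds.ifPresent(seconds -> spec.put("ttlSecondsAfterFinished", seconds));
        return spec;
    }
}
```

Calling jobSpec(template, OptionalInt.of(600)) yields a spec whose finished Job may be removed ten minutes after completion, while OptionalInt.empty() leaves the field out.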

Brico87 commented 3 years ago

Hi! We could update the K8s client (to version 12.0.0) and push a PR for that (some code changes are indeed needed). Regarding the TTL for jobs, we could do it once the K8s client update is merged. Is that okay with you?

Cheers!