ssl-hep / storage_cleanup

Microservice to cleanup storage used by ServiceX

MinIO cleanup #1

Open sthapa opened 3 years ago

sthapa commented 3 years ago

Background

ServiceX currently doesn't clean up the results of a transform request. To work effectively in production, we need to add the ability for ServiceX to clean up after itself. Initially this will focus on MinIO persistent storage, but the approach should be general enough that we can easily extend it to other storage solutions in the future.

MinIO's built-in facilities don't help with this. Bucket lifecycle policies delete the contents of a bucket after a fixed time. FIFO quota policies delete older objects to free space for newer objects when the quota is reached. Both methods delete objects without informing ServiceX.

Proposed Solution

A solution would be to create a microservice that tracks the objects stored in persistent storage and deletes them as needed to keep storage utilization under a specified quota.

Tracking Storage Utilization

MinIO doesn't have an easy way to report the space used by a bucket, so it takes some effort even to get the space used by all the transformation results. The current best practice is to iterate over the objects within a bucket and sum up the space used by each one.
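For reference, a minimal sketch of that scan using the standard `minio` Python client (the endpoint and credentials are placeholders):

```python
from minio import Minio

# Placeholder connection details; the real service would read these from config.
client = Minio("minio:9000", access_key="ACCESS", secret_key="SECRET", secure=False)

def bucket_size(bucket_name: str) -> int:
    """Return the total size in bytes of all objects in a bucket."""
    # MinIO has no per-bucket usage API, so walk every object and sum the sizes.
    return sum(obj.size for obj in client.list_objects(bucket_name, recursive=True))
```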

Given the ServiceX workflow, we can have this service scan the MinIO storage once a day and get the bucket sizes for any new buckets. Since the outputs of a transform are immutable, we can store this information in the PostgreSQL database and only need to scan a bucket once to get its size.

Deletion Policy

To ensure that transforms don't run out of space, the service can initially use a policy of deleting the oldest buckets until the storage used falls under a configurable threshold. The service should probably default to a high-water mark of 85%, but this will be configurable.
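Roughly, that policy loop might look like the following sketch (the quota figure and the `bucket_size`/`delete_bucket` helpers are illustrative, not existing ServiceX code):

```python
def enforce_quota(client, quota_bytes: int, high_water_mark: float = 0.85):
    """Delete the oldest buckets until usage drops below the high-water mark."""
    # list_buckets() reports a creation_date, so sort oldest first.
    buckets = sorted(client.list_buckets(), key=lambda b: b.creation_date)
    usage = sum(bucket_size(b.name) for b in buckets)
    for bucket in buckets:
        if usage <= quota_bytes * high_water_mark:
            break
        usage -= bucket_size(bucket.name)
        delete_bucket(client, bucket.name)  # helper sketched later in the thread
```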

API Interface

The service will have a single endpoint that exposes a simple API that should suffice for ServiceX activities:

  • GET /bucket?id=ID will scan a specified bucket and return the size of that bucket
  • DELETE /bucket?id=ID will attempt to delete a specified bucket
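A minimal sketch of that API, assuming a Flask app and the illustrative `bucket_size`/`delete_bucket` helpers from the other sketches in this thread:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/bucket", methods=["GET"])
def get_bucket():
    bucket_id = request.args["id"]
    # Walk the bucket and report its total size in bytes.
    return jsonify({"id": bucket_id, "size": bucket_size(bucket_id)})

@app.route("/bucket", methods=["DELETE"])
def remove_bucket():
    bucket_id = request.args["id"]
    # MinIO can only remove empty buckets, so the helper deletes the objects first.
    delete_bucket(client, bucket_id)
    return jsonify({"id": bucket_id, "deleted": True})
```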

sthapa commented 3 years ago

Comments? Suggestions?

sthapa commented 3 years ago

I guess I can add a CREATE verb for the endpoint, but it doesn't seem useful given that other services talk directly to MinIO.

AndrewEckart commented 3 years ago

I am not sure that we need a persistent microservice for this. There is a K8s resource named CronJob which allows you to schedule recurring tasks. We could probably just add one of these to the ServiceX Helm chart.

It's not in scope, but I believe @ivukotic has mentioned that our x509 proxy microservice could also be refactored into one of these CronJobs.

One thing of note that I see in the CronJob docs:

A cron job creates a job object about once per execution time of its schedule. We say "about" because there are certain circumstances where two jobs might be created, or no job might be created. We attempt to make these rare, but do not completely prevent them. Therefore, jobs should be idempotent.

I don't think this will be an issue for cleanup/archival of old transformation requests and MinIO storage.

AndrewEckart commented 3 years ago

API Interface: The service will have a single endpoint that exposes a simple API that should suffice for ServiceX activities:

  • GET /bucket?id=ID will scan a specified bucket and return the size of that bucket
  • DELETE /bucket?id=ID will attempt to delete a specified bucket

This sounds like a thin wrapper around the MinIO client itself. I think that we should put all such logic in the ObjectStoreManager (which is really a MinIO adaptor). Then the API server can make calls to the appropriate methods when needed.

Basically, I don't think we should be thinking of this as deleting directly from MinIO. Instead, we should think of it as archiving ServiceX transformation requests, which would entail setting a flag in the database as well as deleting the output files from object storage. We should try to make this an atomic operation.
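As a rough sketch of that archive operation (the model, flag, and adapter method names here are hypothetical), the database flag and the object-store deletion would be driven from one place, even if true atomicity isn't achievable:

```python
def archive_transform(request_id: str, db_session, object_store):
    """Flag a transform request as archived and drop its output files."""
    req = db_session.query(TransformRequest).get(request_id)  # hypothetical model
    req.archived = True
    # Best effort at atomicity: only commit the flag once the deletion succeeded.
    object_store.delete_bucket_and_contents(request_id)
    db_session.commit()
```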

sthapa commented 3 years ago

I am not sure that we need a persistent microservice for this. There is a K8s resource named CronJob which allows you to schedule recurring tasks. We could probably just add one of these to the ServiceX Helm chart.

It's not in scope, but I believe @ivukotic has mentioned that our x509 proxy microservice could also be refactored into one of these CronJobs.

I see this as a building block for future functionality that'll need a persistent microservice. Although a CronJob can do some of this, the bucket deletion will be useful for transform request removal functionality. Basically, the removal functionality could just make the REST call and then handle the other things needed to remove a transform.

Also, the bucket size calculation could be triggered when the API endpoint knows that the transform is complete. It can trigger the calculation and then add the result to PostgreSQL while it's updating the transform request information. This amortizes the cost of traversing the buckets.

Neither of these is possible with a K8s CronJob.

I don't have any problem with putting the functionality in the ObjectStoreManager and accessing it through there. I don't think we can make updating the DB and deleting the object files atomic, though. MinIO requires you to delete the objects in a bucket individually and then delete the empty bucket. Without additional support from MinIO, I don't see how we can make this atomic.
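For concreteness, the two-step deletion with the `minio` client looks roughly like this; a failure partway through leaves a partially emptied bucket, which is why it can't be made atomic:

```python
from minio import Minio
from minio.deleteobjects import DeleteObject

def delete_bucket(client: Minio, bucket_name: str) -> None:
    """Delete every object in a bucket, then the (now empty) bucket itself."""
    to_delete = (DeleteObject(obj.object_name)
                 for obj in client.list_objects(bucket_name, recursive=True))
    # remove_objects is lazy; iterate over the result to perform the deletions.
    for error in client.remove_objects(bucket_name, to_delete):
        print("failed to delete object:", error)
    client.remove_bucket(bucket_name)
```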

I see this microservice as a compositional tool that the servicex_api server can use to do the archival. The microservice will handle removing the saved data for a transformation, and the servicex_api server can use it, potentially with other pieces, to create a transform request archive workflow.

ivukotic commented 3 years ago

Don't we have the output size known and reported at the end of each transformer? Do we have this summed up per request somewhere in the DB? If not, it would be great to have it, and to show it on the web dashboard.

Next we should have a nice endpoint that does the deep cleanup of a request:

  • cleans up the RMQ topic
  • terminates running transformers
  • deletes the request configmap
  • deletes outputs from MinIO
  • removes it from the DB

Once this is in place, and we have a configuration parameter saying what the size of the storage is (in GB, not percent), we could simply sum up the output size of all the requests. Before processing any new request, we do the sum, and if it is above the threshold we call the deep cleanup on the oldest request.
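A sketch of that check, assuming the per-request output size and submission time are already recorded in the database and that a deep-cleanup DELETE endpoint exists (all of these names are hypothetical):

```python
import requests

APP_URL = "http://servicex-app:8000"  # placeholder

def make_room_if_needed(db_session, storage_limit_gb: float) -> None:
    """Before accepting a new request, deep-clean the oldest requests while over quota."""
    limit_bytes = storage_limit_gb * 1024**3
    rows = (db_session.query(TransformRequest)          # hypothetical model
            .order_by(TransformRequest.submit_time)
            .all())
    used = sum(r.output_size for r in rows)
    for oldest in rows:
        if used <= limit_bytes:
            break
        # Hypothetical deep-cleanup endpoint along the lines discussed in this thread.
        requests.delete(f"{APP_URL}/servicex/transformation/{oldest.request_id}")
        used -= oldest.output_size
```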

BenGalewsky commented 3 years ago

I agree with @AndrewEckart - this should be a simple cron job with a new endpoint on the App server. I'm a firm believer in the YAGNI principle - don't build stuff now for what we might or might not eventually need.

I don't understand the idea of an endpoint to clean up running transformers. That code should just be in the TransformerManager under the shutdown_transformer_job method. So maybe we need a story to add the items identified by @ivukotic to that method (apart from the MinIO bucket and the database record).

What we do need is an endpoint to archive a transform. Eventually we would want an endpoint to restore a transform by re-generating the output.

One principle of microservice architecture is to never share the database between services. All interactions with our database should go through endpoints on the app. With this in mind, I don't see how this cleanup service could need a separate database.

So I propose:

  1. A new endpoint to archive a transform by uuid
  2. Update the /servicex/transformation GET to include the size of the resulting transform bucket
  3. An option on the /servicex/transformation GET to return the transforms in reverse submission time.
  4. A cron job that queries all Transforms and decides which ones to archive

We might want an endpoint to report the total space used in the object store. We want this in the app since we have stories to use external object stores and don't want MinIO libraries to extend beyond the app.

AndrewEckart commented 3 years ago
  1. A new endpoint to archive a transform by uuid

Yeah, I would suggest that we put this at either GET /servicex/transformation/<uuid>/archive or DELETE /servicex/transformation/<uuid>. The logic to clean up MinIO (or whatever object store is used) should go in here.

If we are going to keep the records in the database and simply flag them with archived=True, the first endpoint makes more sense. Then we could have an endpoint at GET /servicex/transformation/<uuid>/<unarchive|restore|regenerate> to recreate the output. If we are going to delete the records outright as Ilija suggests, then the latter makes more sense.

Suchandra makes a good point in that we may not be able to make it truly atomic. From a quick search, it doesn't seem like MinIO has any transaction features.

We also need to distinguish between cleanup that happens when all files are done, versus cleanup that happens when a request is archived/deleted. Ilija lists 5 tasks.

These 3 tasks should be done as soon as all files are transformed:

  • cleans up RMQ topic
  • terminates running transformers
  • deletes request configmap

The last 2 tasks should only happen when the request is archived (first one only) or deleted (both):

  • deletes outputs from minio
  • removes it from DB

So one of the first decisions we need to make is whether we want to archive requests or delete them outright. If both options are desired, we could have two endpoints.

ivukotic commented 3 years ago

I don't understand the idea of an endpoint to clean up running transformers. That code should just be in the TransformerManager under the shutdown_transformer_job method. So maybe we need a story to add the items identified by @ivukotic to that method (apart from the MinIO bucket and the database record).

It is not an endpoint to clean up running transformers, but one that completely cleans up a request. Imagine a request fails halfway. You still want the running transformers killed and that bucket cleaned up. Or a user decides that (s)he made a mistake and does not need that request at all. There should be a way to do a deep cleanup.

What we do need is an endpoint to archive a transform. Eventually we would want an endpoint to restore a transform by re-generating the output.

This might be needed. But it is actually this that falls under YAGNI. I would not call it archiving but a "shallow" clean. Again, imagine a request fails halfway; I might want to re-try it. So we would need to clean up everything but the configmap and the request source.

One principle of microservice architecture is to never share the database between services. All interactions with our database should go through endpoints on the app. With this in mind, I don't see how this cleanup service could need a separate database.

Nobody asked for the creation of another database. The current database should have info on how big the request output is.

BenGalewsky commented 3 years ago

It is not an endpoint to clean up running transformers, but one that completely cleans up a request. Imagine a request fails halfway. You still want the running transformers killed and that bucket cleaned up. Or a user decides that (s)he made a mistake and does not need that request at all. There should be a way to do a deep cleanup.

Ah, I see, @ivukotic - this is just a terminate-running-transform endpoint which would be called by the dashboard if you click cancel. I agree completely that we need this. Consequently, this action also needs to delete the transform deployment to shut down all running workers.

AndrewEckart commented 3 years ago

It is not an endpoint to clean up running transformers, but one that completely cleans up a request. Imagine a request fails halfway. You still want the running transformers killed and that bucket cleaned up. Or a user decides that (s)he made a mistake and does not need that request at all. There should be a way to do a deep cleanup.

Ah, I see, @ivukotic - this is just a terminate-running-transform endpoint which would be called by the dashboard if you click cancel. I agree completely that we need this. Consequently, this action also needs to delete the transform deployment to shut down all running workers.

/servicex/transformation/<request_id>/kill?

BenGalewsky commented 3 years ago

/servicex/transformation/<request_id>/terminate might be a bit less dramatic, but sure!

gordonwatts commented 3 years ago

Going all the way back to the start... for keeping the MinIO storage under control, why aren't the MinIO policies enough? In fact, I'm not even sure talking about this makes a lot of sense until a caching solution is built out (e.g. the same query gets mapped to pre-derived data).

Here are the AF and user situations I can think of, given a query's data has been deleted from MinIO for some reason:

So - is there a reason to more tightly couple things inside ServiceX?

I'm definitely +1 on the idea of a cancel button. Sometimes it is annoying when I see myself wasting resources and contributing to global warming via a typo. ;-)

Once caching exists, one could imagine a cleanup cron job that would look for different request IDs with the same queries - i.e. duplicate data.

sthapa commented 3 years ago

The available MinIO policy is a time-based one for the contents of a bucket. I think that'll be suboptimal because it might delete results before we actually need the space. The MinIO policies for handling quotas limit a bucket to a hard size or delete the oldest objects in the bucket as new objects are created in order to stay under the quota. The problem with letting MinIO do the cleanup is that ServiceX won't know the status of the results, since there's no communication when MinIO deletes objects.

I do agree with the use cases that you've outlined and think that they're valid. What I'd initially like to do is just implement a microservice that keeps ServiceX storage usage under a given threshold. We can build from there, but I think this is probably the minimum we'll need to get this working in production.

I think the bigger discussion about terminating workflows is useful to have, but I'm hoping to keep this focused primarily on how to handle storage needs. As @AndrewEckart suggested, I'm open to using the ObjectStoreManager interface and fleshing it out a bit more so that it can be used by the various ServiceX components (API server, transformers, this cleanup service). Once we have this implemented and tested, I think we can go back and discuss how to implement the request cleanups.

gordonwatts commented 3 years ago

Does MinIO have a policy that will delete a whole bucket if it is "old"? Or delete the oldest bucket once storage has surpassed some size? If there are things that will delete a bucket at a time, then I think that might work for now. You'd definitely like to implement a least-recently-accessed deletion policy, but even that is tricky due to cache accesses outside of ServiceX's knowledge.

Do we know how large the MinIO cache is now? That is something I've been curious about as I beat it up with lots of requests. 😂😂

sthapa commented 3 years ago

MinIO has a policy to delete objects within a bucket based on age; I don't think there's a way to delete an old bucket, though.

We've been using non-persistent MinIO storage so far, so everything gets cleared when you redeploy an instance.