chryswoods closed this issue 5 years ago
Hi @chryswoods thanks for the issue. Is this specifically to prevent disk full issues?
I noticed we actually removed the API call at one point per issue #481 but there's some thought in there about if/when/how it might be added back in. Any thoughts?
Yes, this is to prevent disk full issues. The use case is using Fn to run molecular dynamics simulations on demand as individual functions. Fn provides the serverless interface for simulations that are initiated, queried and then collected from Jupyter notebooks running in a k8s cluster. This allows the notebooks to use relatively low-powered cloud instances for the k8s cluster, with big fat nodes in the cloud or on-premise used on-demand to run simulations when they are invoked as a Fn function.
The simulations are long, and so async functions are needed. The Jupyter interface will capture the CALL_ID and then use this later to query when the simulation has finished, and to then get the ID of the data cache in which the data is generated. The user will transfer the data from the data cache to storage connected to the notebook using this data ID. Once the data has been transferred, the system should delete everything related to the request. This is mainly to prevent disk-full issues, but also for data governance whereby commercial users would not want records of the simulation run to be retained on the system for any longer than was necessary to run the calculation.
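The collect-then-delete flow described above could be sketched roughly as follows. This is only an illustration: `get_status`, `fetch_data` and `delete_call` are hypothetical callables supplied by the caller, not part of the Fn API, and the status strings `"success"` and `"error"` are assumed. The data-cache lookup is folded into `fetch_data` for brevity.

```python
import time
from typing import Callable


def collect_then_delete(
    call_id: str,
    get_status: Callable[[str], str],    # hypothetical: returns the call's status
    fetch_data: Callable[[str], bytes],  # hypothetical: transfers the call's output
    delete_call: Callable[[str], None],  # hypothetical: removes the call's records
    poll_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Wait for an async call to finish, collect its output, then delete it."""
    # Poll until the call reaches a terminal state (assumed status names).
    while True:
        status = get_status(call_id)
        if status in ("success", "error"):
            break
        sleep(poll_seconds)

    if status == "error":
        raise RuntimeError(f"call {call_id} failed")

    data = fetch_data(call_id)
    # Delete only after the data has been transferred, so nothing is lost
    # if the transfer raises.
    delete_call(call_id)
    return data
```

Deleting only after a successful transfer is the design point here: a time-based cleaner cannot know whether the user has collected the results yet, whereas the caller can.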
Looking at #481 I understand that you don't want users to be able to delete their logs for auditing reasons. The compromise of using a cleaner to remove everything completed that is older than 7 days is not an option as it could clean out the results of an async function before the user has a chance to collect the results. Perhaps a better idea would be for the user to be able to flag a call as deleted, but this only sets a "deletable" flag on the record? Then a cleaner periodically prunes the database to remove all "deletable" records that are older than a set time (e.g. 7 or 30 days)? Essentially, deleting really moves things to trash, and only the sysadmin or the cleaner script can remove things from the trash?
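A minimal sketch of this "move to trash, purge later" idea, using an in-memory SQLite database. The `calls` table, its columns, and the function names are all hypothetical and do not reflect Fn's actual schema. The sketch also assumes that a record's age is measured from its creation time.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a hypothetical calls table with a 'deletable' flag."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE calls ("
        " id TEXT PRIMARY KEY,"
        " created_at REAL NOT NULL,"           # seconds since the epoch
        " deletable INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def flag_deletable(db: sqlite3.Connection, call_id: str) -> bool:
    """User-facing 'delete': only marks the record, nothing is removed."""
    cur = db.execute("UPDATE calls SET deletable = 1 WHERE id = ?", (call_id,))
    db.commit()
    return cur.rowcount == 1


def purge_trash(db: sqlite3.Connection, now: float, max_age_seconds: float) -> int:
    """Cleaner: remove flagged records older than the retention period."""
    cur = db.execute(
        "DELETE FROM calls WHERE deletable = 1 AND created_at < ?",
        (now - max_age_seconds,),
    )
    db.commit()
    return cur.rowcount
```

Records that are old but not flagged survive the purge, so uncollected results of a long-running async call are never removed, while flagged records remain available for auditing until the retention period has passed.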
Thanks @chryswoods. I think a configurable cleaner interval is what we were thinking about, to give a better out-of-the-box experience around this. Deleting all call and log records before some set time remedies one issue surfaced in #481.
It may be easier in the short term to have DELETE /calls?before="01/01/01 01:01:01.00"
to delete records. Our API at present doesn't lend itself well to this, but we could add it as an admin-style endpoint. Need to think on this one some more; open to ideas here. Thanks for surfacing this.
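As a rough sketch of what such a `before` filter might do on the server side: the function names and the in-memory dictionary of calls below are hypothetical, not Fn code. The timestamp format in the example above is ambiguous, so this sketch assumes ISO 8601 instead and treats timestamps without an offset as UTC.

```python
from datetime import datetime, timezone
from typing import Dict, List


def parse_before(value: str) -> datetime:
    """Parse the 'before' query parameter (assumed to be ISO 8601)."""
    cutoff = datetime.fromisoformat(value)
    if cutoff.tzinfo is None:
        # Assumption: timestamps without an offset are UTC.
        cutoff = cutoff.replace(tzinfo=timezone.utc)
    return cutoff


def delete_calls_before(calls: Dict[str, datetime], before: str) -> List[str]:
    """Delete every call created strictly before the cutoff; return their IDs."""
    cutoff = parse_before(before)
    doomed = sorted(cid for cid, created in calls.items() if created < cutoff)
    for cid in doomed:
        del calls[cid]
    return doomed
```

Returning the deleted IDs gives an admin-style caller something to log, which may help with the auditing concern raised in #481.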
Thanks @rdallman - I will add the short term DELETE as you suggest. I can also short term add in a layer above Fn that marks records for later deletion for specific jobs when the user has moved files. I'm happy to feed back how this works for my use-case, so that you can see whether or not this would be useful in mainline.
calls are removed from API now
I am enjoying working with Fn, especially using asynchronous functions. An issue is that there is no easy way to remove data created from old function calls. Currently I have to edit the log and output databases manually. It would be useful to have the ability to do this with Fn directly, e.g.
fn calls delete APP CALL_ID
or
fn logs delete APP CALL_ID
would remove all data associated with the specified call ID in the specified application. This would prevent the server from filling up with old data from old calls. It would also allow me to build logic into my application that detects when a user has retrieved the output from a call, and so safely delete that data from the server.