rhliang closed this issue 9 years ago.
Did part of this today. For the first point, it's currently being passed as dataset names. The backend rejects these, so I created a new branch, since functionality is temporarily broken.
Since an interface for deleting data files could become extremely complicated (selecting data files associated with specific pipelines and versions), the most feasible approach may be to define global criteria, such as the creation date of intermediate data files across all pipelines, or the number of times a dataset has been accessed. We'll continue the discussion offline for now.
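The global criteria described above (age of intermediate files, access count) could be sketched as a simple filter. This is only an illustrative sketch: the `is_intermediate`, `date_created`, and `access_count` fields are assumptions, not the project's actual dataset model.

```python
from datetime import datetime, timedelta

def select_purgeable(datasets, max_age_days=30, max_accesses=0):
    """Pick intermediate datasets older than max_age_days that were
    accessed at most max_accesses times (hypothetical criteria)."""
    cutoff = datetime.now() - timedelta(days=max_age_days)
    return [
        d for d in datasets
        if d["is_intermediate"]                 # never purge user inputs
        and d["date_created"] < cutoff          # old enough
        and d["access_count"] <= max_accesses   # rarely used
    ]
```

In a real implementation this would likely be a database query rather than an in-memory filter, but the selection logic would be the same.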
After some design discussion, this is the strategy:
Deleting a dataset is causing some problems when we search for exec records. I'll try looking at how we handle PipelineStep.outputs_to_delete.
Just refreshed myself on where I left the ImplementOutputsToDelete branch. I think I'm just waiting for the backend to accept the new form data. Is someone available to help bring that up to speed?
Sure, I can help with that.
If we're going to batch jobs, we better be able to clean up the system alongside it! We're starting to feel the space crunch on the cluster right now.
- Pass outputs_to_delete as either a list of dataset indices or dataset names (Josh, Richard, and James all prefer names).
- On the View Run page, represent discarded data somehow: perhaps the cable is shaded in a colour other than green, or in a different shade of green. Moved to new issue: #434.
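Since the group preferred names over positional indices, the form data could accept names and translate them internally. This is a hedged sketch, not the actual backend code; `names_to_indices` and its 1-based indexing convention are assumptions for illustration.

```python
def names_to_indices(outputs_to_delete, step_outputs):
    """Map output names from form data to 1-based dataset indices,
    rejecting any names the step doesn't produce (illustrative)."""
    index_by_name = {name: i + 1 for i, name in enumerate(step_outputs)}
    unknown = [n for n in outputs_to_delete if n not in index_by_name]
    if unknown:
        # Reject bad form data up front instead of failing later.
        raise ValueError("unknown outputs: " + ", ".join(unknown))
    return [index_by_name[n] for n in outputs_to_delete]
```

Validating names early would give the backend a clear error to return instead of silently rejecting the submission.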
We decided to simplify the strategy for this issue: handle RunOutputsSerializer.get_input_summary() when inputs have been purged.
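One way to handle purged inputs in a summary method is to degrade gracefully instead of failing on a missing file. This is a minimal sketch only; the field names (`has_data`, `name`, `size`) and the dict-based shape are assumptions, and the real RunOutputsSerializer is a Django REST Framework serializer that may differ substantially.

```python
def get_input_summary(inputs):
    """Summarize run inputs, marking purged files rather than
    raising when their data is gone (hypothetical shape)."""
    summary = []
    for inp in inputs:
        if inp.get("has_data"):
            summary.append({"name": inp["name"], "size": inp["size"]})
        else:
            # File was purged: report the name but no size.
            summary.append({"name": inp["name"], "size": None,
                            "purged": True})
    return summary
```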