populse / populse_mia

Multiparametric Image Analysis

[data management] remove data without using databrowser #302

Closed manuegrx closed 12 months ago

manuegrx commented 1 year ago

It would be useful to be able to remove data from the database without using the data browser, for example to remove all the intermediate files produced by a process (in order to save space).

At least two situations exist:

  1. We want an option in the pipeline or in the brick to remove or keep some files. In this case, it seems that we need access to the database at runtime. We also need a way to specify which files we want to keep and which files we want to remove. We can discuss these points in this ticket to find a viable solution.

  2. We want to remove the files after the pipeline or the brick has run. For this situation, I propose to use a brick in mia_processes. The input of this brick should be an output file from a brick or a pipeline. All the outputs from this file's history will be removed. If “to_keep_filters” (a list of regex) is used, the files matching the regexes of the filter will be kept. If “to_remove_filter” (a list of regex) is used, the files matching the regexes of the filter will be deleted.
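To make the filter behaviour in situation 2 concrete, here is a minimal sketch of the selection logic only. It is not the code from the pull request: the function name `select_files_to_delete` and the example paths are hypothetical, and it does not touch the database. It also assumes two things the description above does not spell out: when “to_remove_filter” is given, only the matching files are deleted, and when a file matches both filters, keeping wins.

```python
import re
from typing import Iterable, List, Optional


def select_files_to_delete(
    history_outputs: Iterable[str],
    to_keep_filters: Optional[List[str]] = None,
    to_remove_filter: Optional[List[str]] = None,
) -> List[str]:
    """Return the history outputs that should be deleted (hypothetical helper).

    Assumed semantics:
    - no filter: every output of the history is deleted;
    - to_keep_filters: outputs matching any of these regexes are kept;
    - to_remove_filter: only outputs matching any of these regexes are deleted;
    - an output matching both filters is kept.
    """
    keep = [re.compile(p) for p in (to_keep_filters or [])]
    remove = [re.compile(p) for p in (to_remove_filter or [])]

    selected = []
    for path in history_outputs:
        # A keep filter always protects the file.
        if any(rx.search(path) for rx in keep):
            continue
        # With a remove filter, files that do not match are left alone.
        if remove and not any(rx.search(path) for rx in remove):
            continue
        selected.append(path)
    return selected


# Example with made-up file names
outputs = [
    "/proj/data/derived_data/sub1_realigned.nii",
    "/proj/data/derived_data/sub1_smoothed.nii",
    "/proj/data/derived_data/sub1_mask.nii",
]
to_delete = select_files_to_delete(outputs, to_keep_filters=[r"smoothed"])
```

With the keep filter above, `to_delete` contains the realigned and mask files, and the smoothed file stays. Separating the selection from the actual deletion would also make it easy to log or preview the list before anything is removed from disk and from the database.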


I created a pull request for this brick (https://github.com/populse/mia_processes/pull/32). It seems to work properly and does not seem to cause any problems in the database management of the remaining files. @servoz, if it is okay with you, I will merge this pull request so that we have a first solution for removing data.