Open krobasky opened 2 years ago
Perhaps the solution is to keep a ledger of deletion requests and create a script that can read the ledger and intermittently clean up the unlinked data objects. Remember, 'delete' has a different meaning from 'deprecate': if data is modified at the service provider, it can be deprecated so that future analyses won't use it, but it needs to remain in the local cache in support of legacy analyses that used the old version. However, if data is redacted or corrupted, it needs to be deleted entirely. This ticket concerns the latter.
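A minimal sketch of what that ledger mechanism could look like, assuming an append-only JSON-lines ledger file and plain file-path data objects (the file layout and function names here are hypothetical, not anything in the current code):

```python
import json
from pathlib import Path


def append_deletion_request(ledger: Path, object_path: str) -> None:
    """Record a deletion request as one JSON line in the ledger."""
    with ledger.open("a") as f:
        f.write(json.dumps({"path": object_path}) + "\n")


def cleanup_from_ledger(ledger: Path) -> list:
    """Read the ledger, unlink each listed object, and return what was removed.

    Intended to run intermittently (e.g. from cron); missing objects are
    skipped so the script is safe to re-run.
    """
    removed = []
    if not ledger.exists():
        return removed
    for line in ledger.read_text().splitlines():
        entry = json.loads(line)
        target = Path(entry["path"])
        if target.exists():
            target.unlink()
            removed.append(str(target))
    ledger.unlink()  # ledger consumed once processed
    return removed
```

The ledger decouples the deletion request from the deletion itself, so redacted objects can be purged in a batch without the request path having to touch the cache directly.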
It looks like right now, the code is saving all the run parameters and results to files mounted by the container, and the metadata is being stored in tx-persistent, but when the containers are restarted the metadata is lost and not reconstructed from the mounted directory. So we need to either clean up the mounted volume upon exit (not advised) or somehow reconstruct the metadata from the mounted volume upon restart. I wonder if we should save the metadata on the mounted file system instead of mongo, since we rarely if ever edit the data.
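If we go the reconstruct-on-restart route, a startup hook could rescan the mounted volume and rebuild the metadata index from what's on disk. A rough sketch, assuming a hypothetical per-run layout of `<mount>/<run_id>/params.json` (the actual layout in the code may differ):

```python
import json
from pathlib import Path


def rebuild_metadata(mount_root: Path) -> dict:
    """Rebuild the in-memory metadata index from the mounted volume.

    Walks each run directory under the mount and loads its saved
    parameters, so the index survives a container restart without
    depending on the database. Directories without a params file
    are skipped.
    """
    metadata = {}
    for run_dir in sorted(mount_root.iterdir()):
        params_file = run_dir / "params.json"
        if run_dir.is_dir() and params_file.exists():
            metadata[run_dir.name] = json.loads(params_file.read_text())
    return metadata
```

This would also make the mounted volume the single source of truth, which supports the idea of dropping mongo for metadata we almost never edit.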
To reproduce the problem:
docker-compose -f docker-compose.yml up --build -V -d