Closed ambrussimon closed 6 years ago
Merging #1030 into master will increase coverage by
0.4%
. The diff coverage is93.97%
.
@@ Coverage Diff @@
## master #1030 +/- ##
========================================
+ Coverage 90.7% 91.1% +0.4%
========================================
Files 50 50
Lines 6864 6892 +28
========================================
+ Hits 6226 6279 +53
+ Misses 638 613 -25
TBD containers getting the delete tag don't apply it on their files - no explicit need though.
I don't see a problem until we move them to their own collection. Search will properly remove them.
in hindsight, would we rather want {'deleted': {'$ne': True}}
I think we would remove the key if we "un-delete" something, but curious if anyone else has an opinion on the matter.
@nagem Thanks for all the help and the input. Implemented and tested all features requested for the tagging phase (including permission handling and analysis ref checks). Sticking with {'exists': ...}
because it's not just binary, I'm actually setting it to a datetime
.
Also freshly rebased.
@nagem @ryansanford Rebased, patched, ready for review.
Changes look good, holding on merge for Ryan's review of "restore" script.
@ryansanford added undelete.py
and manually (and very limitedly) tested that container undel/recursion/file undel work.
I'm going to add a new access log type (delete_container) and add it where appropriate here.
I suggest undelete.py recursively undelete parent containers if they are deleted. Currently it does not, resulting in the restored container referencing parent containers that are not accessible by the API.
Scenario...
Another scenario that may trump the one I just posted above, or make the approach to it more nuanced...
I think it's reasonable for a user to expect that restoring a project is scoped to the single delete event in no.2, and would not end up restoring the acquisition deleted in no.1.
Undoubtably, it's a powerful capability to make it easy to indiscriminately undelete everything under a project. I'm questioning if that fits the administrative use case we were initially targeting.
@tcbtcb @lmperry I think your perspectives are important here.
Trying to undelete a file that exists, but not currently deleted is resulting in an exception. Instead, exit code should be zero with no error message, since the state of the system matches the intended state of running undelete.py.
2017-12-21 15:42:45 scitran.undelete undelete.py 46:INFO Removing "deleted" tag from file acquisitions/5a39375e7053b30020c9e2f7/4374_1_1b.nii.gz...
Traceback (most recent call last):
File "bin/undelete.py", line 59, in <module>
main()
File "bin/undelete.py", line 49, in main
del f['deleted']
KeyError: 'deleted'
Scitran Collections, nor their attached files can be undeleted with the current implementation.
collections
as a valid container type for undeleting of its attachments. This seems like a bug.I'm done with my review.
PSA: Search indexing implementations should be updated as appropriate before this change sees production.
@ryansanford if we end up needing to support that scenario (files deleted in the past don't get restored), I have a few ideas we could use (timestamp set once for all containers, delete timestamp not overwritten).
@ambrussimon I spoke with Thad, and he does prefer the option to make the undelete act on only the containers that were part of the delete in question. A blindly recursive undelete that I initially suggested is not desired.
Phew, I addressed everything discussed, @ryansanford.
--include-parents
suppresses the error, and creates any deleted parents.--always-propagate
for handling parents that got restored via --include-parents
The additional logging changes/tests are added.
I wanted to mention something Ambrus and I discussed offline:
Although collections can be "recovered" by the script, only the collection and it's attachments (files, notes, etc) will be recovered. Sessions and acquisitions will need to be re-added manually. 1) Files, info and notes are the only unique data in a collection, so recovering these items via the collection would be the only way to recover them in case of an unintentional deletion. 2) There are technically restraints in how we reference which acquisitions/sessions belong to a collection that would make it complicated to preserve this in a deleted state. Since no unique data is lost in this kind of deletion, we deemed it unnecessary to support this kind of recovery action.
2.) When deleting files within any container, the access log should not have the file name url encoded so it can be easily used to source the undelete command.
I'll tackle this now by adding the (unencoded) filename to the access log.
@ryansanford Good catch. cont_name
pluralization at fault. Added test revealed that analysis deletion (propagation) and undeletion scope was off - fixed. (+as always, fresh rebase)
@ryansanford Additional file context on access log added.
Resolves #1012
Minor TODOs left, but bulk is ready for 1st pass/quick review for discussion.
TODO/TBD:
{'deleted': {'$exists': False}}
can be used less{'deleted': {'$ne': True}}
?Review Checklist