Open · yarikoptic opened this issue 4 hours ago

**lattaai13** commented:
Hello, I tried to solve the issue.
This is what I did:
Modified `main.rs` to add support for an external tool that decides which keys to back up: added a new command-line argument for the external tool, implemented a function to communicate with it, and integrated it into the main logic.
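Purely for illustration, here is a minimal sketch of what such an integration could look like: spawn the external command once, then exchange one JSON line per request/response over its stdin/stdout. The struct and helper names are hypothetical, not taken from the commit, and `serde_json` is assumed as a dependency.

```rust
// Hypothetical sketch of talking to an external decision tool over
// stdin/stdout, one JSON line per request/response. Names are illustrative.
use std::io::{BufRead, BufReader, Write};
use std::process::{Child, ChildStdin, ChildStdout, Command, Stdio};

struct Decider {
    child: Child,
    stdin: ChildStdin,
    stdout: BufReader<ChildStdout>,
}

impl Decider {
    /// Spawn the external command named by a (hypothetical) CLI argument.
    fn spawn(cmd: &str) -> std::io::Result<Decider> {
        let mut child = Command::new(cmd)
            .stdin(Stdio::piped())
            .stdout(Stdio::piped())
            .spawn()?;
        let stdin = child.stdin.take().expect("stdin was piped");
        let stdout = BufReader::new(child.stdout.take().expect("stdout was piped"));
        Ok(Decider { child, stdin, stdout })
    }

    /// Send one {"key": ..., "versionId": ...} request and return the raw
    /// one-line JSON response for the caller to parse into an action.
    fn ask(&mut self, key: &str, version_id: Option<&str>) -> std::io::Result<String> {
        let request = serde_json::json!({ "key": key, "versionId": version_id });
        writeln!(self.stdin, "{request}")?;
        self.stdin.flush()?;
        let mut line = String::new();
        self.stdout.read_line(&mut line)?;
        Ok(line)
    }

    /// Close stdin so the child sees EOF, then reap it.
    fn finish(mut self) -> std::io::Result<()> {
        drop(self.stdin);
        self.child.wait().map(|_| ())
    }
}
```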
You can review the changes in this commit: https://github.com/lattaai13/dandi-s3invsync-2/commit/50adaa9b7767b17db791b57bf1ee265739cdc859.
> [!CAUTION]
> Disclaimer: This commit was created by Latta AI; never copy-paste this code without first checking the correctness of the generated code. The solution might not be complete; use this code as inspiration only.
Latta AI attempted to solve this issue for free - https://latta.ai/ourmission
If you no longer want Latta AI to attempt solving issues on your repository, you can block this account.
---

Add support for pairing the tool with an (external) tool that informs it whether any particular key should be backed up.
The immediate use case for DANDI is one @satra would like to achieve: that we avoid backing up old, known-to-be-unneeded huge `.h5` files for 000108 (IIRC), or, more generally, that we do not bother backing up assets that are not actively used/referenced (which I would prefer to avoid -- too much possibility for a bug). So it would be great to be able to explicitly filter out some downloads based on some external domain-specific knowledge.

Inspired by `git annex CMD --batch` mode, I envision feeding `"key"` and `"versionId"` (if known) to the external `CMD` via stdin. `CMD` in turn responds on stdout with a JSON record giving the "action", which could be:

- `null` - no opinion, up to s3invsync
- `"backup"` - do ensure local presence
- `"skip"` - do not back up
- `"kill"` - do not back up and ensure that no backup of this key/version exists locally
- `"kill-all-versions"` - do not back up and ensure that no backup of any version of this key exists locally

The last two might be important for the use cases where users demand that we destroy all copies of their data.
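As a concrete, hypothetical illustration of the `CMD` side of this protocol, a toy filter could read one JSON record per stdin line and answer with one `{"action": ...}` record per stdout line. The `.h5`-under-`000108/` rule below just mirrors the example use case above; `serde_json` is assumed as a dependency.

```rust
// Toy external CMD implementing the proposed batch protocol:
// read {"key": ..., "versionId": ...} per line on stdin,
// answer {"action": ...} per line on stdout.
use std::io::{self, BufRead, Write};

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    let mut stdout = io::stdout();
    for line in stdin.lock().lines() {
        let request: serde_json::Value =
            serde_json::from_str(&line?).expect("each input line is a JSON record");
        let key = request["key"].as_str().unwrap_or("");
        // Illustrative domain rule: do not back up .h5 files under 000108/.
        let action: serde_json::Value = if key.starts_with("000108/") && key.ends_with(".h5") {
            "skip".into()
        } else {
            serde_json::Value::Null // no opinion; leave it to s3invsync
        };
        writeln!(stdout, "{}", serde_json::json!({ "action": action }))?;
        stdout.flush()?; // respond promptly in batch mode
    }
    Ok(())
}
```

Wired up via something like `s3invsync --decision-cmd ./my-filter ...` (flag name again hypothetical), this would keep the domain-specific knowledge entirely outside of s3invsync itself.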