Closed eimrek closed 8 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Comparison is base (
90cd4c8
) 99.89% compared to head (ac10458
) 99.90%.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Thanks @eimrek . To me the concept looks fine. I haven't gone in detail over the actual intricacies of the order in which you copy data such that it allows this operation during operation. I think it would be good to document the reasoning at least somewhere (either in the docs or in the docstrings of the backup util functions) to explain why this should be safe. The CLI command help should also explicitly state if it is safe to perform while the OS is being used.
Code duplication w.r.t. https://github.com/aiidateam/aiida-core/pull/6069: i basically added the same utility functions to run commands on local and remote machines & to call rsync. Is this duplication fine or do you have some ideas how to change this design?
Sure there is duplication now, but if we were to merge the code here and make a release, aiida-core
could simply reuse the disk_objectstore.backup_utils
module, couldn't it? You could directly use those functions in the PsqlDosBackend.backup
method, so I don't see how you would have to duplicate the code.
We say windows is supported for this repo (in the README.md), but I think relying on the unix commands breaks this. e.g. i'm using [ -e
] to check if a folder exists. I used this so that I could use the same method for both local and remote paths. However, another option is to use python-specific checks for local paths (e.g. pathlib.Path.exists() for this case) and unix-specific ones for remote ones. This might be a way to keep the windows support, at least if doing backups to a local folder.
This will definitely break, since that is a bash command, which will not run natively on windows. But even if you were to replace this and find a cross-platform solution, you are still using rsync
which also doesn't natively run on Windows. So I would say that with this approach, a cross-platform solution will be very difficult. I personally would be fine to just add a disclaimer to the CLI that the backup command is Linux/MacOS only. Don't think it will be very easy to find a cross-platform solution that is as performant and I doubt that there are currently a lot of Windows users that we should cater to.
@giovannipizzi thoughts?
@eimrek let me know if you want me to have a look at this and or have particular questions. Since you marked it as WIP I haven't gone through it yet
@sphuber sure. i think the main features are there, i just need to add tests.
Note after discussion with @edan-bainglass : his usecase (mounting a windows folder to a linux container and backing up there) might not support symlinks. At least in my case, where to mount a windows folder to my linux virtual machine doesn't, by default. Therefore, it might make sense to avoid using symlinks altogether.
Hi @sphuber @giovannipizzi, this PR is in a state where your feedback would be great. the functionality is there and i added main CLI tests. But some notes/comments regarding things i'm not sure about:
Alright, this is finally finalized. I changed the design such that there is a "metaclass" BackupManager
that contains all relevant config options and utilities (e.g. calling rsync and other commands) and folder management (using correct rsync link-dest, making backup to a live folder and then moving to final destination when done, etc). The most important method is backup_auto_folders(backup_func: Callable)
, which argument is the actual backup function and will be different for the disk-objectstore
container and aiida-core storage. Will adapt https://github.com/aiidateam/aiida-core/pull/6069 accordingly ASAP. I also added a docs page for the backups.
Any suggestions welcome :)
@sphuber @giovannipizzi feel free to submit your review if you have a moment to check.
Hi @giovannipizzi @sphuber, as discussed (also in #20), I moved the disk-objectstore backup functionality from https://github.com/aiidateam/aiida-core/pull/6069 to this project.
It works currently, but I added everything currently in a separate file
backup_utils.py
, to keep it a bit decoupled for now. Also requires some polishing (e.g. the logging part).However, there are some issues:
1) Code duplication w.r.t. https://github.com/aiidateam/aiida-core/pull/6069: i basically added the same utility functions to run commands on local and remote machines & to call rsync. Is this duplication fine or do you have some ideas how to change this design? 2) We say windows is supported for this repo (in the README.md), but I think relying on the unix commands breaks this. e.g. i'm using
[ -e <path> ]
to check if a folder exists. I used this so that I could use the same method for both local and remote paths. However, another option is to use python-specific checks for local paths (e.g.pathlib.Path.exists()
for this case) and unix-specific ones for remote ones. This might be a way to keep the windows support, at least if doing backups to a local folder.Any ideas welcome.
If this approach works, i can polish it further and add some tests.