aiidateam / team-compass

A repository for storing the AiiDA team roadmap
https://team-compass.readthedocs.io
MIT License

Usability: Make it easier to backup and restore full AiiDA installations #11

Open ramirezfranciscof opened 1 year ago

ramirezfranciscof commented 1 year ago

Motivation

Proper digital data management requires keeping copies of one's data in case the main work devices fail. AiiDA has a well-established method for transmitting information between installations: the verdi archive command, which exports/imports sets of nodes. However, even when exporting all nodes in the database, this may leave out information related to the configuration of the working profile. There is some documentation on creating backups, but it is somewhat convoluted and may have become outdated after the latest modifications in aiida-core. This means there is currently no officially recommended procedure for backing up AiiDA installations.

Desired Outcome

Have a clear recommended procedure for backing up and restoring full AiiDA profiles/installations. Add any features and/or utility scripts to aiida-core that can automate some or all of the steps, and review/update the respective documentation section.

Impact

All users should benefit from improved backup procedures.

Complexity

Originally, creating a backup required just three steps:

  1. Dumping the Postgres database
  2. Copying the file repository folder
  3. Copying the config.json configuration file
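The three steps above can be sketched as a small script. This is only an illustration under stated assumptions, not an official procedure: the function names, the database name and user, and all paths are hypothetical parameters, and the script makes no attempt to guard against the profile being in use while it runs.

```python
from pathlib import Path
import shutil
import subprocess


def plan_backup(db_name: str, db_user: str, repo_dir: Path, dest: Path) -> list[list[str]]:
    """Return the external commands for steps 1 and 2 (nothing is executed here)."""
    return [
        # Step 1: dump the Postgres database to a plain SQL file.
        ["pg_dump", "-U", db_user, "-d", db_name, "-f", str(dest / "db.sql")],
        # Step 2: mirror the file repository folder (trailing slash copies its contents).
        ["rsync", "-a", f"{repo_dir}/", str(dest / "repository")],
    ]


def run_backup(db_name: str, db_user: str, repo_dir: Path, config_file: Path, dest: Path) -> None:
    """Run all three steps; the caller must supply the real locations (assumed inputs)."""
    dest.mkdir(parents=True, exist_ok=True)
    for cmd in plan_backup(db_name, db_user, repo_dir, dest):
        subprocess.run(cmd, check=True)  # stop at the first failing step
    # Step 3: copy the config.json configuration file.
    shutil.copy2(config_file, dest / config_file.name)
```

Keeping the command construction separate from execution makes it possible to inspect what would run before touching the database or the repository.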

Since all of this was performed outside of AiiDA, it is unclear what would happen if the procedure were started while the AiiDA instance was in use: nodes created or modified during or between the steps could leave the database dump and the repository copy inconsistent with each other. Moreover, the recent changes to include the disk-objectstore (which added another SQLite database inside the file repository) add an extra level of complexity to live backups.

We need to evaluate if we can provide a more streamlined and secure way for users to create backups, perhaps even adding new verdi functionalities to automate one or more of these steps in a safer manner. We must also decide if it is possible to do more modular backups (of single profiles, for example) or if it is too inconvenient to do anything other than full system installation backups.

Finally, this procedure may also need to be re-structured, or replaced altogether, if we implement some pull/push mechanism in the future.

Extra Notes

Progress

sphuber commented 1 year ago

Since, as of v2.0, it is possible to provide custom storage backends (as done, for example, by aiida-s3), we should take into account that the method for backing up a core.psql_dos backend is not always the correct one.

Ideally, then, we would define a method on the StorageBackend interface that creates a backup of its contents, as well as a method to restore a backend from a created backup. In this way, we can have a single verdi command that automates the entire backup. It can provide options to back up just the storage of any profile, or to back up the entire instance including configuration and log files.
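As a rough sketch of this idea, the snippet below defines an abstract interface with backup and restore methods, plus a toy implementation. Everything here is hypothetical: BackupCapableStorage, DirectoryStorage, and backup_profile are illustrative names and are not the aiida-core StorageBackend API; the toy backend simply treats one directory as its entire state.

```python
from abc import ABC, abstractmethod
from pathlib import Path
import shutil


class BackupCapableStorage(ABC):
    """Hypothetical interface; not the actual aiida-core StorageBackend."""

    @abstractmethod
    def backup(self, dest: Path) -> None:
        """Write a copy of the storage contents to ``dest``."""

    @abstractmethod
    def restore(self, src: Path) -> None:
        """Replace the storage contents with the backup found at ``src``."""


class DirectoryStorage(BackupCapableStorage):
    """Toy backend whose entire state is one directory of files."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def backup(self, dest: Path) -> None:
        shutil.copytree(self.root, dest, dirs_exist_ok=True)

    def restore(self, src: Path) -> None:
        # Discard the current state before copying the backup into place.
        if self.root.exists():
            shutil.rmtree(self.root)
        shutil.copytree(src, self.root)


def backup_profile(storage: BackupCapableStorage, dest: Path) -> None:
    """What a single CLI command could call, whatever the backend is."""
    storage.backup(dest)
```

The point of the design is that the command-line layer only depends on the two abstract methods, so each backend can choose its own most efficient strategy.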

One big challenge will be to have the backup/restore methods of the StorageBackend class be performant and work whenever possible without root access. In the past, we would provide manual instructions for backing up the default storage backend since that was the most efficient, i.e., by directly going to psql to dump the database and using rsync for the file repository.

ramirezfranciscof commented 1 year ago

> One big challenge will be to have the backup/restore methods of the StorageBackend class be performant and work whenever possible without root access.

Why do you mention this specifically? I would agree that one should try to do as much as possible without root access, but if it is necessary, the user should just be prompted for a password when running the command.

sphuber commented 1 year ago

> Why do you mention this specifically? I would agree that one should try to do as much as possible without root access, but if it is necessary the user should just be prompted for password when running the command.

For the same reason that users often experience problems using verdi quicksetup if they don't have root access. Users on these platforms won't be able to make backups if it requires root access and they don't have it.

ramirezfranciscof commented 1 year ago

Yeah, good point, I forget that users may not have root access on their workstations...

chrisjsewell commented 1 year ago

Heya, I would suggest a possible alternative/complementary solution here: provide functionality to "sync" backend instances. This is effectively what you are doing now when you create/import an archive (since v2 archives are effectively just an instance of a sqlite_zip backend); the limitation at the moment is that you can only create "full" archives, as opposed to having incremental updates.

If you could, for example, sync a "local" psql_dos backend with a "remote" aiida-s3 backend, then you have a backup.

This obviously relates also to https://github.com/aiidateam/aiida-core/issues/4535
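To make the incremental "sync" idea concrete, here is a minimal sketch. It models each backend as a plain dictionary mapping a node identifier to its content, which is a strong simplification and not how any AiiDA backend is implemented; it also assumes that stored entries never change, so only entries missing from the target need to be transferred. The function name sync_backends is hypothetical.

```python
def sync_backends(source: dict[str, bytes], target: dict[str, bytes]) -> list[str]:
    """Copy entries missing from ``target`` and return the keys that were transferred."""
    missing = [key for key in source if key not in target]
    for key in missing:
        target[key] = source[key]
    return missing


local = {"uuid-1": b"structure", "uuid-2": b"calculation"}
remote = {"uuid-1": b"structure"}

# Only the entry the remote does not have yet is transferred.
transferred = sync_backends(local, remote)
```

Running the same call again would transfer nothing, which is what distinguishes an incremental update from recreating a "full" archive each time.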

As for also syncing the configuration and log files, that would be an open question. I think there are already one or more open issues about including the configuration in the archive.

chrisjsewell commented 1 year ago

(my suggestion ☝️ is somewhat alluded to in the initial issue, but I wanted to make it more concrete)

giovannipizzi commented 6 months ago

@eimrek @sphuber this can be closed now?

sphuber commented 6 months ago

I guess the backup part is there, but it stands to be argued that restoring can be made a lot easier.