aiidateam / team-compass

A repository for storing the AiiDA team roadmap
https://team-compass.readthedocs.io
MIT License

Usability: Allow pure folder-based access and storage of AiiDA profiles #22

Open giovannipizzi opened 1 month ago

giovannipizzi commented 1 month ago

Motivation

Even advanced users might not fully understand where data is stored (DB, repository, configuration file, ...). This is currently also virtual-environment based. If I want to know all the profiles on my computer, it's hard to keep track of them if I'm not very organised. A user might just want to know that everything about a profile is inside a folder (as happens for a git repo, where everything is inside a .git folder): if I move the folder, I'm moving everything; if I delete the folder, everything is gone; etc.

Desired Outcome

It should be possible to define a profile that is fully confined to a folder (including all data). This should be easy for the SQLite DB at least (see the notes in a comment below for PSQL). It should then be easy to use that profile just by navigating into the folder.

In the future, similar to a git checkout, one could have a way to mirror part of the data in the folder, at least in read-only mode, so people can use the usual file browsers, grep commands, etc. to understand what the folder is about. Syncing to this folder does not have to happen in real time, but can happen when a command is invoked, similar to git pull, e.g. verdi sync .... This could e.g. take the form of an extended version of the verdi process dump command that instead dumps all nodes inside a given group. Then, when a verdi group sync --all command is run, it refreshes the dump files to ensure they are up to date with the AiiDA DB.

Impact

I think many users have a hard time understanding the concept of profiles, where data is stored, how to delete a profile and back it up (even if we provide commands), what to do when disk storage is running low, etc. A folder-based approach can help users a lot when starting with AiiDA, while giving them the feeling of having full control of their data.

Complexity

Progress

Being discussed/brainstorming.

giovannipizzi commented 1 month ago

Here are some steps to create a minimal running PSQL instance in user space, confined to a folder. Perhaps we could consider this, at least for the folder-based approach?

The idea of this message is just to show that it is actually possible to have a folder-based approach even with PSQL, with some caveats.

The steps below create and use a new PSQL DB locally, as a standard user, without ports but just Unix sockets.

STEP 1: create an empty scaffolding folder for PSQL.

As the folder I use pwd for simplicity. It should of course be in a place like ./.aiida/psql_db/

STEP 2: Minimal configuration of the new PSQL instance

I create a sockets dir inside the same folder, make sure that only sockets are used, and make sure that the sockets go in the folder just created.

mkdir sockets_dir
echo "listen_addresses = ''" >> postgresql.conf
echo "unix_socket_directories = '`pwd`/sockets_dir'" >> postgresql.conf

STEP 3: Use this user-space, folder-based PSQL

I can start, check the status, and stop the PSQL server with these commands.

pg_ctl start -l logfile -D `pwd`
pg_ctl -D `pwd` status
pg_ctl -D `pwd` stop

Further notes:

giovannipizzi commented 1 month ago

pinging @mbercx since we discussed this today, @sphuber since we discussed this in the past, and also others like @unkcpz @khsrali @GeigerJ2 @agoscinski

GeigerJ2 commented 1 month ago

Thanks, @giovannipizzi, for the detailed write-up! Some preliminary notes:

giovannipizzi commented 1 month ago

Thanks for the comments! Just a follow-up comment on my own comments. What I wrote was just some thoughts and ideas. I'm happy to discuss whether, for PSQL, it's really safe to put everything in a folder. Maybe it creates more problems if people start to move the folder while the DB is running, etc. To be discussed.
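
As one possible input to that discussion, here is a minimal sketch of a guard that a tool could run before moving or deleting such a folder. It relies on the fact that a running PostgreSQL server keeps a postmaster.pid file in its data directory. The function name is hypothetical and not part of AiiDA, and the check is only a heuristic: a stale postmaster.pid can survive a crash, so pg_ctl status remains the authoritative check.

```python
from pathlib import Path


def postgres_may_be_running(data_dir: Path) -> bool:
    """Heuristic: a PostgreSQL server writes postmaster.pid into its data directory while it runs.

    Hypothetical helper for illustration only. A stale file can remain after a crash,
    so a True result means "check with pg_ctl status before touching the folder".
    """
    return (Path(data_dir) / "postmaster.pid").is_file()
```

A caller could refuse to move the folder, or ask the user to stop the server first, whenever this returns True.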

mbercx commented 4 weeks ago

Thanks @giovannipizzi! Just for context, I'm putting the verdi init PR here, which implemented a git-like folder discovery for the .aiida folder:

https://github.com/aiidateam/aiida-core/pull/6315

As well as my rather extensive objections to this approach.

Just writing down my thoughts quickly, on the train and only have 5 mins. ^^

  1. Is there any reason why we would prefer .aiida-folder discovery over profile-via-folder discovery?
  2. One way I envision this to work is to give the user the option (perhaps literally via an option, but perhaps also as a different storage backend) to create a "localized" or "contained" profile in a folder. This would write a specific file to the top level of that folder (e.g. .aiida_profile) that we could then use to implement git-like directory-based profile discovery. I.e. the precedence would be:

    a. Profile specified in command via -p option.
    b. Folder-based discovery of the .aiida_profile file.
    c. Default configured profile in the .aiida directory.
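
The precedence above could be sketched roughly as follows. This is only an illustration, not the implementation in aiida-core: the function names are hypothetical, and it assumes that the .aiida_profile marker file simply contains the profile name, which the proposal does not specify.

```python
from __future__ import annotations

from pathlib import Path

MARKER = ".aiida_profile"  # marker file name taken from the proposal; its content format is an assumption


def find_marker(start: Path) -> Path | None:
    """Walk from `start` up to the filesystem root and return the nearest marker file, if any."""
    start = Path(start).resolve()
    for folder in (start, *start.parents):
        candidate = folder / MARKER
        if candidate.is_file():
            return candidate
    return None


def resolve_profile(cli_profile: str | None, cwd: Path, default_profile: str | None) -> str | None:
    """Pick a profile name: (a) explicit -p value, (b) folder discovery, (c) configured default."""
    if cli_profile is not None:
        return cli_profile
    marker = find_marker(cwd)
    if marker is not None:
        # Assumption: the file holds just the profile name.
        name = marker.read_text().strip()
        if name:
            return name
    return default_profile
```

Because the search walks up through the parent folders, running a command from any subfolder of a "contained" profile resolves to the same profile, which is the git-like behaviour being discussed.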

giovannipizzi commented 3 weeks ago

Hi, I have no objection to using a different file/folder name for this (instead of .aiida), but I'd need to rediscuss why this is a problem (or read your objections again, if it's explained there). But for 1, I think it's just more intuitive for people who want to work in parallel with multiple profiles. I think that at the moment most people use just 1 profile because switching is not trivial (it takes a lot of time to set up correctly, and you need to use a different terminal for each). With the folder-based approach, we mirror git: no need to open a new terminal, just change folder (even to a subfolder) and everything will apply to the new repository. Very simple, intuitive, and people are used to this with git and other tools, so it shouldn't be surprising behaviour. And no need for complex setup.

It comes of course with implications for supporting work on various profiles even if they were created with different AiiDA versions, without automated changes to profile files, etc. But I think it's OK: we can probably even commit to not making any migration within major versions, as well as having backward-compatible profile files within major versions (and in any case avoiding automatic migration of those, but warning users that they are using a profile from an old, or too new, AiiDA version, with suggestions on how to proceed).
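
A rough sketch of the kind of check this would imply, assuming (hypothetically) that each profile file records the AiiDA version that created it as a dotted version string. The function name and return values are made up for illustration.

```python
def compare_profile_version(profile_version: str, installed_version: str) -> str:
    """Compare major versions only, since profiles are assumed compatible within a major version.

    Returns "compatible", "profile_too_old" or "profile_too_new". Nothing is migrated:
    the caller is expected to warn the user and suggest how to proceed.
    """
    profile_major = int(profile_version.split(".")[0])
    installed_major = int(installed_version.split(".")[0])
    if profile_major < installed_major:
        return "profile_too_old"
    if profile_major > installed_major:
        return "profile_too_new"
    return "compatible"
```

With this policy, only a mismatch in the major version would trigger a warning; differences in minor or patch versions would be accepted silently.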