aiidateam / aiida-quantumespresso

The official AiiDA plugin for Quantum ESPRESSO
https://aiida-quantumespresso.readthedocs.io
Other
55 stars 80 forks source link

Use symlinks for pw2wannier90 input #618

Open greschd opened 3 years ago

greschd commented 3 years ago

Following the discussion in #565, if a code does not modify the input files it gets from the parent files, a symlink can be used instead of copying the files.

I did some preliminary testing on pw2wannier90.x: It temporarily creates aiida.wfc1, ..., aiida.wfcN files in the out directory (the "distributed wave function" files). Other than that, the content of the out folder is not modified.

As such, it seems to me using symlinks is safe, at least for newer QE versions where the wf_collect=False mode is not implemented (see https://www.quantum-espresso.org/Doc/INPUT_PW.html#idm68).

@giovannipizzi @sphuber do you have any ideas for additional checks / tests to make before implementing this?

greschd commented 3 years ago

Implementing the settings['PARENT_FOLDER_SYMLINK'] option for pw2wannier90 would get us mostly there, but as @giovannipizzi mentioned in the discussion of #565, symlinking each file separately would be safer.

giovannipizzi commented 3 years ago

According to your tests, indeed we have to create the out folder by hand, and symlink each file. We cannot symlink the whole folder unfortunately :-( Two pw2wannier90 runs running at the same time (quite possible, maybe you try different projections in parallel) would both write aiida.wfc* files in the out folder of the parent (which is already bad by itself, as it modifies the parent folder; and worse here since the two running pw2wan would overwrite each other's files

greschd commented 3 years ago

Yes, I had just realized the same. I guess this would need some infrastructure on the aiida-core side?

greschd commented 3 years ago

Alright, I think I've got a proof of concept hacked together. In aiida-core, we need to change the symlink method to automatically create directory of the target symlink:

diff --git a/aiida/transports/plugins/ssh.py b/aiida/transports/plugins/ssh.py
index 0ca09c3b4..0ee678218 100644
--- a/aiida/transports/plugins/ssh.py
+++ b/aiida/transports/plugins/ssh.py
@@ -1315,10 +1315,11 @@ class SshTransport(Transport):  # pylint: disable=too-many-public-methods
                 # if there are patterns in dest, I don't know which name to assign
                 raise ValueError('Remotedestination cannot have patterns')

+            self.makedirs(dest, ignore_existing=True)
             # find all files matching pattern
             for this_source in self.glob(source):
                 # create the name of the link: take the last part of the path
-                this_dest = os.path.join(remotedestination, os.path.split(this_source)[-1])
+                this_dest = os.path.join(dest, os.path.split(this_source)[-1])
                 self.sftp.symlink(this_source, this_dest)
         else:
             self.sftp.symlink(source, dest)

And then in aiida-quantumespresso, we just need to add <source_path>/out/* to the symlink list instead of <source_path>/out.

diff --git a/aiida_quantumespresso/calculations/namelists.py b/aiida_quantumespresso/calculations/namelists.py
index 8d981b9..5212fef 100644
--- a/aiida_quantumespresso/calculations/namelists.py
+++ b/aiida_quantumespresso/calculations/namelists.py
@@ -178,18 +178,23 @@ class NamelistsCalculation(CalcJob):
         local_copy_list = []
         remote_symlink_list = []

-        ptr = remote_symlink_list if symlink else remote_copy_list
-
         # copy remote output dir, if specified
         parent_calc_folder = self.inputs.get('parent_folder', None)
         if parent_calc_folder is not None:
             if isinstance(parent_calc_folder, RemoteData):
                 parent_calc_out_subfolder = settings.pop('PARENT_CALC_OUT_SUBFOLDER', self._INPUT_SUBFOLDER)
-                ptr.append((
-                    parent_calc_folder.computer.uuid,
-                    os.path.join(parent_calc_folder.get_remote_path(), parent_calc_out_subfolder),
-                    self._OUTPUT_SUBFOLDER
-                ))
+                if symlink:
+                    remote_symlink_list.append((
+                        parent_calc_folder.computer.uuid,
+                        os.path.join(parent_calc_folder.get_remote_path(), parent_calc_out_subfolder, '*'),
+                        self._OUTPUT_SUBFOLDER
+                    ))
+                else:
+                    remote_copy_list.append((
+                        parent_calc_folder.computer.uuid,
+                        os.path.join(parent_calc_folder.get_remote_path(), parent_calc_out_subfolder),
+                        self._OUTPUT_SUBFOLDER
+                    ))
             elif isinstance(parent_calc_folder, FolderData):
                 for filename in parent_calc_folder.list_object_names():
                     local_copy_list.append((

@giovannipizzi what do you think of this? Any idea how we could get this to work without the change to aiida-core?

Because the NamelistCalculation also affects other calculation types, maybe there should be some way to turn on / off this behavior.

giovannipizzi commented 3 years ago

Thanks @greschd !

A few comments and further suggestions:

What do people think about this suggestion? Pinging a few people @sphuber @chrisjsewell - if we like the idea we should move this to aiida-core and fix it there first