aiidateam / aiida-sssp-workflow

sssp verification workflows
MIT License

Revisit clean policy to avoid too many inodes on remote #146

Closed unkcpz closed 2 years ago

unkcpz commented 2 years ago

fixes #144

Deciding when to clean the remote folders in pseudopotential verification is delicate. The verification consists of workflows with many subprocesses that create a huge number of remote files, even though the individual calculations are small and not resource-intensive. If the work chains are not cleaned, the number of remote files grows rapidly and the disk quota on the remote machine soon runs out. Conversely, cleaning too early prevents the caching mechanism from being used and therefore wastes resources on repeating identical calculations. To trade off these two conflicting concerns, the clean policy is designed as:

This partially solves the caching issue of the bands calculation (issue #138). However, as with the ph.x calculation, if a subsequent step fails after the previous step has been cleaned, it will still fail to continue. For phonons, the workaround is to check whether the remote folder is empty. The cost is that even when the ph.x calculation finished OK, cleaning the previous step forces the preparatory scf calculation to run again, to make sure ph.x gets a non-empty parent_folder.
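The empty-folder check behind the phonon workaround can be sketched roughly as below. This is only an illustration under stated assumptions: a local directory stands in for the remote folder, and the helper names `folder_is_empty`, `plan_phonon_steps` and `demo` are hypothetical, not taken from the workflow code.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def folder_is_empty(folder: Path) -> bool:
    """Return True if the folder is missing or contains no entries."""
    return not folder.is_dir() or not any(folder.iterdir())


def plan_phonon_steps(parent_folder: Path) -> List[str]:
    """Decide which steps to run, given the folder left by the scf step.

    If the folder has been cleaned (is empty), the scf step is repeated
    so that ph.x receives a non-empty parent folder.
    """
    if folder_is_empty(parent_folder):
        return ["scf", "ph"]
    return ["ph"]


def demo() -> Tuple[List[str], List[str]]:
    """Compare the plan for a cleaned folder and for a populated one."""
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        cleaned = plan_phonon_steps(folder)  # folder is empty: scf was cleaned
        (folder / "scf.out").write_text("placeholder scf output")
        kept = plan_phonon_steps(folder)  # scf files are still present
    return cleaned, kept
```

In this sketch the extra scf run is triggered only by the state of the folder, which mirrors the cost described above: a cleaned folder means the preparatory step is repeated.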

The clean policy of the big verification work chain is controlled by test_mode: it either cleans as described above, or cleans nothing so that the remote folders can be inspected afterwards.

This PR includes two major changes.

  1. The clean policy as described above.
  2. Using update_dict as a deepcopy for all dict copies in the workflow, except for the AttributesDict returned by exposed_inputs, because deep-copying that would clone the data nodes, which is not the expected behaviour.
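As a rough illustration of the second change, a recursive merge that always returns a deep copy could look like the sketch below. The body of `update_dict` here is an assumption, since the PR text does not show the repository's implementation, and the `pw`/`ecutwfc` keys are placeholder inputs.

```python
import copy
from collections.abc import Mapping


def update_dict(base: dict, updates: Mapping) -> dict:
    """Return a deep copy of `base` with `updates` merged in recursively.

    Neither argument is modified, so calling it with an empty `updates`
    behaves like a plain deep copy.
    """
    result = copy.deepcopy(base)
    for key, value in updates.items():
        if isinstance(value, Mapping) and isinstance(result.get(key), dict):
            # Merge nested mappings level by level instead of overwriting.
            result[key] = update_dict(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Placeholder default inputs (hypothetical values).
defaults = {"pw": {"parameters": {"SYSTEM": {"ecutwfc": 30}}}}

# Copy without sharing nested dicts, then modify only the copy.
copied = update_dict(defaults, {})
copied["pw"]["parameters"]["SYSTEM"]["ecutwfc"] = 60

# Merge an override into a fresh copy of the defaults.
merged = update_dict(defaults, {"pw": {"parameters": {"SYSTEM": {"ecutrho": 240}}}})
```

This works for plain Python dicts. A container holding data nodes is a different case: as noted in the list above, deep-copying it would clone the nodes, so it is deliberately left out.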