aiidateam / aiida-core

The official repository for the AiiDA code
https://aiida-core.readthedocs.io
Other
430 stars 186 forks source link

Bug in `aiida.engine.daemon.execmanager.retrieve_files_from_list` #3095

Closed adegomme closed 4 years ago

adegomme commented 5 years ago

I'm getting engine/daemon/execmanager.py", line 464, in retrieve_files_from_list to_append = remote_names.split(os.path.sep)[-depth:] if depth > 0 else [] AttributeError: 'list' object has no attribute 'split'

when I try to retrieve the content of a folder over ssh. It seems that the folder name is turned into a list here: https://github.com/aiidateam/aiida_core/blob/e232f946e2b5c1c55c8f8b2c903f05355fe9deea/aiida/engine/daemon/execmanager.py#L463 and then immediately fed to split, which only operates on strings. Removing the [] around remote_names seems to allow it to go further. Is this correct ?

sphuber commented 5 years ago

Thanks for the report. This definitely seems like a bug. Just for reference, what are the contents of your the retrieve_list? Can you please lode the calculation node that is excepting in a shell and report the output of the following:

node = load_node(<pk>)  # Replace the pk here with the pk of the excepted calculation
print(node.get_retrieve_list())
print(node.get_retrieve_temporary_list())
adegomme commented 5 years ago

retrieve list was [u'log.yaml', u'time.yaml', u'forces', u'final', [u'./debug', u'bigdft-err*', 1], u'_scheduler-stdout.txt', u'_scheduler-stderr.txt'] (the debug folder fails)

I'm able to avoid entering this branch by putting the wildcards in the first item of the tuple, as stated in the documentation ( ["./debug/bigdft-err*",".",1] ), but the bug is still there in the code, I'm just avoiding it.

sphuber commented 5 years ago

Thanks for the update. You are right that moving the wildcard to the first element will avoid the bug because then you do not hit the branch in the if transport.has_magic(tmp_rname): branch. However, the second element should not contain wildcards.

The format of the retrieve_list is not particularly well documented, I admit, but it should adhere to the following rules, as per the aiida.common.datastructures.CalcInfo docstring:

`retrieve_list`: a list of strings or tuples that indicate files that are to be retrieved from the remote
after the calculation has finished and stored in the repository in a FolderData.
If the entry in the list is just a string, it is assumed to be the filepath on the remote and it will
be copied to '.' of the repository with name os.path.split(item)[1]
If the entry is a tuple it is expected to have the following format

    ('remotepath', 'localpath', depth)

If the 'remotepath' is a file or folder, it will be copied in the repository to 'localpath'.
However, if the 'remotepath' contains file patterns with wildcards, the 'localpath' should be set to '.'
and the depth parameter should be an integer that decides the localname. The 'remotepath' will be split on
file separators and the local filename will be determined by joining the N last elements, where N is
given by the depth variable.

Example: ('some/remote/path/files/pattern*[0-9].xml', '.', 2)

Will result in all files that match the pattern to be copied to the local repository with path

    'files/pattern*[0-9].xml'

So the directive [u'./debug', u'bigdft-err*', 1] is not valid. The first element should refer to the remote file path. I assume, the remote folder will contain multiple bigdft-err* files that you would like to retrieve. This should be then the first element. Per the docstring, the second element should then be .. Most likely then, your directive should be ('bigdft-err*', '.', 0).

adegomme commented 5 years ago

Indeed, I agree in the wildcard case I was trying, and this will be enough for me to go on from now. But that doesn't change the fact that the other branch creates a list and immediately tries to split() it (if depth>0), which will never work. It crashes if I set ["a/b", "c", 1], as well.

sphuber commented 5 years ago

Indeed, I agree in the wildcard case I was trying, and this will be enough for me to go on from now. But that doesn't change the fact that the other branch creates a list and immediately tries to split() it (if depth>0), which will never work. It crashes if I set ["a/b", "c", 1], as well.

Absolutely, I will keep this ticket open so a patch can be submitted. Thanks again for the report

sphuber commented 5 years ago

By the way, for posterity and a laugh: you have unearthed a bug that was introduced in this commit and has been there for 6 years, since version v0.2.1. Well done :smile:

sphuber commented 4 years ago

I actually fixed this "accidentally" in PR #4196 . In that PR I removed all remaining files from the pre-commit exclude list, among which was the file that causes the bug describes here. The bug was actually found by pylint and so I fixed it. Of course I didn't think straight away of this issue but only just now it snapped into place. Shows the value of linting and how it can make up (a bit) for bad test coverage. The changes of #4196 will be merged in the next release so I am closing this. I found another bug in the same branch of code, that is due to another reason, for which I have opened #4273

Edt: oh dear lord, as it turns out, in the ultimate irony, the change in #4196 that fixed this bug, is the very reason of #4273 's existence. Time to add some badly needed tests there I guess.