Closed adegomme closed 4 years ago
Thanks for the report. This definitely seems like a bug. Just for reference, what are the contents of your the retrieve_list
? Can you please lode the calculation node that is excepting in a shell and report the output of the following:
node = load_node(<pk>) # Replace the pk here with the pk of the excepted calculation
print(node.get_retrieve_list())
print(node.get_retrieve_temporary_list())
retrieve list was [u'log.yaml', u'time.yaml', u'forces', u'final', [u'./debug', u'bigdft-err*', 1], u'_scheduler-stdout.txt', u'_scheduler-stderr.txt'] (the debug folder fails)
I'm able to avoid entering this branch by putting the wildcards in the first item of the tuple, as stated in the documentation ( ["./debug/bigdft-err*",".",1] ), but the bug is still there in the code, I'm just avoiding it.
Thanks for the update. You are right that moving the wildcard to the first element will avoid the bug because then you do not hit the branch in the if transport.has_magic(tmp_rname):
branch. However, the second element should not contain wildcards.
The format of the retrieve_list
is not particularly well documented, I admit, but it should adhere to the following rules, as per the aiida.common.datastructures.CalcInfo
docstring:
`retrieve_list`: a list of strings or tuples that indicate files that are to be retrieved from the remote
after the calculation has finished and stored in the repository in a FolderData.
If the entry in the list is just a string, it is assumed to be the filepath on the remote and it will
be copied to '.' of the repository with name os.path.split(item)[1]
If the entry is a tuple it is expected to have the following format
('remotepath', 'localpath', depth)
If the 'remotepath' is a file or folder, it will be copied in the repository to 'localpath'.
However, if the 'remotepath' contains file patterns with wildcards, the 'localpath' should be set to '.'
and the depth parameter should be an integer that decides the localname. The 'remotepath' will be split on
file separators and the local filename will be determined by joining the N last elements, where N is
given by the depth variable.
Example: ('some/remote/path/files/pattern*[0-9].xml', '.', 2)
Will result in all files that match the pattern to be copied to the local repository with path
'files/pattern*[0-9].xml'
So the directive [u'./debug', u'bigdft-err*', 1]
is not valid. The first element should refer to the remote file path. I assume, the remote folder will contain multiple bigdft-err*
files that you would like to retrieve. This should be then the first element. Per the docstring, the second element should then be .
. Most likely then, your directive should be ('bigdft-err*', '.', 0)
.
Indeed, I agree in the wildcard case I was trying, and this will be enough for me to go on from now. But that doesn't change the fact that the other branch creates a list and immediately tries to split() it (if depth>0), which will never work. It crashes if I set ["a/b", "c", 1], as well.
Indeed, I agree in the wildcard case I was trying, and this will be enough for me to go on from now. But that doesn't change the fact that the other branch creates a list and immediately tries to split() it (if depth>0), which will never work. It crashes if I set ["a/b", "c", 1], as well.
Absolutely, I will keep this ticket open so a patch can be submitted. Thanks again for the report
By the way, for posterity and a laugh: you have unearthed a bug that was introduced in this commit and has been there for 6 years, since version v0.2.1
. Well done :smile:
I actually fixed this "accidentally" in PR #4196 . In that PR I removed all remaining files from the pre-commit exclude list, among which was the file that causes the bug describes here. The bug was actually found by pylint
and so I fixed it. Of course I didn't think straight away of this issue but only just now it snapped into place. Shows the value of linting and how it can make up (a bit) for bad test coverage. The changes of #4196 will be merged in the next release so I am closing this. I found another bug in the same branch of code, that is due to another reason, for which I have opened #4273
Edt: oh dear lord, as it turns out, in the ultimate irony, the change in #4196 that fixed this bug, is the very reason of #4273 's existence. Time to add some badly needed tests there I guess.
I'm getting engine/daemon/execmanager.py", line 464, in retrieve_files_from_list to_append = remote_names.split(os.path.sep)[-depth:] if depth > 0 else [] AttributeError: 'list' object has no attribute 'split'
when I try to retrieve the content of a folder over ssh. It seems that the folder name is turned into a list here: https://github.com/aiidateam/aiida_core/blob/e232f946e2b5c1c55c8f8b2c903f05355fe9deea/aiida/engine/daemon/execmanager.py#L463 and then immediately fed to split, which only operates on strings. Removing the [] around remote_names seems to allow it to go further. Is this correct ?