CharlieLaughton closed 8 years ago
Thanks for opening the sandbox! Alas, access to /work/e290/e290/e290cl/ is denied. Could you please open that for read/exec too (and the other intermediate directories, I assume)? Thanks!
This seems to have been because of some older modules being used. I have made the change now.
I was looking at the Jenkins runs for the status. Surprisingly, it shows success when the runs actually failed! http://ci.radical-project.org/job/ExTASY-0.2/ I think it might have to do with the error codes when the pilot fails.
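To illustrate the exit-code point: a Jenkins shell build step normally marks the build as failed only when the command exits with a non-zero status. Below is a minimal sketch of a wrapper that maps reported states to an exit code. The state names and the `exit_code_for` helper are hypothetical, made up for illustration, and are not taken from the ExTASY or RADICAL-Pilot code.

```python
# Hypothetical state names; the real pilot/unit state strings may differ.
FAILURE_STATES = {"Failed", "Canceled"}


def exit_code_for(states):
    """Return 1 if any reported state is a failure state, otherwise 0."""
    return 1 if any(state in FAILURE_STATES for state in states) else 0


def main(states):
    """Compute the status a CI wrapper script would pass to sys.exit()."""
    code = exit_code_for(states)
    if code != 0:
        print("at least one pilot or unit failed; reporting failure to CI")
    return code
```

A wrapper script would end with `sys.exit(main(states))`, so that a failed pilot turns into a failed build instead of a green one.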
The older modules are numpy/1.8.0 and scipy/0.13.* on Archer, which have been removed.
But I am not sure why the compute unit error reads "ERROR : ComputeUnit error: STDERR: unit stderr contains binary data -- use file staging directives, STDOUT: unit stderr contains binary data -- use file staging directives"
This is the STDERR of the failing CU:
pc-numpy(3):ERROR:105: Unable to locate a modulefile for 'pc-numpy/1.8.0-libsci'
pc-scipy/0.13.3-libsci(11):ERROR:151: Module 'pc-scipy/0.13.3-libsci' depends on one of the module(s) 'ðÌ'
pc-scipy/0.13.3-libsci(11):ERROR:102: Tcl command execution failed: prereq pc-numpy/1.8.0-libsci
pc-coco/0.25(8):ERROR:151: Module 'pc-coco/0.25' depends on one of the module(s) 'pc-numpy/1.9.2-mkl-python3 pc-numpy/1.9.2-mkl pc-numpy/1.9.2-libsci'
pc-coco/0.25(8):ERROR:102: Tcl command execution failed: prereq pc-numpy
aprun: file pyCoCo not found
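For anyone triaging similar logs, here is a small sketch that pulls the module errors out of a STDERR dump like the one above. It assumes the `name(N):ERROR:code: message` layout seen in these lines; the number in parentheses is presumably a line number in the modulefile, which is my assumption. The `parse_module_errors` helper is hypothetical and not part of any existing tool.

```python
import re

# Matches lines such as:
#   pc-numpy(3):ERROR:105: Unable to locate a modulefile for '...'
# The layout is inferred from the log above, not from a specification.
MODULE_ERROR = re.compile(
    r"^(?P<module>\S+)\((?P<lineno>\d+)\):ERROR:(?P<code>\d+): (?P<message>.*)$"
)


def parse_module_errors(stderr_text):
    """Return a list of (module, error_code, message) tuples found in stderr_text."""
    errors = []
    for line in stderr_text.splitlines():
        match = MODULE_ERROR.match(line.strip())
        if match:
            errors.append(
                (match.group("module"), int(match.group("code")), match.group("message"))
            )
    return errors
```

Lines that do not follow this layout, such as the final `aprun` message, are simply skipped.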
Vivek, have a look at the end of line 2:
depends on one of the module(s) 'ðÌ'
That looks.... unexpected...
Yeah, that's an error from Archer. I think it's trying to print a module name which doesn't exist.
Yes, that is my assumption as well. Is it possible to reproduce the error message on command line, using the same 'module load' commands? Would it be useful to get feedback from the archer admins?
Yes, it can be reproduced on command line, but only when we try to load the old modules. Do you think that is the cause of the reported error (ERROR : ComputeUnit error: STDERR: unit stderr contains binary data -- use file staging directives, STDOUT: unit stderr contains binary data -- use file staging directives)?
Do you think that is the cause of the reported error?
Yes - that output looks very binary to me. I would advise opening a ticket...
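To make the "looks binary" point concrete, here is a rough sketch of the kind of check that could flag such output. This is an assumption about how a check like this might work, not the actual RADICAL-Pilot implementation, and `looks_binary` is a hypothetical name. It treats output as binary if it contains a NUL byte or any byte outside plain ASCII.

```python
def looks_binary(data: bytes) -> bool:
    """Heuristic: treat data as binary if it has NUL bytes or non-ASCII bytes."""
    if b"\x00" in data:
        return True
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return True
    return False


# The two odd characters in the module error, assuming they were single
# Latin-1 bytes (0xF0 0xCC) in the raw STDERR stream.
garbled = "depends on one of the module(s) '\u00f0\u00cc'".encode("latin-1")
clean = b"aprun: file pyCoCo not found\n"
```

Under this heuristic `garbled` is flagged and `clean` is not, which would be consistent with two stray bytes in the module error being enough to trigger the "contains binary data" message.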
This should be fixed now. I don't see the binary output on Archer.
I am trying to run the Amber/CoCo job described in the tutorial on Archer. Except for setting my username and account I have touched nothing. The job fails with the following messages:
I have made the pilot job folder on Archer publicly accessible: