radical-cybertools / ExTASY

gromacs/lsdmap on ARCHER issue #56

Closed ebreitmo closed 9 years ago

ebreitmo commented 9 years ago

On ARCHER I get this error

(myExtasy)ebreitmo@eslogin005:~> .local/bin/extasy --RPconfig /home/e290/e290/ebreitmo/ExTASY/config/RP_config.py --Kconfig /home/e290/e290/ebreitmo/ExTASY/config/gromacs_lsdmap_config.py
Session UID: 542972d5a7ed1d11a77fc1e0
Pilot UID : 542972d8a7ed1d11a77fc1e2
Loading kernel configurations from /home4/e290/e290/ebreitmo/.local/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/sleep.json
Loading kernel configurations from /home4/e290/e290/ebreitmo/.local/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/coco.json
Loading kernel configurations from /home4/e290/e290/ebreitmo/.local/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/gromacs.json
Loading kernel configurations from /home4/e290/e290/ebreitmo/.local/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/mmpbsa.json
Loading kernel configurations from /home4/e290/e290/ebreitmo/.local/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/lsdmap.json
Loading kernel configurations from /home4/e290/e290/ebreitmo/.local/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/namd.json
Loading kernel configurations from /home4/e290/e290/ebreitmo/.local/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/amber.json
Loading kernel configurations from /home4/e290/e290/ebreitmo/.local/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/test.json
Prepare grofiles..
Traceback (most recent call last):
  File ".local/bin/extasy", line 9, in <module>
    load_entry_point('radical.ensemblemd.extasy==0.1', 'console_scripts', 'extasy')()
  File "/home4/e290/e290/ebreitmo/.local/lib/python2.7/site-packages/radical.ensemblemd.extasy-0.1-py2.7.egg/radical/ensemblemd/extasy/bin/runme.py", line 127, in main
    Preprocessing(Kconfig,umgr,i)
  File "/home4/e290/e290/ebreitmo/.local/lib/python2.7/site-packages/radical.ensemblemd.extasy-0.1-py2.7.egg/radical/ensemblemd/extasy/bin/Preprocessor/Gromacs/preprocessor.py", line 32, in Preprocessing
    grofile_obj = gro.GroFile(os.path.dirname(os.path.realpath(__file__)) + '/' + grofile_name)
  File "/home4/e290/e290/ebreitmo/.local/lib/python2.7/site-packages/radical.ensemblemd.extasy-0.1-py2.7.egg/radical/ensemblemd/extasy/bin/Preprocessor/Gromacs/gro.py", line 11, in __init__
    self.natoms=self.get_natoms()
  File "/home4/e290/e290/ebreitmo/.local/lib/python2.7/site-packages/radical.ensemblemd.extasy-0.1-py2.7.egg/radical/ensemblemd/extasy/bin/Preprocessor/Gromacs/gro.py", line 17, in get_natoms
    natoms=int(linecache.getline(self.filename, 2))
ValueError: invalid literal for int() with base 10: ''

Looks like some input file issue?!

vivek-bala commented 9 years ago

Hmm.. Could you post the contents of gromacs_lsdmap_config.py? I've seen this error when an input file is not at the location given in the config file. Could you check, please?
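
A quick way to check this before launching is to load the Kconfig file and test each configured input path. The sketch below is only an illustration: `missing_inputs` and `INPUT_PAIRS` are hypothetical names, the key pairs are copied from the config posted in this thread, and it assumes the Kconfig is a plain Python file that can be executed with the standard-library `runpy.run_path`.

```python
import os
import runpy

# (location key, file-name key) pairs as they appear in the config
# posted in this thread; other versions of the config may differ.
INPUT_PAIRS = [
    ('input_gro_loc', 'input_gro'),
    ('grompp_loc', 'grompp_name'),
    ('topol_loc', 'topol_name'),
    ('lsdm_config_loc', 'lsdm_config_name'),
]


def missing_inputs(kconfig_path):
    """Return the configured input files that do not exist on disk."""
    # run_path executes the file and returns its global names as a dict.
    cfg = runpy.run_path(kconfig_path)
    missing = []
    for loc_key, name_key in INPUT_PAIRS:
        loc = cfg.get(loc_key, '')
        name = cfg.get(name_key, '')
        if not name:
            continue  # entry absent or left empty in the config
        path = os.path.join(loc, name)
        # The path is checked exactly as written: '$HOME' is NOT expanded.
        if not os.path.isfile(path):
            missing.append(path)
    return missing
```

Calling `missing_inputs('gromacs_lsdmap_config.py')` would list every path that does not exist as written, including any that still contain a literal `$HOME`.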

vivek-bala commented 9 years ago

Also, gromacs is not functional on ARCHER, since gromacs 4.6 is not available on the compute nodes there. Only gromacs 5.0 is available, and that expects some additional parameters.

ebreitmo commented 9 years ago

Here is gromacs_lsdmap_config.py as I use it on ARCHER:

__author__ = 'ebreitmo'

#--------------------------General--------------------------------

num_CUs = 64  # num of CUs
num_iterations = 4
start_iter = 0
nsave = 2

#--------------------------Simulation--------------------------------

input_gro_loc = '$HOME/ExTASY/gromacs_lsdmap_example'
input_gro = 'input.gro'

grompp_loc = '$HOME/ExTASY/gromacs_lsdmap_example'
grompp_name = 'grompp.mdp'

topol_loc = '$HOME/ExTASY/gromacs_lsdmap_example'
topol_name = 'topol.top'

ndxfile_loc = ''
ndxfile_name = ''

grompp_options = ''
mdrun_options = ''

itpfile_loc = ''

tmp_grofile = 'tmp.gro'

#--------------------------Analysis----------------------------------

lsdm_config_loc = '$HOME/ExTASY/gromacs_lsdmap_example'
lsdm_config_name = 'config.ini'

system_name = 'out'

outgrofile_name = '%s.gro' % system_name
egfile = '%s.eg' % system_name
evfile = '%s.ev' % system_name
nearest_neighbor_file = '%s.nn' % system_name

num_runs = 10000
num_clone_files = '%s.nc' % system_name

recovery_flag = 0

wfile = 'weight.w'

max_alive_neighbors = ''
max_dead_neighbors = ''

vivek-bala commented 9 years ago

I gave it a try as well and get the same error when using '$HOME' in the paths to the input files. Replacing them with the full path (e.g. /home/vivek/ExTASY/config), without '$HOME', worked. Could you try full paths for each of the input files as well?
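
One plausible reading of the traceback, sketched below under that assumption: Python does not expand `$HOME` in a plain string path, so the file is not found, `linecache.getline` quietly returns an empty string for a file it cannot read, and `int('')` raises the ValueError from the original report. `read_natoms` is a hypothetical helper, not ExTASY code; it expands variables with `os.path.expandvars` and fails with a clearer message.

```python
import linecache
import os


def read_natoms(path):
    """Return the atom count from line 2 of a .gro file (hypothetical helper)."""
    # Expand $VAR-style variables and a leading ~ explicitly; open() and
    # linecache treat them as literal characters otherwise.
    resolved = os.path.expanduser(os.path.expandvars(path))
    if not os.path.isfile(resolved):
        raise IOError("input file not found: %s" % resolved)
    line = linecache.getline(resolved, 2).strip()
    if not line:
        raise ValueError("no atom count on line 2 of %s" % resolved)
    return int(line)


# linecache.getline does not raise for a missing file; it returns ''.
missing_line = linecache.getline('$HOME/does/not/exist/input.gro', 2)

# Converting that empty string reproduces the error type from the report.
try:
    int(missing_line)
    reproduced = False
except ValueError:
    reproduced = True
```

The explicit existence check is a design choice: it turns a confusing conversion error into a message that names the path that could not be found.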

oleweidner commented 9 years ago

Please test this again following the updated README in 'devel'; the configuration files have changed. However, $HOME is indeed not supported. Please use either the full path or, as in the updated examples, a relative path, e.g. "./md.in".
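
As a rough illustration of the two path styles, here is a hypothetical config fragment. The key names are taken from the config posted earlier in this thread and may not match the updated files in 'devel'; the absolute path is a placeholder guessed from the command line in the original report, and how relative paths are resolved is an assumption.

```python
# Hypothetical Kconfig fragment; key names follow the config posted
# earlier in this thread and may differ in the 'devel' branch.

# Style 1: full (absolute) path written out, with no shell variables.
# The directory below is a placeholder, not a verified location.
input_gro_loc = '/home/e290/e290/ebreitmo/ExTASY/gromacs_lsdmap_example'
input_gro = 'input.gro'

# Style 2: relative path. Assumed here to be resolved against the
# directory from which extasy is started.
grompp_loc = '.'
grompp_name = 'grompp.mdp'
```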

antonst commented 9 years ago

Just tried; this results in a bunch of CUs failing, e.g.:

Session UID: 542b0074198220453136d838
Pilot UID : 542b0075198220453136d83a
[Callback]: ComputePilot '542b0075198220453136d83a' state changed to Launching.
Loading kernel configurations from /tmp/ExTASY-tools/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/coco.json
Loading kernel configurations from /tmp/ExTASY-tools/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/namd.json
Loading kernel configurations from /tmp/ExTASY-tools/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/gromacs.json
Loading kernel configurations from /tmp/ExTASY-tools/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/test.json
Loading kernel configurations from /tmp/ExTASY-tools/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/sleep.json
Loading kernel configurations from /tmp/ExTASY-tools/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/lsdmap.json
Loading kernel configurations from /tmp/ExTASY-tools/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/mmpbsa.json
Loading kernel configurations from /tmp/ExTASY-tools/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/amber.json
Prepare grofiles..
Preprocessing time : 0.458698034286
Cycle 0
Starting Simulation
[Callback]: ComputeUnit '542b007b198220453136d85f' state changed to PendingInputStaging.
[Callback]: ComputeUnit '542b007b198220453136d864' state changed to PendingInputStaging.
[Callback]: ComputeUnit '542b007a198220453136d841' state changed to PendingInputStaging.
[Callback]: ComputeUnit '542b007a198220453136d84c' state changed to PendingInputStaging.
[Callback]: ComputeUnit '542b007b198220453136d879' state changed to PendingInputStaging.
[Callback]: ComputeUnit '542b007a198220453136d855' state changed to PendingInputStaging.
[Callback]: ComputeUnit '542b007b198220453136d878' state changed to PendingInputStaging.
[Callback]: ComputeUnit '542b007a198220453136d856' state changed to PendingInputStaging.
[Callback]: ComputeUnit '542b007a198220453136d842' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d865' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d843' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d877' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d875' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d863' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d857' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d871' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d866' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d844' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d873' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d858' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d860' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d867' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d854' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d84d' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d845' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d86a' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d859' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d868' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d852' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d870' state changed to PendingInputStaging. 
[Callback]: ComputeUnit '542b007a198220453136d846' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d86f' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d83d' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d874' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d86e' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d85a' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d87b' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d84f' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d847' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d869' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d84e' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d85b' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d86c' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d83e' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d848' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d872' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d850' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d851' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d83f' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d862' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d83c' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d849' state changed to PendingInputStaging. 
[Callback]: ComputeUnit '542b007b198220453136d87a' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d85d' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d861' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d876' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d86b' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d840' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d84a' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d85e' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d86d' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d84b' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d85c' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007a198220453136d853' state changed to PendingInputStaging. [Callback]: ComputeUnit '542b007b198220453136d85f' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d864' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d860' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d85f' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d867' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d854' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d864' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d84d' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d860' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d845' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d86a' state changed to StagingInput. 
[Callback]: ComputePilot '542b0075198220453136d83a' state changed to PendingActive. [Callback]: ComputeUnit '542b007b198220453136d867' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d859' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d868' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d854' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d852' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d870' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d84d' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d846' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d86f' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d83d' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d845' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d874' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d86a' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d86e' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d85a' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d859' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d87b' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d84f' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d868' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d847' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d869' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d84e' state changed to StagingInput. 
[Callback]: ComputeUnit '542b007a198220453136d852' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d85b' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d870' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d86c' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d83e' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d846' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d848' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d872' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d86f' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d850' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d851' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d841' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d83d' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d84c' state changed to StagingInput. [Callback]: ComputePilot '542b0075198220453136d83a' state changed to Active. [Callback]: ComputeUnit '542b007b198220453136d879' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d85f' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d864' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d85f' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d841' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d864' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d855' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d841' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d841' state changed to Executing. 
[Callback]: ComputeUnit '542b007b198220453136d860' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d860' state changed to Executing. [Callback]: ComputeUnit '542b007b198220453136d867' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d867' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d854' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d854' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d84d' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d84d' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d845' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d845' state changed to Executing. [Callback]: ComputeUnit '542b007b198220453136d86a' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d86a' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d859' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d859' state changed to Executing. [Callback]: ComputeUnit '542b007b198220453136d868' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d868' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d852' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d852' state changed to Executing. [Callback]: ComputeUnit '542b007b198220453136d870' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d870' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d846' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d846' state changed to Executing. [Callback]: ComputeUnit '542b007b198220453136d86f' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d86f' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d83d' state changed to Scheduling. 
[Callback]: ComputeUnit '542b007a198220453136d83d' state changed to Executing. [Callback]: ComputeUnit '542b007b198220453136d878' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d84c' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d856' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d84c' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d84c' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d842' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d879' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d865' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d879' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d879' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d843' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d855' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d877' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d855' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d855' state changed to Executing. [Callback]: ComputeUnit '542b007b198220453136d875' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d85f' state changed to Failed. Log: All FTW Input Staging Directives done - 1. [Callback]: ComputeUnit '542b007b198220453136d878' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d863' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d878' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d878' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d857' state changed to StagingInput. 
[Callback]: ComputeUnit '542b007a198220453136d856' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d871' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d856' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d856' state changed to Executing. [Callback]: ComputeUnit '542b007b198220453136d866' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d842' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d844' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d842' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d842' state changed to Executing. [Callback]: ComputeUnit '542b007b198220453136d873' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d865' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d858' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d865' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d865' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d83f' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d855' state changed to Failed. Log: All FTW Input Staging Directives done - 1. [Callback]: ComputeUnit '542b007a198220453136d843' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d862' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d843' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d843' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d83c' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d877' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d849' state changed to StagingInput. 
[Callback]: ComputeUnit '542b007b198220453136d877' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d877' state changed to Executing. [Callback]: ComputeUnit '542b007b198220453136d87a' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d875' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d85d' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d875' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d875' state changed to Executing. [Callback]: ComputeUnit '542b007b198220453136d861' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d863' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d876' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d863' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d86b' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d845' state changed to Failed. Log: All FTW Input Staging Directives done - 1. [Callback]: ComputeUnit '542b007b198220453136d863' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d857' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d840' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d857' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d84a' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d871' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d85e' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d871' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d86d' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d866' state changed to PendingExecution. 
[Callback]: ComputeUnit '542b007a198220453136d84b' state changed to StagingInput. [Callback]: ComputeUnit '542b007b198220453136d866' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d85c' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d844' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d853' state changed to StagingInput. [Callback]: ComputeUnit '542b007a198220453136d844' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d878' state changed to Failed. Log: All FTW Input Staging Directives done - 1. [Callback]: ComputeUnit '542b007b198220453136d866' state changed to Executing. [Callback]: ComputeUnit '542b007b198220453136d873' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d858' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d83f' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d862' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d83c' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d849' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d873' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d87a' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d85d' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d858' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d861' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d83f' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d876' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d862' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d86b' state changed to PendingExecution. 
[Callback]: ComputeUnit '542b007a198220453136d840' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d83c' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d84a' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d849' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d85e' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d86d' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d87a' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d84b' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d85d' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d85c' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d853' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d861' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d874' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d876' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d86e' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d85a' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d86b' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d87b' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d840' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d84f' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d84a' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d847' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d869' state changed to PendingExecution. 
[Callback]: ComputeUnit '542b007a198220453136d85e' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d84e' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d86d' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d85b' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d86c' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d84b' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d83e' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d85c' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d848' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d853' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d872' state changed to PendingExecution. [Callback]: ComputeUnit '542b007a198220453136d850' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d874' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d851' state changed to PendingExecution. [Callback]: ComputeUnit '542b007b198220453136d86e' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d85a' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d87b' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d84f' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d847' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d869' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d84e' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d85b' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d86c' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d83e' state changed to Scheduling. 
[Callback]: ComputeUnit '542b007a198220453136d848' state changed to Scheduling. [Callback]: ComputeUnit '542b007b198220453136d872' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d850' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d851' state changed to Scheduling. [Callback]: ComputeUnit '542b007a198220453136d846' state changed to Failed. Log: All FTW Input Staging Directives done - 1. [Callback]: ComputeUnit '542b007a198220453136d849' state changed to Executing. [Callback]: ComputeUnit '542b007b198220453136d863' state changed to Failed. Log: All FTW Input Staging Directives done - 1. [Callback]: ComputeUnit '542b007b198220453136d876' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d83d' state changed to Failed. Log: All FTW Input Staging Directives done - 1. [Callback]: ComputeUnit '542b007b198220453136d87a' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d856' state changed to Failed. Log: All FTW Input Staging Directives done - 1. [Callback]: ComputeUnit '542b007a198220453136d84e' state changed to Executing. [Callback]: ComputeUnit '542b007b198220453136d875' state changed to Failed. Log: All FTW Input Staging Directives done - 1. [Callback]: ComputeUnit '542b007b198220453136d869' state changed to Executing. [Callback]: ComputeUnit '542b007b198220453136d869' state changed to Failed. Log: All FTW Input Staging Directives done - 1. [Callback]: ComputeUnit '542b007b198220453136d86c' state changed to Executing. [Callback]: ComputeUnit '542b007b198220453136d86c' state changed to Failed. Log: All FTW Input Staging Directives done - 1. [Callback]: ComputeUnit '542b007a198220453136d83c' state changed to Executing. [Callback]: ComputeUnit '542b007b198220453136d877' state changed to Failed. Log: All FTW Input Staging Directives done - 1. [Callback]: ComputeUnit '542b007b198220453136d872' state changed to Executing. 
[Callback]: ComputeUnit '542b007b198220453136d876' state changed to Failed. Log: All FTW Input Staging Directives done - 1. [Callback]: ComputeUnit '542b007b198220453136d862' state changed to Executing. [Callback]: ComputeUnit '542b007b198220453136d862' state changed to Failed. Log: All FTW Input Staging Directives done - 1. [Callback]: ComputeUnit '542b007a198220453136d85e' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d842' state changed to Failed. Log: All FTW Input Staging Directives done - 1. [Callback]: ComputeUnit '542b007a198220453136d83f' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d83f' state changed to Failed. Log: All FTW Input Staging Directives done - 1. [Callback]: ComputeUnit '542b007a198220453136d850' state changed to Executing. [Callback]: ComputeUnit '542b007a198220453136d849' state changed to Failed. Log: All FTW Input Staging Directives done - 1. [Callback]: ComputeUnit '542b007a198220453136d848' state changed to Executing. [Callback]: ComputeUnit '542b007b198220453136d872' state changed to Failed. Log: All FTW Input Staging Directives done - 1. [Callback]: ComputeUnit '542b007a198220453136d85d' state changed to Executing.
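With this many interleaved callbacks it is hard to see how many units actually ended up failed. A minimal sketch for tallying the last reported state per unit, assuming the log has exactly the `[Callback]: ComputeUnit '<id>' state changed to <State>.` shape shown above (the helper names are made up for illustration and are not part of ExTASY or RADICAL-Pilot):

```python
import re
from collections import Counter

# Matches callback entries of the form seen in the log above.
CALLBACK_RE = re.compile(
    r"\[Callback\]: ComputeUnit '([0-9a-f]+)' state changed to (\w+)\."
)

# Three entries copied from the log, used as a small sample.
SAMPLE = (
    "[Callback]: ComputeUnit '542b007b198220453136d869' state changed to Executing. "
    "[Callback]: ComputeUnit '542b007b198220453136d869' state changed to Failed. "
    "Log: All FTW Input Staging Directives done - 1. "
    "[Callback]: ComputeUnit '542b007b198220453136d872' state changed to Executing."
)


def final_states(log_text):
    """Return {unit_id: last state seen} for every ComputeUnit in the log."""
    last = {}
    for unit_id, state in CALLBACK_RE.findall(log_text):
        last[unit_id] = state  # later entries overwrite earlier ones
    return last


def summarize(log_text):
    """Count how many units ended in each state."""
    return Counter(final_states(log_text).values())


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Feeding the full captured output to `summarize` would show at a glance whether any unit ever reached a successful end state.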

antonst commented 9 years ago

This happens for as long as Pilot is active, then you get:

[Callback]: ComputeUnit '542b007a198220453136d844' state changed to Executing.
[Callback]: ComputeUnit '542b007a198220453136d844' state changed to Failed. Log: All FTW Input Staging Directives done - 1.
[Callback]: ComputeUnit '542b007a198220453136d840' state changed to Executing.
[Callback]: ComputeUnit '542b007a198220453136d840' state changed to Failed. Log: All FTW Input Staging Directives done - 1.
[Callback]: ComputeUnit '542b007a198220453136d84f' state changed to Executing.
[Callback]: ComputePilot '542b0075198220453136d83a' state changed to Done.
[Callback]: ComputeUnit '542b007b198220453136d864' state changed to Canceled.
[Callback]: ComputeUnit '542b007a198220453136d841' state changed to Canceled.
[Callback]: ComputeUnit '542b007a198220453136d84c' state changed to Canceled.
[Callback]: ComputeUnit '542b007b198220453136d879' state changed to Canceled.
[Callback]: ComputeUnit '542b007a198220453136d857' state changed to Canceled.
[Callback]: ComputeUnit '542b007b198220453136d871' state changed to Canceled.
[Callback]: ComputeUnit '542b007b198220453136d866' state changed to Canceled.
[Callback]: ComputeUnit '542b007b198220453136d873' state changed to Canceled.
[Callback]: ComputeUnit '542b007a198220453136d858' state changed to Canceled.
[Callback]: ComputeUnit '542b007b198220453136d87a' state changed to Canceled.
[Callback]: ComputeUnit '542b007b198220453136d861' state changed to Canceled.
[Callback]: ComputeUnit '542b007b198220453136d86b' state changed to Canceled.
[Callback]: ComputeUnit '542b007a198220453136d84a' state changed to Canceled.
[Callback]: ComputeUnit '542b007a198220453136d85c' state changed to Canceled.
[Callback]: ComputeUnit '542b007b198220453136d860' state changed to Canceled.
[Callback]: ComputeUnit '542b007b198220453136d867' state changed to Canceled.
[Callback]: ComputeUnit '542b007a198220453136d854' state changed to Canceled.
[Callback]: ComputeUnit '542b007a198220453136d84d' state changed to Canceled.
[Callback]: ComputeUnit '542b007b198220453136d86a' state changed to Canceled.
[Callback]: ComputeUnit '542b007a198220453136d859' state changed to Canceled.
[Callback]: ComputeUnit '542b007b198220453136d868' state changed to Canceled.
[Callback]: ComputeUnit '542b007a198220453136d852' state changed to Canceled.
[Callback]: ComputeUnit '542b007b198220453136d870' state changed to Canceled.
[Callback]: ComputeUnit '542b007b198220453136d874' state changed to Canceled.
[Callback]: ComputeUnit '542b007a198220453136d85a' state changed to Canceled.
[Callback]: ComputeUnit '542b007b198220453136d87b' state changed to Canceled.
[Callback]: ComputeUnit '542b007a198220453136d84f' state changed to Canceled.
[Callback]: ComputeUnit '542b007a198220453136d847' state changed to Canceled.
[Callback]: ComputeUnit '542b007a198220453136d84e' state changed to Canceled.
[Callback]: ComputeUnit '542b007a198220453136d83e' state changed to Canceled.
[Callback]: ComputeUnit '542b007a198220453136d850' state changed to Canceled.
[Callback]: ComputeUnit '542b007a198220453136d851' state changed to Canceled.
[Callback]: ComputeUnit '542b007b198220453136d865' state changed to Canceled.
[Callback]: ComputeUnit '542b007a198220453136d843' state changed to Canceled.

Traceback (most recent call last):
  File "/tmp/ExTASY-tools/bin/extasy", line 9, in
    load_entry_point('radical.ensemblemd.extasy==0.1', 'console_scripts', 'extasy')()
  File "/tmp/ExTASY-tools/local/lib/python2.7/site-packages/radical/ensemblemd/extasy/bin/runme.py", line 130, in main
    Simulator(umgr,RPconfig,Kconfig,i)
  File "/tmp/ExTASY-tools/local/lib/python2.7/site-packages/radical/ensemblemd/extasy/bin/Simulator/Gromacs/simulator.py", line 64, in Simulator
    with open('out%s.gro' % i, 'r') as output_file:
IOError: [Errno 2] No such file or directory: 'out0.gro'
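The final IOError is only a symptom: the script tries to open `out0.gro` even though the units that should have written it failed or were canceled. A rough sketch of a guard that reports this more clearly, assuming the output files follow the `out<index>.gro` naming from the traceback (`read_output` is a hypothetical helper, not the actual code in simulator.py):

```python
import os


def read_output(index, directory="."):
    """Read out<index>.gro, failing with a descriptive message if it is absent."""
    path = os.path.join(directory, "out%s.gro" % index)
    if not os.path.isfile(path):
        # Point at the likely cause instead of a bare "No such file".
        raise IOError(
            "%s not found; the compute units that should have produced it "
            "probably failed or were canceled" % path
        )
    with open(path, "r") as output_file:
        return output_file.read()
```

This does not fix the failing units, but it makes the error message point at them rather than at the missing file.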

vivek-bala commented 9 years ago

#43

vivek-bala commented 9 years ago

This should be fixed in the latest commit in devel. Please test again.

ebreitmo commented 9 years ago

I tried to launch the job on ARCHER from my Mac. It seems to be a DB issue, as with CoCo/Amber (issue #68).

AGENT.STDERR: Permission denied (publickey,keyboard-interactive).
2014:10:06 10:15:28 MainThread radical.pilot.agent : [INFO ] RADICAL-Pilot multi-core agent for package/API version 0.20
2014:10:06 10:15:28 MainThread radical.pilot.agent : [INFO ] Using SAGA version 0.15
2014:10:06 10:15:28 MainThread radical.pilot.agent : [ERROR ] Couldn't establish database connection: [Errno 111] Connection refused
kill: 9747: No such process

Traceback (most recent call last):
  File "/Users/elenabreitmoser/020114/bin/extasy", line 8, in
    load_entry_point('radical.ensemblemd.extasy==0.1', 'console_scripts', 'extasy')()
  File "/Users/elenabreitmoser/020114/lib/python2.7/site-packages/radical/ensemblemd/extasy/bin/runme.py", line 130, in main
    Simulator(umgr,RPconfig,Kconfig,i)
  File "/Users/elenabreitmoser/020114/lib/python2.7/site-packages/radical/ensemblemd/extasy/bin/Simulator/Gromacs/simulator.py", line 64, in Simulator
    with open('out%s.gro' % i, 'r') as output_file:
IOError: [Errno 2] No such file or directory: 'out0.gro'
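The `[Errno 111] Connection refused` in the agent log means the TCP connection to the database endpoint was rejected from wherever the agent runs. One way to narrow this down is to try the same connection by hand from that machine. A minimal sketch using only the standard library; the host and port in the usage line are placeholders and should be replaced with the values from your own configuration:

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts and name-resolution errors.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    # Placeholder endpoint, not taken from this thread.
    print(can_connect("db.example.org", 12345))
```

If this returns False on the ARCHER side but True on the Mac, the database is probably not reachable from ARCHER, which would match the error above.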

andre-merzky commented 9 years ago

The most likely reason is that the ssh connection from the ARCHER head node to the ARCHER compute nodes fails; see the FAQ for a solution. Would you mind giving that a try and reporting back if the error persists?

Thanks!
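To check this independently of the FAQ fix, something like the following could be run on the head node. It is only a sketch: it assumes the OpenSSH client is installed, uses its `BatchMode` and `ConnectTimeout` options so that ssh fails instead of prompting for a password, and the target host name has to be supplied by you (the function names are made up for illustration):

```python
import subprocess


def ssh_check_command(host, timeout_s=10):
    """Build an ssh command that fails rather than asking for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                    # never prompt interactively
        "-o", "ConnectTimeout=%d" % timeout_s,    # give up after timeout_s seconds
        host,
        "true",                                   # trivial remote command
    ]


def passwordless_ssh_works(host, timeout_s=10):
    """Return True if a non-interactive ssh login to host succeeds."""
    return subprocess.call(ssh_check_command(host, timeout_s)) == 0
```

A False result here would be consistent with the `Permission denied (publickey,keyboard-interactive)` line in AGENT.STDERR.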

ebreitmo commented 9 years ago

I did everything again from scratch (Mac to ARCHER, gromacs/lsdmap), see https://gist.github.com/ebreitmo/6e596b0d38929e5c4b9c for errors.

Locally I have

ls -lrt
total 49288
-rw-r--r-- 1 elenabreitmoser staff 599 6 Oct 09:28 archer.rcfg
-rw-r--r-- 1 elenabreitmoser staff 2218 6 Oct 09:29 gromacslsdmap.wcfg
-rw-r--r-- 1 elenabreitmoser staff 750 6 Oct 09:32 gromacslsdmap.wcfgc
-rw-r--r-- 1 elenabreitmoser staff 414 6 Oct 09:32 archer.rcfgc
-rw-r--r-- 1 elenabreitmoser staff 265 6 Oct 15:35 config.ini
-rw-r--r-- 1 elenabreitmoser staff 1409 6 Oct 15:35 grompp.mdp
-rw-r--r-- 1 elenabreitmoser staff 10720000 6 Oct 15:35 input.gro
-rw-r--r-- 1 elenabreitmoser staff 6674 6 Oct 15:35 topol.top
drwxr-xr-x 26 elenabreitmoser staff 884 6 Oct 15:36 temp
-rw------- 1 elenabreitmoser staff 657192 6 Oct 15:47 out15.gro
-rw------- 1 elenabreitmoser staff 655616 6 Oct 15:54 out21.gro
-rw------- 1 elenabreitmoser staff 655616 6 Oct 15:57 out20.gro
-rw------- 1 elenabreitmoser staff 657192 6 Oct 15:59 out14.gro
-rw------- 1 elenabreitmoser staff 655616 6 Oct 16:05 out16.gro
-rw------- 1 elenabreitmoser staff 655616 6 Oct 16:08 out23.gro
-rw------- 1 elenabreitmoser staff 655616 6 Oct 16:12 out18.gro
-rw------- 1 elenabreitmoser staff 655616 6 Oct 16:15 out22.gro
-rw------- 1 elenabreitmoser staff 655616 6 Oct 16:16 out17.gro
-rw-r--r-- 1 elenabreitmoser staff 8543496 6 Oct 16:16 tmp.gro
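Only some of the `out<N>.gro` files show up in this listing. A small sketch for finding which indices are missing, assuming the files are numbered from 0 and that 24 were expected; the count of 24 is a guess for illustration, not something stated in this thread:

```python
import re

OUT_RE = re.compile(r"^out(\d+)\.gro$")

# File names taken from the listing above.
LISTED = [
    "out15.gro", "out21.gro", "out20.gro", "out14.gro", "out16.gro",
    "out23.gro", "out18.gro", "out22.gro", "out17.gro", "tmp.gro",
]


def present_indices(filenames):
    """Return the sorted indices N for which out<N>.gro is in filenames."""
    matches = (OUT_RE.match(name) for name in filenames)
    return sorted(int(m.group(1)) for m in matches if m)


def missing_indices(filenames, expected_count):
    """Return indices in range(expected_count) that have no out<N>.gro file."""
    present = set(present_indices(filenames))
    return [i for i in range(expected_count) if i not in present]


if __name__ == "__main__":
    print(missing_indices(LISTED, 24))
```

On a real run directory, `os.listdir(".")` can be passed in place of `LISTED`.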

oleweidner commented 9 years ago

Not reproducible. Closed before new round of testing.