Closed euhruska closed 5 years ago
failed at launch
leonardo.rice.edu
using gpus
{'queue': 'high', 'resource': 'ncsa.bw_aprun', 'cpus': 800, 'project': 'bamm', 'access_schema': 'gsissh', 'gpus': 50, 'walltime': 2880}
shared_data_all ['settings_extasy_tica3_chignolin_long_cmicro.wcfg', './files-chignolin//system-5.xml', './files-chignolin//integrator-5.xml', './files-chignolin/chignolin-ca-crystal.pdb', './inp_files/run-openmm-xml3.py', './files-chignolin/chignolin.pdb', './helper_scripts/run-tica-msm5.py', './helper_scripts/analyze3.py']
2019-02-27 18:28:18,848: radical.entk.resource_manager.0000: MainProcess : MainThread : INFO : Pilot pilot.0000 state: PMGR_LAUNCHING_PENDING
2019-02-27 18:28:18,852: radical.entk.resource_manager.0000: MainProcess : pmgr.0000.subscriber._state_sub_cb: INFO : Pilot pilot.0000 state: PMGR_LAUNCHING
2019-02-27 18:28:25,981: radical.utils : pmgr.0000.launching.0 : MainThread : DEBUG : lm create pool for gsisftp://bw.ncsa.illinois.edu/shell_file_adaptor_command_shell/ () ()
2019-02-27 18:28:25,981: radical.utils : pmgr.0000.launching.0 : MainThread : DEBUG : lm create object for gsisftp://bw.ncsa.illinois.edu/shell_file_adaptor_command_shell/
2019-02-27 18:28:33,628: radical.entk.resource_manager.0000: MainProcess : pmgr.0000.subscriber._state_sub_cb: INFO : Pilot pilot.0000 state: FAILED
2019-02-27 18:28:33,629: radical.entk.resource_manager.0000: MainProcess : pmgr.0000.subscriber._state_sub_cb: ERROR : Pilot has failed
Did that setup work before - if so, what changed? Can you please attach both client and pilot sandboxes?
Thanks!
Ugh, I had added something to the bashrc for another project. Removing it fixed the issue.
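For anyone who hits the same failure and needs to keep their bashrc additions: this thread does not say which lines caused the problem, so the following is only a minimal sketch. It assumes the additions interfered with the non-interactive shell opened over gsissh during pilot launch. The helper name `is_interactive` is hypothetical, and the guarded block is a placeholder.

```shell
# Hypothetical helper: succeeds only if the given shell option string
# (normally "$-") contains "i", i.e. the shell is interactive.
is_interactive() {
    case "$1" in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Run project-specific setup only in interactive sessions, so that
# remote, non-interactive shells see an unmodified environment.
if is_interactive "$-"; then
    # Placeholder: the other project's settings would go here
    # (the actual lines are not given in this thread).
    :
fi
```

Whether this would have prevented the failure above depends on what was actually added; removing the lines, as was done here, is the simpler fix.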