mturilli / aimes.emanager

sc14demo: still show auth error for removed hopper #6

Closed Francis-Liu closed 9 years ago

Francis-Liu commented 9 years ago

I have commented out hopper in the bundle configuration file bundle_demo_SC2014.conf.

Now hopper is not in the resource list:

[mypyenv2.sc2014.demo] fengl@aimes (/home/grad03/fengl) % ~/mypyenv2.sc2014.demo/bin/radical-utils-mongodb.py -m tree -d mongodb://54.221.194.147:24242/AIMES_bundle_fengl/
modes   : tree
db url  : mongodb://54.221.194.147:24242/AIMES_bundle_fengl/
AIMES_bundle_fengl
 +-- db   AIMES_bundle_fengl
 | +-- coll config
 | | +-- doc  stampede_tacc_xsede_org
 | | | +-- timestamp
 | | | +-- _id
 | | | +-- queue_info
 | | | +-- num_nodes
 | | +-- doc  blacklight_psc_xsede_org
 | | | +-- timestamp
 | | | +-- _id
 | | | +-- queue_info
 | | | +-- num_nodes
 | | +-- doc  gordon_sdsc_xsede_org
 | | | +-- timestamp
 | | | +-- _id
 | | | +-- queue_info
 | | | +-- num_nodes
 | | +-- doc  trestles_sdsc_xsede_org
 | | | +-- timestamp
 | | | +-- _id
 | | | +-- queue_info
 | | | +-- num_nodes
 | +-- coll workload
 | | +-- doc  stampede_tacc_xsede_org
 | | | +-- development
 | | | +-- normal
 | | | +-- timestamp
 | | | +-- large
 | | | +-- serial
 | | | +-- _id
 | | | +-- largemem
 | | +-- doc  blacklight_psc_xsede_org
 | | | +-- timestamp
 | | | +-- _id
 | | | +-- batch
 | | +-- doc  gordon_sdsc_xsede_org
 | | | +-- timestamp
 | | | +-- _id
 | | | +-- vsmp
 | | | +-- normal
 | | +-- doc  trestles_sdsc_xsede_org
 | | | +-- shared
 | | | +-- _id
 | | | +-- normal
 | | | +-- timestamp
 | +-- coll bandwidth
 | | +-- doc  stampede_tacc_xsede_org
 | | | +-- 54_196_51_239
 | | | +-- _id
 | | | +-- timestamp
 | | +-- doc  blacklight_psc_xsede_org
 | | | +-- 54_196_51_239
 | | | +-- _id
 | | | +-- timestamp
 | | +-- doc  gordon_sdsc_xsede_org
 | | | +-- 54_196_51_239
 | | | +-- _id
 | | | +-- timestamp
 | | +-- doc  trestles_sdsc_xsede_org
 | | | +-- 54_196_51_239
 | | | +-- _id
 | | | +-- timestamp

However, I still got the authentication failure error on hopper:

Pilot submissions
Pilot on resource nersc.hopper......... SUBMITTED to PM 54e39c37033bb25f9a70000c
Pilot on resource xsede.stampede....... SUBMITTED to PM 54e39c37033bb25f9a70000c
Pilot on resource xsede.trestles....... SUBMITTED to PM 54e39c37033bb25f9a70000c
Pilot on resource xsede.blacklight..... SUBMITTED to PM 54e39c37033bb25f9a70000c
Pilot on resource xsede.gordon......... SUBMITTED to PM 54e39c37033bb25f9a70000c

Caught exception: prompted for unknown password (
 *****************************************************************
 *                                                               *
 *                      NOTICE TO USERS                          *
 *                      ---------------                          *
 *                                                               *
 *  Lawrence Berkeley National Laboratory operates this          *
 *  computer system under contract to the U.S. Department of     *
 *  Energy.  This computer system is the property of the United  *
 *  States Government and is for authorized use only.  *Users    *
 *  (authorized or unauthorized) have no explicit or implicit    *
 *  expectation of privacy.*                                     *
 *                                                               *
 *  Any or all uses of this system and all files on this system  *
 *  may be intercepted, monitored, recorded, copied, audited,    *
 *  inspected, and disclosed to site, Department of Energy, and  *
 *  law enforcement personnel, as well as authorized officials   *
 *  of other agencies, both domestic and foreign.  *By using     *
 *  this system, the user consents to such interception,         *
 *  monitoring, recording, copying, auditing, inspection, and    *
 *  disclosure at the discretion of authorized site or           *
 *  Department of Energy personnel.*                             *
 *                                                               *
 *  *Unauthorized or improper use of this system may result in   *
 *  administrative disciplinary action and civil and criminal    *
 *  penalties.  _By continuing to use this system you indicate   *
 *  your awareness of and consent to these terms and conditions  *
 *  of use.  LOG OFF IMMEDIATELY if you do not agree to the      *
 *  conditions stated in this warning._*                         *
 *                                                               *
 *****************************************************************

Password: ) (/home/grad03/fengl/mypyenv2.sc2014.demo/local/lib/python2.7/site-packages/saga/utils/pty_shell_factory.py +297 (_initialize_pty)  :  % match))

End of AIMES SC2014 demo.
================================================================================
Traceback (most recent call last):
  File "/home/grad03/fengl/mypyenv2.sc2014.demo/bin/demo_SC2014_script.py", line 875, in <module>
    pilots = pmgr.submit_pilots(pdescs)
  File "/home/grad03/fengl/mypyenv2.sc2014.demo/local/lib/python2.7/site-packages/radical/pilot/pilot_manager.py", line 371, in submit_pilots
    resource_config=resource_cfg)
  File "/home/grad03/fengl/mypyenv2.sc2014.demo/local/lib/python2.7/site-packages/radical/pilot/controller/pilot_manager_controller.py", line 406, in register_start_pilot_request
    shell = sup.PTYShell (url, self._session, logger, opts={})
  File "/home/grad03/fengl/mypyenv2.sc2014.demo/local/lib/python2.7/site-packages/saga/utils/pty_shell.py", line 239, in __init__
    posix=self.posix)
  File "/home/grad03/fengl/mypyenv2.sc2014.demo/local/lib/python2.7/site-packages/saga/utils/pty_shell_factory.py", line 192, in initialize
    self._initialize_pty (info['pty'], info, is_shell=posix)
  File "/home/grad03/fengl/mypyenv2.sc2014.demo/local/lib/python2.7/site-packages/saga/utils/pty_shell_factory.py", line 395, in _initialize_pty
    raise ptye.translate_exception (e)
saga.exceptions.AuthenticationFailed: prompted for unknown password (
 *****************************************************************
 *                                                               *
 *                      NOTICE TO USERS                          *
 *                      ---------------                          *
 *                                                               *
 *  Lawrence Berkeley National Laboratory operates this          *
 *  computer system under contract to the U.S. Department of     *
 *  Energy.  This computer system is the property of the United  *
 *  States Government and is for authorized use only.  *Users    *
 *  (authorized or unauthorized) have no explicit or implicit    *
 *  expectation of privacy.*                                     *
 *                                                               *
 *  Any or all uses of this system and all files on this system  *
 *  may be intercepted, monitored, recorded, copied, audited,    *
 *  inspected, and disclosed to site, Department of Energy, and  *
 *  law enforcement personnel, as well as authorized officials   *
 *  of other agencies, both domestic and foreign.  *By using     *
 *  this system, the user consents to such interception,         *
 *  monitoring, recording, copying, auditing, inspection, and    *
 *  disclosure at the discretion of authorized site or           *
 *  Department of Energy personnel.*                             *
 *                                                               *
 *  *Unauthorized or improper use of this system may result in   *
 *  administrative disciplinary action and civil and criminal    *
 *  penalties.  _By continuing to use this system you indicate   *
 *  your awareness of and consent to these terms and conditions  *
 *  of use.  LOG OFF IMMEDIATELY if you do not agree to the      *
 *  conditions stated in this warning._*                         *
 *                                                               *
 *****************************************************************

Password: ) (/home/grad03/fengl/mypyenv2.sc2014.demo/local/lib/python2.7/site-packages/saga/utils/pty_shell_factory.py +297 (_initialize_pty)  :  % match))
mv: cannot stat ‘54e39c36033bb25f9a70000b.png’: No such file or directory
Can't stat run-21-54e39c36033bb25f9a70000b/54e39c36033bb25f9a70000b.png: No such file or directory
run-21-54e39c36033bb25f9a70000b/54e39c36033bb25f9a70000b.png: unable to attach file.

andre-merzky commented 9 years ago

Hi Francis,

can you please make doubly sure that you are using the same database URL for both populating via bundles and running the actual demo? The demo should not be able to pick up hosts which are not reported in the DB...

Matteo, do you see any other code path by which hopper might creep back in?

mturilli commented 9 years ago

I double-checked the code, and it seems to me that hopper should not creep back in. This is how the resources are selected (part of it, just their names here):

for resource_name in bundle.resources:
    resource = bundle.resources[resource_name]

In the log of the run, the target resource set is printed. Look for the string:

Decision D03 based on D02, E02 - How many resource should be used?

You should not see hopper listed after that string in the log. If it is there, it must have been pulled from an outdated bundle DB.
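
The log check described above could be scripted roughly as follows. This is only a sketch: the marker text is quoted from the comment, while the function name and the sample log lines are hypothetical, and it assumes the log is plain text that can be read as a single string.

```python
# Sketch only: the marker text is quoted from the comment above; the
# function name and the sample log lines are hypothetical.
D03_MARKER = "Decision D03 based on D02, E02"


def resource_listed_after_marker(log_text, resource="hopper", marker=D03_MARKER):
    """Return True if `resource` appears anywhere after the first marker."""
    _, found, tail = log_text.partition(marker)
    if not found:
        # Without the marker we cannot tell which part of the log to inspect.
        raise ValueError("marker %r not found in log" % marker)
    return resource.lower() in tail.lower()


if __name__ == "__main__":
    # Made-up log excerpt, used only to exercise the function.
    sample = (
        "Decision D03 based on D02, E02 - How many resource should be used?\n"
        "stampede_tacc_xsede_org\n"
        "gordon_sdsc_xsede_org\n"
    )
    print(resource_listed_after_marker(sample))
```

If this returns True for a real log, the bundle DB that the demo reads is probably not the one that was just repopulated.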

Francis-Liu commented 9 years ago

You are correct, I did not change the fields "RADICAL_PILOT_DBURL" and "BUNDLE_DBURL" in demo_SC2014_env_setup.sh.

However, after I changed those two fields, hopper is removed from the pilot submission targets, but the demo now tries to connect to stampede with the user id 'mturilli', which is strange.

Pilot submissions
Pilot on resource xsede.stampede....... SUBMITTED to PM 54e4b23c033bb2194a0d3717
Pilot on resource xsede.blacklight..... SUBMITTED to PM 54e4b23c033bb2194a0d3717
Pilot on resource xsede.gordon......... SUBMITTED to PM 54e4b23c033bb2194a0d3717
Pilot on resource xsede.trestles....... SUBMITTED to PM 54e4b23c033bb2194a0d3717

Caught exception: prompted for unknown password (mturilli@login1.stampede.tacc.utexas.edu's password: ) (/home/grad03/fengl/mypyenv2.sc2014.demo/local/lib/python2.7/site-packages/saga/utils/pty_shell_factory.py +297 (_initialize_pty) : % match))

End of AIMES SC2014 demo.

Traceback (most recent call last):
  File "/home/grad03/fengl/mypyenv2.sc2014.demo/bin/demo_SC2014_script.py", line 875, in <module>
    pilots = pmgr.submit_pilots(pdescs)
  File "/home/grad03/fengl/mypyenv2.sc2014.demo/local/lib/python2.7/site-packages/radical/pilot/pilot_manager.py", line 371, in submit_pilots
    resource_config=resource_cfg)
  File "/home/grad03/fengl/mypyenv2.sc2014.demo/local/lib/python2.7/site-packages/radical/pilot/controller/pilot_manager_controller.py", line 406, in register_start_pilot_request
    shell = sup.PTYShell (url, self._session, logger, opts={})
  File "/home/grad03/fengl/mypyenv2.sc2014.demo/local/lib/python2.7/site-packages/saga/utils/pty_shell.py", line 239, in __init__
    posix=self.posix)
  File "/home/grad03/fengl/mypyenv2.sc2014.demo/local/lib/python2.7/site-packages/saga/utils/pty_shell_factory.py", line 192, in initialize
    self._initialize_pty (info['pty'], info, is_shell=posix)
  File "/home/grad03/fengl/mypyenv2.sc2014.demo/local/lib/python2.7/site-packages/saga/utils/pty_shell_factory.py", line 395, in _initialize_pty
    raise ptye.translate_exception (e)
saga.exceptions.AuthenticationFailed: prompted for unknown password (mturilli@login1.stampede.tacc.utexas.edu's password: ) (/home/grad03/fengl/mypyenv2.sc2014.demo/local/lib/python2.7/site-packages/saga/utils/pty_shell_factory.py +297 (_initialize_pty) : % match))
mv: cannot stat ‘54e4b23c033bb2194a0d3716.png’: No such file or directory
mv: cannot stat ‘54e4b23c033bb2194a0d3716.pdf’: No such file or directory
Can't stat run-21-54e4b23c033bb2194a0d3716/54e4b23c033bb2194a0d3716.png: No such file or directory
run-21-54e4b23c033bb2194a0d3716/54e4b23c033bb2194a0d3716.png: unable to attach file.

This is how I set demo_SC2014_env_setup.sh

if test "$username" = "fengl"
then
export EMANAGER_DEBUG
# export AIMES_USER_ID=fengl
export DEMO_FOLDER=/home/grad03/fengl/AIMES_demo_SC2014
export BUNDLE_CONF=~/mypyenv2.sc2014.demo/etc/bundle_demo_SC2014.conf
export SKELETON_CONF=~/mypyenv2.sc2014.demo/etc/skeleton_demo_SC2014.conf
export XSEDE_PROJECT_ID_STAMPEDE='TG-MCB090174'
export XSEDE_PROJECT_ID_TRESTLES='unc100'
export XSEDE_PROJECT_ID_GORDON='unc101'
export XSEDE_PROJECT_ID_BLACKLIGHT='unc102'
export RECIPIENTS=liux2102@umn.edu
fi

# Set up Radical Pilot execution environment
export RADICAL_PILOT_DBURL='mongodb://54.221.194.147:24242/radicalpilot'
export RADICAL_PILOT_DBURL='mongodb://54.221.194.147:24242/AIMES_bundle_fengl'
export RADICAL_PILOT_BENCHMARK=
export SAGA_VERBOSE=debug
export RADICAL_PILOT_VERBOSE=debug
export RADICAL_UTILS_VERBOSE=debug
export RADICAL_DEBUG_FILE=/tmp/aimes_demo_sc2014_debug.log
export RADICAL_PILOT_LOG_TARGETS=$RADICAL_DEBUG_FILE
export SAGA_LOG_TARGETS=$RADICAL_DEBUG_FILE
export RADICAL_UTILS_LOG_TARGETS=$RADICAL_DEBUG_FILE

# Set up eManager execution environment
export ORIGIN='54.196.51.239'
export BUNDLE_DBURL='mongodb://54.221.194.147:24242/AIMES_bundle_fengl/'

# Setup report
export RUN_TAG="AIMES demo SC2014"

And I confirmed that the keyword mturilli does not appear in either ~/mypyenv2.sc2014.demo/etc/bundle_demo_SC2014.conf or ~/.ssh/config.

mturilli commented 9 years ago

Hi Francis,

please uncomment:

# export AIMES_USER_ID=fengl

Francis-Liu commented 9 years ago

I tried, but I received the same error.

Caught exception: prompted for unknown password (mturilli@login1.stampede.tacc.utexas.edu's password: ) (/home/grad03/fengl/mypyenv2.sc2014.demo/local/lib/python2.7/site-packages/saga/utils/pty_shell_factory.py +297 (_initialize_pty) : % match))

I don't know how the pilot manager finds your user id. Actually, I commented out AIMES_USER_ID because my user id on the clusters is different from my local user id.

mturilli commented 9 years ago

Thank you Francis.

Here is the code used to set up the user id:

USER_ID = os.environ.get('AIMES_USER_ID', None)

[...]

if USER_ID:
    context.user_id = USER_ID

When AIMES_USER_ID is not exported, USER_ID is set to None. I do not see any mturilli in the Python code. I see some in bin/demo_SC2014_env_setup.sh, but they are guarded. Feel free to remove them from the configuration script just in case, but they should be completely irrelevant.
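
To make the behaviour of that snippet concrete, here is a minimal sketch. The `Context` class and the `apply_user_id` helper are hypothetical stand-ins and not the demo's actual objects; only the `os.environ.get` lookup and the truthiness check mirror the quoted code. Which username the SSH layer falls back to when `user_id` stays None is outside the scope of this sketch.

```python
import os


class Context(object):
    """Hypothetical stand-in for the security context used by the demo."""
    user_id = None


def apply_user_id(context, environ=None):
    """Mirror the quoted logic: set user_id only if AIMES_USER_ID is truthy."""
    environ = os.environ if environ is None else environ
    user_id = environ.get('AIMES_USER_ID', None)
    if user_id:
        context.user_id = user_id
    return context


if __name__ == "__main__":
    # Variable not exported: the context is left untouched.
    print(apply_user_id(Context(), {}).user_id)
    # Variable exported: the context takes the exported value.
    print(apply_user_id(Context(), {'AIMES_USER_ID': 'fengl'}).user_id)
```

Note that an exported but empty AIMES_USER_ID behaves like an unset one, because the empty string is falsy.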

One thing I noticed. The first line of the error trace reports:

File "/home/grad03/fengl/mypyenv2.sc2014.demo/bin/demo_SC2014_script.py", line 875, in <module>
    pilots = pmgr.submit_pilots(pdescs)

That code is actually at line 885 in the current source code. Could this be an indication that you are not running the code from the current master branch?
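
One way to test this hypothesis is to compare where the statement sits in the installed script and in a fresh checkout. The helper below is a hypothetical sketch, not part of the demo; it takes the two file paths as command-line arguments, and the statement text is the one from the traceback.

```python
import sys


def find_statement_lines(source_text, statement):
    """Return the 1-based line numbers whose stripped text equals `statement`."""
    return [
        number
        for number, line in enumerate(source_text.splitlines(), start=1)
        if line.strip() == statement
    ]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python check_lines.py <installed script> <script from checkout>
    statement = "pilots = pmgr.submit_pilots(pdescs)"
    for path in sys.argv[1:]:
        with open(path) as handle:
            print(path, find_statement_lines(handle.read(), statement))
```

If the two files report different line numbers, the installed copy is not the same revision as the checkout.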

Francis-Liu commented 9 years ago

I did an "install upgrade" and the problem is solved!

Thank you, Matteo.