Closed Francis-Liu closed 9 years ago
Hi Francis,
can you please make doubly sure that you are using the same database URL for both populating via bundles and running the actual demo? The demo should not be able to pick up hosts which are not reported in the DB...
Matteo, do you see any other code path by which hopper might creep back in?
I double-checked the code, and it seems to me that hopper should not creep back in. This is how the resources are selected (only part of it, just their names here):
for resource_name in bundle.resources:
    resource = bundle.resources[resource_name]
In the log of the run, the target resource set is printed. Look for the string:
Decision D03 based on D02, E02 - How many resource should be used?
You should not see hopper listed after that string in the log. If it is there, it must have been read from an outdated bundle DB.
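The log check described above can be scripted. This is only a minimal sketch, assuming the log is plain text and that the selected resources are printed on the lines following the decision string; `resource_listed_after` and the sample log lines are hypothetical and not part of the demo code.

```python
def resource_listed_after(log_lines, marker, resource):
    """Return True if `resource` appears on any line after the first line containing `marker`."""
    seen_marker = False
    for line in log_lines:
        if seen_marker and resource in line:
            return True
        if marker in line:
            seen_marker = True
    return False


MARKER = "Decision D03 based on D02, E02"

# Hypothetical log excerpt, for illustration only.
sample_log = [
    "hopper mentioned before the decision",
    "Decision D03 based on D02, E02 - How many resource should be used?",
    "stampede",
    "gordon",
]

print(resource_listed_after(sample_log, MARKER, "hopper"))
```

For a real run, the lines would come from the debug log file instead of `sample_log`, for example `open(path).read().splitlines()`.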
You are correct, I did not change the fields "RADICAL_PILOT_DBURL" and "BUNDLE_DBURL" in demo_SC2014_env_setup.sh.
However, after I changed those two fields, hopper was removed from the pilot submission targets, but the demo is still, strangely, trying to use the user id 'mturilli' to connect to stampede.
Pilot submissions
Pilot on resource xsede.stampede....... SUBMITTED to PM 54e4b23c033bb2194a0d3717
Pilot on resource xsede.blacklight..... SUBMITTED to PM 54e4b23c033bb2194a0d3717
Pilot on resource xsede.gordon......... SUBMITTED to PM 54e4b23c033bb2194a0d3717
Pilot on resource xsede.trestles....... SUBMITTED to PM 54e4b23c033bb2194a0d3717
Caught exception: prompted for unknown password (mturilli@login1.stampede.tacc.utexas.edu's password: ) (/home/grad03/fengl/mypyenv2.sc2014.demo/local/lib/python2.7/site-packages/saga/utils/pty_shell_factory.py +297 (_initialize_pty) : % match))
Traceback (most recent call last):
File "/home/grad03/fengl/mypyenv2.sc2014.demo/bin/demo_SC2014_script.py", line 875, in
This is how I set up demo_SC2014_env_setup.sh:
if test "$username" = "fengl"
then
    export EMANAGER_DEBUG
    export DEMO_FOLDER=/home/grad03/fengl/AIMES_demo_SC2014
    export BUNDLE_CONF=~/mypyenv2.sc2014.demo/etc/bundle_demo_SC2014.conf
    export SKELETON_CONF=~/mypyenv2.sc2014.demo/etc/skeleton_demo_SC2014.conf
    export XSEDE_PROJECT_ID_STAMPEDE='TG-MCB090174'
    export XSEDE_PROJECT_ID_TRESTLES='unc100'
    export XSEDE_PROJECT_ID_GORDON='unc101'
    export XSEDE_PROJECT_ID_BLACKLIGHT='unc102'
    export RECIPIENTS=liux2102@umn.edu
fi
export RADICAL_PILOT_DBURL='mongodb://54.221.194.147:24242/AIMES_bundle_fengl'
export RADICAL_PILOT_BENCHMARK=
export SAGA_VERBOSE=debug
export RADICAL_PILOT_VERBOSE=debug
export RADICAL_UTILS_VERBOSE=debug
export RADICAL_DEBUG_FILE=/tmp/aimes_demo_sc2014_debug.log
export RADICAL_PILOT_LOG_TARGETS=$RADICAL_DEBUG_FILE
export SAGA_LOG_TARGETS=$RADICAL_DEBUG_FILE
export RADICAL_UTILS_LOG_TARGETS=$RADICAL_DEBUG_FILE
export ORIGIN='54.196.51.239'
export BUNDLE_DBURL='mongodb://54.221.194.147:24242/AIMES_bundle_fengl/'
export RUN_TAG="AIMES demo SC2014"
And I confirmed that the keyword mturilli does not appear in either of these files:
~/mypyenv2.sc2014.demo/etc/bundle_demo_SC2014.conf
~/.ssh/config
Hi Francis,
please uncomment:
# export AIMES_USER_ID=fengl
I tried, but I get the same error.
Caught exception: prompted for unknown password (mturilli@login1.stampede.tacc.utexas.edu's password: ) (/home/grad03/fengl/mypyenv2.sc2014.demo/local/lib/python2.7/site-packages/saga/utils/pty_shell_factory.py +297 (_initialize_pty) : % match))
I don't know how the pilot manager finds your user id. Actually, the reason I commented out AIMES_USER_ID is that my user id on the clusters is different from my local user id.
Thank you Francis.
Here is the code used to set up the user id:
USER_ID = os.environ.get('AIMES_USER_ID', None)
[...]
if USER_ID:
    context.user_id = USER_ID
When AIMES_USER_ID is not exported, USER_ID is set to None. I do not see any mturilli in the Python code. I see some in bin/demo_SC2014_env_setup.sh, but they are guarded. Feel free to remove them from the configuration script, just in case, but they should be completely irrelevant.
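To make the behaviour of that lookup concrete, here is a small sketch that mirrors it outside the demo script. `Context` is a hypothetical stand-in for the real context object, and `resolve_user_id` and `apply_user_id` are illustrative names, not functions from the demo code.

```python
import os


def resolve_user_id(environ=os.environ):
    """Return the value of AIMES_USER_ID, or None when it is not exported."""
    return environ.get('AIMES_USER_ID', None)


class Context(object):
    """Hypothetical stand-in for the real context object."""
    user_id = None


def apply_user_id(context, environ=os.environ):
    """Set context.user_id only when AIMES_USER_ID holds a non-empty value."""
    user_id = resolve_user_id(environ)
    if user_id:  # skipped for None and for an empty string
        context.user_id = user_id
    return context


# A plain dict stands in for the environment here.
print(apply_user_id(Context(), {}).user_id)
print(apply_user_id(Context(), {'AIMES_USER_ID': 'fengl'}).user_id)
```

With the variable unset, `user_id` is never assigned by this code path, so whatever user name ends up in the connection attempt has to come from somewhere else.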
One thing I noticed: the first line of the error trace reports:
File "/home/grad03/fengl/mypyenv2.sc2014.demo/bin/demo_SC2014_script.py", line 875, in
pilots = pmgr.submit_pilots(pdescs)
In the current source code, that call is at line 885, not 875. Could this be an indication that you are not running the code from the current master branch?
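One way to compare the installed script against a source checkout is to look up the line number of the call in each file. This is a rough sketch only; `find_call_lines` is a hypothetical helper and the sample text is made up for illustration.

```python
def find_call_lines(source_text, needle):
    """Return the 1-based line numbers of the lines that contain `needle`."""
    return [
        number
        for number, line in enumerate(source_text.splitlines(), start=1)
        if needle in line
    ]


# Hypothetical file content, for illustration only.
sample = "import os\n\npilots = pmgr.submit_pilots(pdescs)\n"

print(find_call_lines(sample, "pmgr.submit_pilots("))
```

Running it on the text of the installed bin/demo_SC2014_script.py and on the copy in the checkout would show whether the two files place the call on the same line.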
I did an "install upgrade" and the problem is solved!
Thank you, Matteo.
I have commented out hopper in the bundle configuration bundle_demo_SC2014.conf.
Now hopper is not in the resource list:
However, I still got the authentication failure error on hopper: