radical-cybertools / radical.repex.at

This is the github location for RepEx developed by the RADICAL team in conjunction with the York Lab.

RepEX cannot execute (04/23/15 morning fresh checkout). #23

Closed taisung closed 9 years ago

taisung commented 9 years ago
(myenv)[taisung@taisung-fedora amber_pattern_b_2d]$ sh launcher_amber_2d.sh
/usr/people/taisung/myenv/lib/python2.7/site-packages/pkg_resources/__init__.py:1222: UserWarning: /usr/people/taisung/.python-eggs is writable by group/others and vulnerable to attack when used with get_resource_filename. Consider a more secure location (set with .set_extraction_path or the PYTHON_EGG_CACHE environment variable).
  warnings.warn(msg, UserWarning)
2015:04:23 09:54:06 5515   MainThread   radical.repex.launcher-2d: [INFO    ] *********************************************************************
2015:04:23 09:54:06 5515   MainThread   radical.repex.launcher-2d: [INFO    ] *    RepEx simulation: AMBER + Salt Concentration + RE pattern B    *
2015:04:23 09:54:06 5515   MainThread   radical.repex.launcher-2d: [INFO    ] *********************************************************************
2015:04:23 09:54:06 5515   MainThread   radical.repex.pk      : [INFO    ] Using default Mongo DB url
2015:04:23 09:54:07 5515   MainThread   radical.repex.pk-patternB-2d: [INFO    ] Session ID: rp.session.taisung-fedora.taisung.016548.0001
2015:04:23 09:54:11 5515   MainThread   radical.repex.pk-patternB-2d: [INFO    ] Pilot ID: pilot.0000
2015:04:23 09:54:12 5515   Thread-1     radical.repex.pk-patternB-2d: [INFO    ] ComputePilot 'pilot.0000' state changed to Launching.
2015:04:23 09:54:19 5515   Thread-1     radical.repex.pk-patternB-2d: [INFO    ] ComputePilot 'pilot.0000' state changed to Failed.
2015:04:23 09:54:19 5515   Thread-1     radical.repex.pk-patternB-2d: [ERROR   ] Pilot error: [<radical.pilot.logentry.Logentry object at 0x7f8f000683d0>, <radical.pilot.logentry.Logentry object at 0x7f8f00068410>, <radical.pilot.logentry.Logentry object at 0x7f8f00068450>, <radical.pilot.logentry.Logentry object at 0x7f8f00068490>]
2015:04:23 09:54:19 5515   Thread-1     radical.repex.pk-patternB-2d: [ERROR   ] RepEx execution FAILED.

haoyuanchen commented 9 years ago

Hi Taisung,

I just tried to run the 2d example and didn't get this error; it went through normally. The second-to-last line says "Pilot error: [<radical.pilot.logentry.Logentry object at ...>, ...]", so I guess it may be an issue with radical.pilot rather than RepEx. By "fresh checkout", do you mean both RepEx and radical.pilot? Also, the warning at the beginning might be a hint.

Good luck! Haoyuan

taisung commented 9 years ago

I did fresh checkout for everything.

Taisung

antonst commented 9 years ago

Taisung, which version of radical.pilot are you using (e.g., the output of radicalpilot-version in a terminal)? The error is not very informative, so I don't really understand what went wrong. Do you have problems only with the 2d example, or with temperature exchange as well? Was the pilot actually launched on the machine, or did the failure occur even before that? If the pilot was launched, can you please provide logs?

taisung commented 9 years ago

I just tried to reproduce it but it works now…. I will try to figure out what is going on.

Taisung

taisung commented 9 years ago

Just tried it and the same error happened again. The job was not launched on the remote machine (Stampede).

Taisung

andre-merzky commented 9 years ago

Taisung, would you please provide us the agent.* files in the pilot sandbox on stampede? By default, those pilot sandboxes should live in ~/radical.pilot.sandbox/. You should find the correct sandbox created either via the session ID, or by listing them in chronological order (ls -rtl, then pick the last one).
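A minimal sketch of that lookup in shell. The helper name latest_sandbox is hypothetical, and the default root is only the location assumed in this comment; pass the real sandbox root as the first argument if it lives elsewhere on the machine.

```shell
# Hypothetical helper: print the most recently modified sandbox directory
# under a given root. The default root is an assumption, not a verified path.
latest_sandbox() {
    root="${1:-$HOME/radical.pilot.sandbox}"
    # -d lists the directories themselves, -t sorts by mtime (newest first)
    ls -1dt "$root"/*/ 2>/dev/null | head -n 1
}

# Example use (prints nothing if no sandbox exists):
#   sandbox="$(latest_sandbox)"
#   ls -l "$sandbox"agent.*
```

The printed path ends in a slash, so agent.* can be appended to it directly.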

Thanks, Andre.

taisung commented 9 years ago

This is the output of ls -alrt of the sandbox directory:

drwx------    4 tg458185 G-81123   4096 Apr 13 08:17 pilot-552bc1b4bc3ea91b1923a99f
drwx------ 1029 tg458185 G-81123  86016 Apr 13 18:37 pilot-552c4e14bc3ea93fbde582d0
drwx------    2 tg458185 G-81123   4096 Apr 13 19:07 pilot-552c5a3dbc3ea9156b388802
drwx------ 2053 tg458185 G-81123 172032 Apr 13 21:46 pilot-552c7293bc3ea93694c0f044
drwx------    2 tg458185 G-81123   4096 Apr 13 18:12 rp.session.taisung-fedora.taisung.016538.0000-pilot.0000
drwx------    2 tg458185 G-81123   4096 Apr 17 13:49 rp.session.taisung-fedora.taisung.016542.0000-pilot.0000
drwx------    2 tg458185 G-81123   4096 Apr 22 07:56 rp.session.taisung-fedora.taisung.016547.0000-pilot.0000
drwx------    2 tg458185 G-81123   4096 Apr 22 08:14 rp.session.taisung-fedora.taisung.016547.0001-pilot.0000
drwx------    2 tg458185 G-81123   4096 Apr 22 13:32 rp.session.taisung-fedora.taisung.016547.0002-pilot.0000
drwx------    2 tg458185 G-81123   4096 Apr 22 20:15 rp.session.taisung-fedora.taisung.016548.0000-pilot.0000
drwx------    2 tg458185 G-81123   4096 Apr 23 08:54 rp.session.taisung-fedora.taisung.016548.0001-pilot.0000
drwx------  519 tg458185 G-81123  36864 Apr 23 19:21 rp.session.taisung-fedora.taisung.016549.0000-pilot.0000
drwx------    2 tg458185 G-81123   4096 Apr 23 19:26 rp.session.taisung-fedora.taisung.016549.0001-pilot.0000
drwx------    2 tg458185 G-81123   4096 Apr 23 19:50 rp.session.taisung-fedora.taisung.016549.0002-pilot.0000

It seems that radical.pilot changed at some point on 4/13.

Of the rp.session.…-pilot.0000 subdirectories, only rp.session.taisung-fedora.taisung.016549.0000-pilot.0000 has data (the only run that worked for me today). All the others contain just a single file, pilot_bootstrapper.sh.

Taisung

andre-merzky commented 9 years ago

The change in names comes from the recent release we did. You seem to indicate that you did a 'fresh checkout' (it says so in one of the earlier messages). Does that also include creating a fresh virtualenv on your local machine? If not, please try that -- it might be that only a part of the stack got installed after the new release.

If you tried that already, then I would like to ask you for a log file from your client machine, ie. the output you get when running with RADICAL_PILOT_VERBOSE=DEBUG set in your environment.
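One way to capture such a log, as a sketch: the wrapper below sets the variable for a single command and copies both stdout and stderr to a file. The function name run_with_debug_log and the log file name are hypothetical; only RADICAL_PILOT_VERBOSE and the launcher script come from this thread.

```shell
# Hypothetical helper: run a command with RADICAL_PILOT_VERBOSE=DEBUG and
# save everything it prints (stdout and stderr) to the given log file.
run_with_debug_log() {
    logfile="$1"
    shift
    # The variable is set only for this one command; tee keeps the output
    # visible on the terminal while also writing it to the log file.
    RADICAL_PILOT_VERBOSE=DEBUG "$@" 2>&1 | tee "$logfile"
}

# Example use:
#   run_with_debug_log repex_debug.log sh launcher_amber_2d.sh
```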

Many thanks, Andre.

taisung commented 9 years ago

See the log below. Every module was freshly checked out, and the virtualenv was newly created.

2015:04:23 21:29:19 20801  MainThread   radical.pilot         : [INFO    ] radical.pilot        version: v0.28-2-g7473553@master
2015:04:23 21:29:19 20801  MainThread   radical.repex.launcher-2d: [INFO    ] *********************************************************************
2015:04:23 21:29:19 20801  MainThread   radical.repex.launcher-2d: [INFO    ] *    RepEx simulation: AMBER + Salt Concentration + RE pattern B    *
2015:04:23 21:29:19 20801  MainThread   radical.repex.launcher-2d: [INFO    ] *********************************************************************
2015:04:23 21:29:19 20801  MainThread   radical.repex.pk      : [INFO    ] Using default Mongo DB url
2015:04:23 21:29:20 20801  MainThread   radical.pilot         : [INFO    ] using database url  mongodb://ec2-54-221-194-147.compute-1.amazonaws.com:24242/
2015:04:23 21:29:20 20801  MainThread   radical.pilot         : [INFO    ] using database name repex-tests
...

2015:04:23 21:29:30 20801  PilotLauncherWorker-1 radical.pilot         : [DEBUG   ] copy done: ['mput', 'Uploading', '//usr/people/taisung/myenv/lib/python2.7/site', 'sftp>']

2015:04:23 21:29:30 20801  PilotLauncherWorker-1 radical.pilot         : [DEBUG   ] Copying sdist 'file://localhost//usr/people/taisung/myenv/lib/python2.7/site-packages/radical.utils-0.28-py2.7.egg/radical/utils/radical.utils-v0.28-master.tar.gz' to sdist sandbox (sftp://stampede.tacc.utexas.edu/work/00661/tg458185/radical.pilot.sandbox/rp.session.taisung-fedora.taisung.016549.0003-pilot.0000/).
2015:04:23 21:29:31 20801  PilotLauncherWorker-1 radical.pilot         : [ERROR   ] Using bootstrapper /usr/people/taisung/myenv/lib/python2.7/site-packages/radical.pilot-0.28-py2.7.egg/radical/pilot/bootstrapper/default_bootstrapper.sh
Copying bootstrapper 'file://localhost//usr/people/taisung/myenv/lib/python2.7/site-packages/radical.pilot-0.28-py2.7.egg/radical/pilot/bootstrapper/default_bootstrapper.sh' to agent sandbox (sftp://stampede.tacc.utexas.edu/work/00661/tg458185/radical.pilot.sandbox/rp.session.taisung-fedora.taisung.016549.0003-pilot.0000//pilot_bootstrapper.sh).

Copying sdist 'file://localhost//usr/people/taisung/myenv/lib/python2.7/site-packages/radical.utils-0.28-py2.7.egg/radical/utils/radical.utils-v0.28-master.tar.gz' to sdist sandbox (sftp://stampede.tacc.utexas.edu/work/00661/tg458185/radical.pilot.sandbox/rp.session.taisung-fedora.taisung.016549.0003-pilot.0000/).

Pilot launching failed! (File does not exist: '//usr/people/taisung/myenv/lib/python2.7/site-packages/radical.utils-0.28-py2.7.egg/radical/utils/radical.utils-v0.28-master.tar.gz' -  (/usr/people/taisung/myenv/lib/python2.7/site-packages/saga_python-0.28-py2.7.egg/saga/adaptors/shell/shell_file.py +1050 (initialize)  :  raise saga.DoesNotExist("File does not exist: '%s' - %s" % (self.url.path, out))))

Traceback (most recent call last):

  File "/usr/people/taisung/myenv/lib/python2.7/site-packages/radical.pilot-0.28-py2.7.egg/radical/pilot/controller/pilot_launcher_worker.py", line 507, in run
    sdist_file = saga.filesystem.File(sdist_url)
  File "/usr/people/taisung/myenv/lib/python2.7/site-packages/saga_python-0.28-py2.7.egg/saga/filesystem/file.py", line 86, in __init__
    _adaptor, _adaptor_state, _ttype=_ttype)
  File "/usr/people/taisung/myenv/lib/python2.7/site-packages/saga_python-0.28-py2.7.egg/saga/namespace/entry.py", line 89, in __init__
    url, flags, session, ttype=_ttype)
  File "/usr/people/taisung/myenv/lib/python2.7/site-packages/saga_python-0.28-py2.7.egg/saga/base.py", line 101, in __init__
    self._init_task = self._adaptor.init_instance (adaptor_state, *args, **kwargs)
  File "/usr/people/taisung/myenv/lib/python2.7/site-packages/saga_python-0.28-py2.7.egg/saga/adaptors/cpi/decorators.py", line 57, in wrap_function
    return sync_function (self, *args, **kwargs)
  File "/usr/people/taisung/myenv/lib/python2.7/site-packages/saga_python-0.28-py2.7.egg/saga/adaptors/shell/shell_file.py", line 1001, in init_instance
    self.initialize ()
  File "/usr/people/taisung/myenv/lib/python2.7/site-packages/saga_python-0.28-py2.7.egg/saga/adaptors/shell/shell_file.py", line 1050, in initialize
    raise saga.DoesNotExist("File does not exist: '%s' - %s" % (self.url.path, out))
DoesNotExist: File does not exist: '//usr/people/taisung/myenv/lib/python2.7/site-packages/radical.utils-0.28-py2.7.egg/radical/utils/radical.utils-v0.28-master.tar.gz' -  (/usr/people/taisung/myenv/lib/python2.7/site-packages/saga_python-0.28-py2.7.egg/saga/adaptors/shell/shell_file.py +1050 (initialize)  :  raise saga.DoesNotExist("File does not exist: '%s' - %s" % (self.url.path, out)))

2015:04:23 21:29:31 20801  Thread-1     radical.pilot         : [INFO    ] ComputePilot 'pilot.0000' state changed from 'Launching' to 'Failed'.
2015:04:23 21:29:31 20801  Thread-1     radical.repex.pk-patternB-2d: [INFO    ] ComputePilot 'pilot.0000' state changed to Failed.
2015:04:23 21:29:31 20801  Thread-1     radical.repex.pk-patternB-2d: [ERROR   ] Pilot error: [<radical.pilot.logentry.Logentry object at 0x7f45b42f4610>, <radical.pilot.logentry.Logentry object at 0x7f45b42f4650>, <radical.pilot.logentry.Logentry object at 0x7f45b42f4690>, <radical.pilot.logentry.Logentry object at 0x7f45b42f46d0>]
2015:04:23 21:29:31 20801  Thread-1     radical.repex.pk-patternB-2d: [ERROR   ] RepEx execution FAILED.
^C2015:04:23 22:26:21 20801  MainThread   radical.repex.launcher-2d: [INFO    ] Unexpected error: <type 'exceptions.KeyboardInterrupt'>
2015:04:23 22:26:21 20801  MainThread   radical.repex.launcher-2d: [INFO    ] Closing session.

marksantcroos commented 9 years ago

> By default, those pilot sandboxes should live in ~/radical.pilot.sandbox/.

Note that this is not true, but I think Taisung has found them by now.

marksantcroos commented 9 years ago

> Pilot launching failed! (File does not exist: '//usr/people/taisung/myenv/lib/python2.7/site-packages/radical.utils-0.28-py2.7.egg/radical/utils/radical.utils-v0.28-master.tar.gz' - (/usr/people/taisung/myenv/lib/python2.7/site-packages/saga_python-0.28-py2.7.egg/saga/adaptors/shell/shell_file.py +1050 (initialize) : raise saga.DoesNotExist("File does not exist: '%s' - %s" % (self.url.path, out))))

I'm seeing a similar thing on a different machine, seems that sdist is not created (correctly).
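A quick way to check whether the sdist tarballs are present in an environment, as a sketch. The helper name find_sdists is hypothetical, and the file-name patterns are an assumption based on the tarball names that appear in this thread.

```shell
# Hypothetical helper: list bundled source tarballs of the RADICAL stack
# below a directory (defaults to the active virtualenv, else the current dir).
find_sdists() {
    root="${1:-${VIRTUAL_ENV:-.}}"
    # Match *.tar.gz files whose names start with "radical." or "saga"
    find "$root" -type f -name '*.tar.gz' \
        \( -name 'radical.*' -o -name 'saga*' \) 2>/dev/null
}

# Example use:
#   find_sdists "$HOME/myenv"
```

Empty output would suggest that the tarball the launcher tries to copy was never created in that environment.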

andre-merzky commented 9 years ago

> I'm seeing a similar thing on a different machine, seems that sdist is not created (correctly).

Yes, indeed, but I can't reproduce this, yet. Do you have an env or shell script for me which triggers that error? The version of python, pip and setuptools would be interesting, too, I guess.
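A small sketch for collecting those versions in one go. The helper name show_version is hypothetical; radicalpilot-version is the command mentioned elsewhere in this thread, and each tool is skipped gracefully if it is not on the PATH.

```shell
# Hypothetical helper: print "label: version" for a tool, or "not found".
show_version() {
    label="$1"
    shift
    if command -v "$1" >/dev/null 2>&1; then
        # Some tools (e.g. Python 2) print the version to stderr, hence 2>&1;
        # only the last output line is kept.
        printf '%s: %s\n' "$label" "$("$@" 2>&1 | tail -n 1)"
    else
        printf '%s: not found\n' "$label"
    fi
}

show_version python python --version
show_version pip pip --version
show_version setuptools python -c 'import setuptools; print(setuptools.__version__)'
show_version radical.pilot radicalpilot-version
```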

marksantcroos commented 9 years ago

The sdist should not be required in the first place. This points to wrong defaults in the resource configuration file(s).

With regard to the absence of the sdist: I think it only gets created when you actively install the dependencies from source, not when they are pulled in.

andre-merzky commented 9 years ago

Ack on the resource config -- but I can't reproduce the problem with the sdists:

merzky@cameo:/tmp $ virtualenv ve | 0
merzky@cameo:/tmp $ source ve/bin/activate
(ve)merzky@cameo:/tmp $ pip install radical.pilot | 0
(ve)merzky@cameo:/tmp $ du -a ve | grep tar.gz
76  ve/lib/python2.7/site-packages/radical/utils/radical.utils-0.28.tar.gz
344 ve/lib/python2.7/site-packages/radical/pilot/radical.pilot-0.28.tar.gz
400 ve/lib/python2.7/site-packages/saga/saga-python-0.28.tar.gz

Any other idea how to reproduce this? In what context were the sdists missing for you?

marksantcroos commented 9 years ago

> Ack on the resource config

Created https://github.com/radical-cybertools/radical.pilot/issues/578 for that.

> Any other idea how to reproduce this? In what context were the sdists missing for you?

I actually tried but failed; I can't reproduce it anymore. It was on the head node of Titan, with some combination of "manual" installation.

marksantcroos commented 9 years ago

Maybe Taisung can ...

andre-merzky commented 9 years ago

Taisung,

If you continue to have trouble with the current release and that stops you from doing work, you may want to revert to the previous version:

 $ virtualenv ve
 $ source ve/bin/activate
 $ pip install 'radical.utils<0.28'
 $ pip install 'saga-python<0.28'
 $ pip install 'radical.pilot<0.28'
 $ radicalpilot-version

should give you (with RADICAL_VERBOSE, SAGA_VERBOSE and RADICAL_PILOT_VERBOSE all set to DEBUG):

2015:04:26 23:41:31 3692   MainThread   radical               : [INFO    ] python.interpreter   version: 2.7.5+ (default, Sep 17 2013, 15:31:50) [GCC 4.8.1]
2015:04:26 23:41:31 3692   MainThread   radical               : [INFO    ] radical.utils        version: v0.8 (v0.8)
2015:04:26 23:41:31 radical.pilot.MainProcess: [INFO    ] radical.pilot version: 0.26 (v0.26)
2015:04:26 23:41:31 3692   MainThread   saga                  : [INFO    ] saga-python          version: v0.27
v0.26

But we would still be interested to know, of course, what caused your troubles, so please let us know if you have any details on how you installed the new radical.pilot release (and dependencies).

Best, Andre.

taisung commented 9 years ago

Antons seems to have identified the problem: it occurs if I run “python setup.py install” for each module, but everything is fine if I use “pip install --upgrade .” instead.

Please check whether you can reproduce it.

Taisung

taisung commented 9 years ago

Please try the attached script to see if you can reproduce the bug by running it

“source FreshRun new”.

I found that, in the script:

  1. Using python setup.py to install modules -> no luck
  2. Using pip to install modules:
     a. Comment out the last statement “sh launcher_amber_2d.sh” and run it separately -> works
     b. Include the last statement “sh launcher_amber_2d.sh” -> sometimes works, sometimes doesn't

I tried on two separate machines and got the same results. Maybe you guys can run and try. (Remember to change the directory setting)

Taisung

andre-merzky commented 9 years ago

I'm afraid I don't see the script attached, at least not on the ticket on https://github.com/radical-cybertools/RepEx/issues/23#issuecomment-97039149 . Could you please send it by mail, or paste it if it is not too long? Thanks!

antonst commented 9 years ago

If you do:

rm -rf $HOME/myenv;
virtualenv $HOME/myenv;
source $HOME/myenv/bin/activate;
export RADICAL_PILOT_VERBOSE=info;
export RADICAL_REPEX_VERBOSE=info;
git clone -b feature/2d-prof https://github.com/radical-cybertools/RepEx.git;
cd RepEx;
python setup.py install;
cd examples/amber_pattern_b_2d

modify amber_input.json, e.g. allocation, resource etc.

sh ./launcher_amber_2d.sh

everything should work. There is a bug in the feature/2d-prof branch (it will be fixed soon), but for a small number of replicas you should not experience any problems.

For your script it should be sufficient to remove:

for module in "radical.utils" "saga-python" "radical.pilot"
do
   echo "\n Doing $module \n"
   if [ "$#" -gt "0" ]; then
        cd $cdi_root
        rm -rf $module
        git clone -b $branch https://github.com/radical-cybertools/$module.git
        cd $cdi_root/$module
   else
        cd $cdi_root/$module
        git pull origin $branch
   fi
   #python setup.py install
   pip install --upgrade .
done

since 'python setup.py install' in RepEx installs radical.pilot and its dependencies anyway, I don't see any use for that loop. If you want to re-install everything, just do:

rm -rf $HOME/myenv;
virtualenv $HOME/myenv;
source $HOME/myenv/bin/activate;
export RADICAL_PILOT_VERBOSE=info;
export RADICAL_REPEX_VERBOSE=info;
cd RepEx;
git pull origin feature/2d-prof;
python setup.py install;
cd examples/amber_pattern_b_2d

antonst commented 9 years ago

Original script is:

cdi_root=/Data/CDI-SAGA
branch=master

rm -rf $HOME/myenv
virtualenv $HOME/myenv
source $HOME/myenv/bin/activate
export RADICAL_PILOT_VERBOSE=DEBUG 

for module in "radical.utils" "saga-python" "radical.pilot"
do
   echo "\n Doing $module \n"
   if [ "$#" -gt "0" ]; then
        cd $cdi_root
        rm -rf $module
        git clone -b $branch https://github.com/radical-cybertools/$module.git
        cd $cdi_root/$module
   else
        cd $cdi_root/$module
        git pull origin $branch
   fi
   #python setup.py install
   pip install --upgrade .
done

module=RepEx
branch=feature/2d-prof
   echo "\n Doing $module \n"
   if [ "$#" -gt "0" ]; then
        cd $cdi_root
        rm -rf $module
        git clone -b $branch https://github.com/radical-cybertools/$module.git
        cd $cdi_root/$module
   else
        cd $cdi_root/$module
        rm -rf $cdi_root/RepEx/examples/amber_pattern_b_2d/amber_input.json
        git pull origin $branch
   fi
   python setup.py install

cd $cdi_root/RepEx/examples/amber_pattern_b_2d
cp $cdi_root/amber_input.json.saved amber_input.json
sh launcher_amber_2d.sh

antonst commented 9 years ago

I can reproduce the problem with:

rm -rf $HOME/myenv;
virtualenv $HOME/myenv;
source $HOME/myenv/bin/activate;
export RADICAL_PILOT_VERBOSE=info;
export RADICAL_REPEX_VERBOSE=info;

git clone -b master https://github.com/radical-cybertools/radical.utils.git;
cd radical.utils;
python setup.py install;
cd ..;

git clone -b master https://github.com/radical-cybertools/saga-python.git;
cd saga-python;
python setup.py install;
cd ..;

git clone -b master https://github.com/radical-cybertools/radical.pilot.git;
cd radical.pilot;
python setup.py install;
cd ..;

cd RepEx;
git pull origin feature/2d-prof;
python setup.py install;
cd examples/amber_pattern_b_2d;

which for the same simulation would give:

2015:04:29 02:27:48 13906  PilotLauncherWorker-1 radical.pilot         : [ERROR   ] Using bootstrapper /home/treikalis/myenv/lib/python2.7/site-packages/radical.pilot-0.29-py2.7.egg/radical/pilot/bootstrapper/default_bootstrapper.sh
Copying bootstrapper 'file://localhost//home/treikalis/myenv/lib/python2.7/site-packages/radical.pilot-0.29-py2.7.egg/radical/pilot/bootstrapper/default_bootstrapper.sh' to agent sandbox (sftp://stampede.tacc.utexas.edu/work/02457/antontre/radical.pilot.sandbox/rp.session.ip-10-184-31-85.treikalis.016554.0004-pilot.0000//pilot_bootstrapper.sh).
Copying sdist 'file://localhost//home/treikalis/myenv/local/lib/python2.7/site-packages/radical.utils-0.28-py2.7.egg/radical/utils/radical.utils-v0.28-master.tar.gz' to sdist sandbox (sftp://stampede.tacc.utexas.edu/work/02457/antontre/radical.pilot.sandbox/rp.session.ip-10-184-31-85.treikalis.016554.0004-pilot.0000/).
Pilot launching failed! (File does not exist: '//home/treikalis/myenv/local/lib/python2.7/site-packages/radical.utils-0.28-py2.7.egg/radical/utils/radical.utils-v0.28-master.tar.gz' -  (/home/treikalis/myenv/local/lib/python2.7/site-packages/saga_python-0.28-py2.7.egg/saga/adaptors/shell/shell_file.py +1050 (initialize)  :  raise saga.DoesNotExist("File does not exist: '%s' - %s" % (self.url.path, out))))
Traceback (most recent call last):
  File "/home/treikalis/myenv/lib/python2.7/site-packages/radical.pilot-0.29-py2.7.egg/radical/pilot/controller/pilot_launcher_worker.py", line 507, in run
    sdist_file = saga.filesystem.File(sdist_url)
  File "/home/treikalis/myenv/local/lib/python2.7/site-packages/saga_python-0.28-py2.7.egg/saga/filesystem/file.py", line 86, in __init__
    _adaptor, _adaptor_state, _ttype=_ttype)
  File "/home/treikalis/myenv/local/lib/python2.7/site-packages/saga_python-0.28-py2.7.egg/saga/namespace/entry.py", line 89, in __init__
    url, flags, session, ttype=_ttype)
  File "/home/treikalis/myenv/local/lib/python2.7/site-packages/saga_python-0.28-py2.7.egg/saga/base.py", line 101, in __init__
    self._init_task = self._adaptor.init_instance (adaptor_state, *args, **kwargs)
  File "/home/treikalis/myenv/local/lib/python2.7/site-packages/saga_python-0.28-py2.7.egg/saga/adaptors/cpi/decorators.py", line 57, in wrap_function
    return sync_function (self, *args, **kwargs)
  File "/home/treikalis/myenv/local/lib/python2.7/site-packages/saga_python-0.28-py2.7.egg/saga/adaptors/shell/shell_file.py", line 1001, in init_instance
    self.initialize ()
  File "/home/treikalis/myenv/local/lib/python2.7/site-packages/saga_python-0.28-py2.7.egg/saga/adaptors/shell/shell_file.py", line 1050, in initialize
    raise saga.DoesNotExist("File does not exist: '%s' - %s" % (self.url.path, out))
DoesNotExist: File does not exist: '//home/treikalis/myenv/local/lib/python2.7/site-packages/radical.utils-0.28-py2.7.egg/radical/utils/radical.utils-v0.28-master.tar.gz' -  (/home/treikalis/myenv/local/lib/python2.7/site-packages/saga_python-0.28-py2.7.egg/saga/adaptors/shell/shell_file.py +1050 (initialize)  :  raise saga.DoesNotExist("File does not exist: '%s' - %s" % (self.url.path, out)))

taisung commented 9 years ago

Hi Antons,

Thanks for your comments. Would you mind trying the script I sent you?

If you are right and the loop is useless, then why does it stop the execution?

Taisung

antonst commented 9 years ago

> Would you mind trying the script I sent you?

OK, I will.

> If you are right and the loop is useless, then why does it stop the execution?

I don't understand what you mean. All I am trying to say is that there is no need to install radical.pilot, saga-python, and radical.utils separately if they all come from the master branch.

antonst commented 9 years ago

Taisung, I have slightly modified your script (should not affect any behavior) and it worked fine for me:

cdi_root=/home/treikalis/experiments/repex-t
branch=master

rm -rf $HOME/myenv
virtualenv $HOME/myenv
source $HOME/myenv/bin/activate

export RADICAL_PILOT_VERBOSE=DEBUG

for module in "radical.utils" "saga-python" "radical.pilot"
do
   echo "\n Doing $module \n"
   if [ "$#" -gt "0" ]; then
        cd $cdi_root
        rm -rf $module
        git clone -b $branch https://github.com/radical-cybertools/$module.git
        cd $cdi_root/$module
   else
        #cd $cdi_root/$module
        #git pull origin $branch
        cd $cdi_root
        rm -rf $module
        git clone -b $branch https://github.com/radical-cybertools/$module.git
        cd $module
   fi
   pip install --upgrade .
done

module=RepEx
branch=feature/2d-prof
   echo "\n Doing $module \n"
   if [ "$#" -gt "0" ]; then
        cd $cdi_root
        rm -rf $module
        git clone -b $branch https://github.com/radical-cybertools/$module.git
        cd $cdi_root/$module
   else
        cd $cdi_root/$module
        #rm -rf $cdi_root/RepEx/examples/amber_pattern_b_2d/amber_input.json
        git pull origin $branch
   fi
   python setup.py install

cd examples/amber_pattern_b_2d
sh launcher_amber_2d.sh