Open csampat opened 7 years ago
This is an RP issue, specifically #1468.
Please cancel your jobs for now. I will let you know about the next steps.
This is also a duplicate of #26.
Please clone radical.pilot, radical.utils and saga-python repos, if you haven't cloned them already.
With a Python virtualenv activated, run:
cd <PATH>/radical.utils
git checkout rc/v0.46.3
pip install . --upgrade
cd <PATH>/saga-python
git checkout rc/v0.46.3
pip install . --upgrade
cd <PATH>/radical.pilot
git checkout experiment/cybermanufacturing
pip install . --upgrade
After installing everything, the radical-stack output should look like:
python : 2.7.14
pythonpath :
virtualenv : RpCyberExp
radical.pilot : 0.47-v0.46.2-186-g2648ca4@experiment-cybermanufacturing
radical.utils : 0.47-v0.46-63-gc1ae8ac@rc-v0.46.3
saga : 0.47-v0.46-20-g8ea2302@rc-v0.46.3
Then give it a try.
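To compare an installed stack against the expected output above, a small script can read the `name : value` lines and check the branch after the `@`. This is only a sketch: `parse_stack`, `branch_of` and `mismatched_branches` are hypothetical helpers, not part of any RADICAL package, and the sketch assumes the branch is always the text following `@`, as in the listing above.

```python
from __future__ import print_function


def parse_stack(text):
    """Map each 'name : value' line of radical-stack output to its value."""
    entries = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            entries[name.strip()] = value.strip()
    return entries


def branch_of(version_detail):
    """Return the part after '@', or None if there is no '@'."""
    _, sep, branch = version_detail.partition("@")
    return branch if sep else None


# Branch names as they appear in the expected output above
# (the '/' of the git branch name is shown as '-').
EXPECTED_BRANCHES = {
    "radical.pilot": "experiment-cybermanufacturing",
    "radical.utils": "rc-v0.46.3",
    "saga": "rc-v0.46.3",
}


def mismatched_branches(text, expected=EXPECTED_BRANCHES):
    """List (name, found, wanted) for every component on the wrong branch."""
    entries = parse_stack(text)
    problems = []
    for name, wanted in sorted(expected.items()):
        found = branch_of(entries.get(name, ""))
        if found != wanted:
            problems.append((name, found, wanted))
    return problems
```

An empty list from `mismatched_branches` means all three components report the expected branch. The commit counts and hashes are deliberately not compared, since they change with every new commit on the branch.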
Okay, I did all that, but this is my radical-stack:
Successfully installed saga-python-0.46.1
(rp_2) chai@xcalibur:~/Documents/git/saga-python$ radical-stack
python : 2.7.13
pythonpath :
virtualenv : /home/chai/Documents/git/rp_fix/src/RADICAL_Pilot/rp_2
radical.pilot : 0.47-v0.46.2-18-ge0355d21@fix-ibrun_cpn
radical.utils : 0.47-v0.46-10-gc515db1@devel
saga : 0.47-v0.46-5-g74fc3811@devel
(rp_2) chai@xcalibur:~/Documents/git/saga-python$ python -c "import saga;print saga.version_detail"
0.46.1-v0.46-1-gabcd7b68@rc-v0.46.3
(rp_2) chai@xcalibur:~/Documents/git/saga-python$ python -c "import radical.utils;print radical.utils.version_detail"
0.46.2-v0.46-4-gbac8d67@rc-v0.46.3
(rp_2) chai@xcalibur:~/Documents/git/saga-python$ python -c "import radical.pilot;print radical.pilot.version_detail"
0.47-v0.46.2-186-g2648ca47@experiment-cybermanufacturing
(rp_2) chai@xcalibur:~/Documents/git/saga-python$
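In the transcript above, radical-stack and the direct `python -c` imports report different branches for saga and radical.utils. One way to investigate is to print where each module is actually loaded from, together with its `version_detail`. This is a sketch under assumptions: `describe_module` is a hypothetical helper, and it only relies on the `version_detail` attribute used in the commands above.

```python
from __future__ import print_function

import importlib


def describe_module(name):
    """Return (file, version_detail) for an importable module.

    version_detail is None when the module does not define that attribute.
    """
    module = importlib.import_module(name)
    path = getattr(module, "__file__", None)
    detail = getattr(module, "version_detail", None)
    return path, detail


if __name__ == "__main__":
    for name in ("radical.utils", "saga", "radical.pilot"):
        try:
            path, detail = describe_module(name)
        except ImportError as exc:
            print("%-14s not importable: %s" % (name, exc))
        else:
            print("%-14s %s  (%s)" % (name, detail, path))
```

If a printed path points outside the active virtualenv, the interpreter is probably picking up a different copy than the one just installed.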
Okay, this morning I did a fresh clone and reinstalled, but I guess I am now a few commits ahead of the versions you have. New radical-stack:
(rp_2) chai@xcalibur:~/Documents/git/saga-python$ radical-stack
python : 2.7.13
pythonpath :
virtualenv : /home/chai/Documents/git/rp_fix/src/RADICAL_Pilot/rp_2
radical.pilot : 0.47-v0.46.2-186-g2648ca47@experiment-cybermanufacturing
radical.utils : 0.47-v0.46-73-gd580ab1@rc-v0.46.3
saga : 0.47-v0.46-32-ga2f9dedc@rc-v0.46.3
It looks okay!
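The version strings appear to follow the `git describe` pattern (`<tag>-<commits since tag>-g<hash>`), so "a few commits ahead" can be read off the middle number. The sketch below assumes that pattern. `commits_past_tag` and `commits_ahead` are hypothetical helpers, and the difference is only meaningful when both strings refer to the same tag and branch.

```python
import re

# Matches the '-<tag>-<count>-g<hash>' part of a version_detail string,
# e.g. '-v0.46-73-gd580ab1' in '0.47-v0.46-73-gd580ab1@rc-v0.46.3'.
_DESCRIBE = re.compile(r"-(v[\w.]+)-(\d+)-g([0-9a-f]+)(?:@|$)")


def commits_past_tag(version_detail):
    """Return (tag, commit_count), or None if the string does not match."""
    match = _DESCRIBE.search(version_detail)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def commits_ahead(mine, theirs):
    """Commit-count difference between two strings based on the same tag.

    Returns None when either string cannot be parsed or the tags differ.
    """
    a = commits_past_tag(mine)
    b = commits_past_tag(theirs)
    if a is None or b is None or a[0] != b[0]:
        return None
    return a[1] - b[1]
```

Applied to the radical.utils strings in this thread, `commits_ahead("0.47-v0.46-73-gd580ab1@rc-v0.46.3", "0.47-v0.46-63-gc1ae8ac@rc-v0.46.3")` gives 10, which is consistent with a fresh clone of a branch that has since received new commits.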
So the client side indicates that all the CUs have completed, and it calculates the total time required, but this is not reflected on Stampede2: the jobs are still running on the node. The jobs completed in about 12 to 13 hours, but according to
showq -u
the older jobs are still executing and have reached 16 hours of execution time.