Open andre-merzky opened 11 years ago
Hello, I cannot reproduce this bug.
What does your .bashrc look like? Do you have 'module load torque'? It also looks like you are doing a module load python and overwriting whatever is in your virtualenv.
Hi Melissa,
my .bashrc contains:
module load git
module load vim
module load python
source ~/.bigjob/python/bin/activate
and nothing (of importance) thereafter. I have no torque module, but on login I see:
torque/2.5.5 version 2.5.5 loaded
moab version 5.4.0 loaded
and also I can check with:
(python)[merzky@i136 ~]$ module list
Currently Loaded Modulefiles:
1) torque/2.5.5 2) moab/5.4.0 3) git/1.7.8.3 4) vim/7.2 5) python/2.7
That looks ok I assume?
Do you happen to have any further ideas on how to debug this? But also: no matter what the user setup is, it should not result in such a low-level exception IMHO...
Many thanks! :-)
Andre.
Hi Andre!
I took a look...
I am not using the Python module, but used python2.6 directly, e.g.:
azebro1@i136:~$ python
Python 2.4.3 (#1, Oct 23 2012, 22:02:41)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-54)] on linux2
azebro1@i136:~$ python2.6 --version
Python 2.6.8
azebro1@i136:~$ source ~/saga-python-env/bin/activate
(saga-python-env)azebro1@i136:~$ python --version
Python 2.6.8
So, maybe make a virtualenv with python2.6 and give that a spin?
Here are my modules:
torque/2.5.5 version 2.5.5 loaded
moab version 5.4.0 loaded
git version 1.7.8.3 loaded
What -may- be happening is that your .bashrc is somehow being sourced during the bootstrap, so its module load python line runs again and clobbers your virtualenv.
Could you try removing the python line from your bashrc and using python2.6 to create the virtualenv? That would (hopefully) clear things up...
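If the .bashrc really is being read during the bootstrap, one way to test that theory is to run the module and virtualenv lines only in interactive shells. The following is a minimal sketch, not something BigJob requires: `is_interactive` is a hypothetical helper name, and the guarded lines are simply the ones quoted from the .bashrc above.

```shell
# ~/.bashrc (sketch): run the environment setup only in interactive shells.

# Hypothetical helper: succeeds if the given shell flags (default: $-)
# contain "i", which the shell sets for interactive sessions.
is_interactive() {
    case ${1-$-} in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

if is_interactive; then
    # Lines taken from the .bashrc quoted in this thread.
    module load git
    module load vim
    module load python
    . ~/.bigjob/python/bin/activate
fi
```

With this guard a non-interactive shell skips the whole block, so the agent would have to find a suitable Python on its own; whether it does is a separate question.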
Thanks for the feedback, Ashley! I tried that, but alas it only changes the error to:
(python)[merzky@i136 agent]$ cat stderr-bj-b7ae3afa-d20d-11e2-ab76-00231582da34-agent.txt
git version 1.7.8.3 loaded
python: error while loading shared libraries: libpython2.6.so.1.0: cannot open shared object file: No such file or directory
so it seems that the 2.6 installation does not include the devel part, which hiccups on the paramiko dependency of BigJob (I guess)... Am I missing any settings for the devel libs?
Thanks, Andre.
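One way to narrow down an "error while loading shared libraries" message is to ask the dynamic loader which libraries the interpreter cannot resolve. This is only a sketch: `missing_libs` is a hypothetical helper, and it assumes `ldd` is available on the machine. If libpython2.6.so.1.0 shows up as "not found", that would point at the library search path (for example LD_LIBRARY_PATH) rather than at paramiko.

```shell
# Hypothetical helper: print the shared libraries a binary cannot resolve.
missing_libs() {
    # ldd lists each dependency; unresolved ones are marked "not found".
    # "|| true" keeps the exit status clean when nothing is missing.
    ldd "$1" 2>/dev/null | grep 'not found' || true
}

# Check whichever python is first on the current PATH.
missing_libs "$(command -v python)"
```

Running this inside the same environment the agent uses (same modules, same virtualenv) is what makes the result meaningful.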
@andre-merzky - is this still a ticket for BigJob? It seems like a ticket for yourself and your environment...
Let me check if the problem persists...
Andre -- what's the status?
I didn't look at BJ/india for a while -- but will have to do so sometime soon anyways, so will report back then.
This seems to be the same problem as for sagapilot: when I source a virtualenv in my ~/.bashrc, things go terribly wrong with the agents. IMHO, the agent bootstrap script should call deactivate as the very first action...
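A sketch of what such a first step could look like is given here. It is an assumption about how a bootstrap script might do this, not the actual BigJob or sagapilot code, and `leave_virtualenv` is a hypothetical name. The `deactivate` function only exists in the shell that sourced `activate`, so the sketch falls back to cleaning up the inherited environment by hand.

```shell
# Hypothetical bootstrap preamble: leave any inherited virtualenv.
leave_virtualenv() {
    # Nothing to do if no virtualenv is active.
    [ -n "${VIRTUAL_ENV:-}" ] || return 0

    if command -v deactivate >/dev/null 2>&1; then
        # A deactivate command is available (activate defines it as a
        # shell function), so let it restore the environment.
        deactivate
    else
        # Only the environment was inherited: rebuild PATH without the
        # virtualenv's bin directory. The subshell keeps the IFS and
        # globbing changes local.
        PATH=$(
            IFS=:
            set -f
            out=
            for dir in $PATH; do
                [ "$dir" = "$VIRTUAL_ENV/bin" ] && continue
                out="${out:+$out:}$dir"
            done
            printf '%s' "$out"
        )
        export PATH
        unset VIRTUAL_ENV
    fi
    hash -r 2>/dev/null   # forget cached command locations
}

leave_virtualenv
```

After this runs, `python` resolves to whatever the rest of PATH provides, so the bootstrap can set up its own interpreter from a known state.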
On india, with python-2.7 module loaded, and using develop-prod, I see:
The trouble-making line in question seems to get a None return value from the install_python method -- but that method is somewhat, aehm, complex, so I am not sure how to debug it in the remote deployment setting. Any advice?