FNNDSC / projectman

FNNDSC project manager

(Martinos) ChRIS -- job execution #101

Closed rudolphpienaar closed 9 years ago

rudolphpienaar commented 9 years ago

machris jobs on cluster

Overview

As of 8-Sept-2015 it seems as if jobs on machris are not actually running on the cluster.

rudolphpienaar commented 9 years ago

@NicolasRannou could you try and take a look... ugh... I don't know why jobs don't seem to want to run now. We should document all the debugging steps so that I can also properly look at this.

rudolphpienaar commented 9 years ago

To get to machris from BCH, go to https://tautona:1143

NicolasRannou commented 9 years ago

It seems that the version of glibc on the MGH cluster changed...

[rudolph@launchpad:x86_64-Linux]~$>source /space/machris/users/rudolph/mri_convert/9_9_2015_10_43_18-1679/0_0_4583649-010Y-F/_chrisRun_/chris.env && /space/machris/src/chrisreloaded/lib/_common/crun.py -u rudolph --out /space/machris/users/rudolph/mri_convert/9_9_2015_10_43_18-1679/0_0_4583649-010Y-F/_chrisRun_ --err /space/machris/users/rudolph/mri_convert/9_9_2015_10_43_18-1679/0_0_4583649-010Y-F/_chrisRun_ --host launchpad -s crun_hpc_launchpad --saveJobID /space/machris/users/rudolph/mri_convert/9_9_2015_10_43_18-1679/0_0_4583649-010Y-F/_chrisRun_ -c '/bin/bash /space/machris/users/rudolph/mri_convert/9_9_2015_10_43_18-1679/0_0_4583649-010Y-F/_chrisRun_/chris.run'
Traceback (most recent call last):
  File "/space/machris/src/chrisreloaded/lib/_common/crun.py", line 27, in <module>
    import systemMisc as misc
  File "/autofs/space/machris/src/chrisreloaded/lib/_common/systemMisc.py", line 34, in <module>
    from            numpy           import *
  File "/space/machris/lib/py/numpy/__init__.py", line 153, in <module>
    from . import add_newdocs
  File "/space/machris/lib/py/numpy/add_newdocs.py", line 13, in <module>
    from numpy.lib import add_newdoc
  File "/space/machris/lib/py/numpy/lib/__init__.py", line 8, in <module>
    from .type_check import *
  File "/space/machris/lib/py/numpy/lib/type_check.py", line 11, in <module>
    import numpy.core.numeric as _nx
  File "/space/machris/lib/py/numpy/core/__init__.py", line 6, in <module>
    from . import multiarray
ImportError: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /space/machris/lib/py/numpy/core/multiarray.so)

So if we update numpy in /space/machris/lib/py/numpy it should work.

However, I am not sure which is the best way to update it.
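
To make the failure easier to spot in logs, here is a minimal sketch that pulls the required glibc version and the offending extension module out of the loader message above. The function name `parse_glibc_error` and the regular expression are illustrative assumptions, not part of crun.py, and the sketch is written for Python 3.

```python
import re

# Pattern for the dynamic-loader message shown in the traceback above.
_GLIBC_ERR = re.compile(
    r"(?P<libc>\S+): version `GLIBC_(?P<version>[\d.]+)' not found "
    r"\(required by (?P<module>[^)]+)\)"
)

def parse_glibc_error(message):
    """Return (libc_path, required_version_tuple, module_path), or None."""
    match = _GLIBC_ERR.search(message)
    if match is None:
        return None
    # Keep the version as a tuple of ints so it can be compared numerically.
    version = tuple(int(part) for part in match.group("version").split("."))
    return match.group("libc"), version, match.group("module")

msg = ("ImportError: /lib64/libc.so.6: version `GLIBC_2.14' not found "
       "(required by /space/machris/lib/py/numpy/core/multiarray.so)")
result = parse_glibc_error(msg)
print(result)
```

The parsed tuple can then be compared against whatever glibc version the target host reports.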

glibc 2.19 on machris.

[chris@machris:x86_64-Linux]...aded/controller(master@4506-aded-)$>/lib/x86_64-linux-gnu/libc.so.6 
GNU C Library (Ubuntu GLIBC 2.19-10ubuntu2.3) stable release version 2.19, by Roland McGrath et al.

glibc 2.12 on launchpad.

[rudolph@launchpad:x86_64-Linux]~$>/lib64/libc.so.6
GNU C Library stable release version 2.12, by Roland McGrath et al.
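
A small sketch of the comparison implied by the two banners above: the numpy extension asks for GLIBC_2.14, machris reports 2.19, and launchpad reports 2.12. The helper names `banner_version` and `satisfies` are made up for illustration, and the only inputs are the banner strings already quoted in this thread.

```python
import re

def banner_version(banner):
    """Extract (major, minor) from a glibc banner line, or None if absent."""
    match = re.search(r"release version (\d+)\.(\d+)", banner)
    return (int(match.group(1)), int(match.group(2))) if match else None

def satisfies(installed, required):
    """True when the installed glibc is at least the required version."""
    return installed >= required

REQUIRED = (2, 14)  # from the GLIBC_2.14 symbol version in the ImportError

banners = {
    "machris": "GNU C Library (Ubuntu GLIBC 2.19-10ubuntu2.3) stable release "
               "version 2.19, by Roland McGrath et al.",
    "launchpad": "GNU C Library stable release version 2.12, "
                 "by Roland McGrath et al.",
}

report = {host: satisfies(banner_version(text), REQUIRED)
          for host, text in banners.items()}
print(report)
```

Comparing integer tuples rather than strings matters here: as strings, "2.9" would sort after "2.14", while the tuple (2, 9) correctly sorts before (2, 14).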

The strange thing is that it used to work... anyway keep digging...

NicolasRannou commented 9 years ago

On a different but related note, crun should probably be a project by itself so we can test it properly on a daily basis. It could also import only what it needs (not sure why it would need numpy).
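
One way to sketch the "import only what it needs" idea: defer the numpy import to the code path that uses it, and fall back when the import fails. This is a hypothetical pattern, not how crun.py or systemMisc.py is written; `optional_import` and `mean` are names invented for the example, which assumes Python 3.

```python
import importlib

def optional_import(name):
    """Return the named module, or None if importing it fails."""
    try:
        return importlib.import_module(name)
    except ImportError:
        # A glibc mismatch in a compiled extension also surfaces as ImportError.
        return None

def mean(values):
    """Hypothetical helper: use numpy when importable, else pure Python."""
    np = optional_import("numpy")
    if np is not None:
        return float(np.mean(values))
    return sum(values) / float(len(values))

print(mean([1, 2, 3]))
```

With this shape, a broken numpy would only affect callers that actually need it, and something like `--help` could still run.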

NicolasRannou commented 9 years ago

it seems that there is something wrong with the paths being sourced:

$> ssh rudolph@launchpad
[rudolph@launchpad:x86_64-Linux]~$>/space/machris/src/chrisreloaded/lib/_common/crun.py --help
/autofs/space/machris/src/chrisreloaded/lib/_common/systemMisc.py:37: DeprecationWarning: The popen2 module is deprecated.  Use the subprocess module.
  import          popen2, fcntl, select
usage: crun.py [-h] [-u USER] [--host HOST] [--port PORT] [-s SCHEDULER]
               [-q QUEUE] [-o OUT] [-e ERR] [-m MAIL] [--waitForChild]
               [--no-waitForChild] [--echo] [--no-echo] [--echoStdOut]
               [--no-echoStdOut] [--detach] [--no-detach] [--sshDetach]
               [--no-sshDetach] [--printElapsedTime] [--no-printElapsedTime]
               [--setDefaultFlags] [--no-setDefaultFlags] [--blockOnChild]
               [--kill jobID/FILE | -c COMMAND] [--saveJobID PATH]

NicolasRannou commented 9 years ago

[rudolph@launchpad:x86_64-Linux]~$>export PYTHONPATH=/space/machris/lib/py:$PYTHONPATH
[rudolph@launchpad:x86_64-Linux]~$>/space/machris/src/chrisreloaded/lib/_common/crun.py --help
Traceback (most recent call last):
  File "/space/machris/src/chrisreloaded/lib/_common/crun.py", line 27, in <module>
    import systemMisc as misc
  File "/autofs/space/machris/src/chrisreloaded/lib/_common/systemMisc.py", line 34, in <module>
    from            numpy           import *
  File "/space/machris/lib/py/numpy/__init__.py", line 153, in <module>
    from . import add_newdocs
  File "/space/machris/lib/py/numpy/add_newdocs.py", line 13, in <module>
    from numpy.lib import add_newdoc
  File "/space/machris/lib/py/numpy/lib/__init__.py", line 8, in <module>
    from .type_check import *
  File "/space/machris/lib/py/numpy/lib/type_check.py", line 11, in <module>
    import numpy.core.numeric as _nx
  File "/space/machris/lib/py/numpy/core/__init__.py", line 6, in <module>
    from . import multiarray
ImportError: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /space/machris/lib/py/numpy/core/multiarray.so)
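
The two runs above differ only in PYTHONPATH, which suggests that the extra entry changes which copy of a module gets picked up. The sketch below illustrates that effect with throwaway directories; `demo_mod` and the `resolve` helper are made up for the example, and it assumes Python 3 (the `importlib.machinery.PathFinder` API), which may not match the interpreter on launchpad.

```python
import os
import tempfile
from importlib.machinery import PathFinder

def resolve(name, search_path):
    """Return the file a top-level module would load from, or None."""
    spec = PathFinder.find_spec(name, search_path)
    return spec.origin if spec is not None else None

# Two temporary directories stand in for a system location and for an
# extra PYTHONPATH entry; both contain a module with the same name.
system_dir = tempfile.mkdtemp()
extra_dir = tempfile.mkdtemp()
for directory in (system_dir, extra_dir):
    with open(os.path.join(directory, "demo_mod.py"), "w") as handle:
        handle.write("WHERE = %r\n" % directory)

# Without the extra entry, the module resolves to the system copy.
before = resolve("demo_mod", [system_dir])
# With the extra entry first (as PYTHONPATH entries are), it shadows it.
after = resolve("demo_mod", [extra_dir, system_dir])
print(before)
print(after)
```

Running the same kind of lookup for `numpy` on launchpad, with and without the exported PYTHONPATH, would show which copy each invocation resolves to.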

NicolasRannou commented 9 years ago

@rudolphpienaar Did you add anything in:

/home/chris/lib/py

by any chance?

rudolphpienaar commented 9 years ago

Good bug hunting!

I added some modules to that path for "med2image". Hmmm... I can't imagine they'd interfere. Will check when at office.

NicolasRannou commented 9 years ago

Yes that is a bit odd...

NicolasRannou commented 9 years ago

@rudolphpienaar can we close this one?

rudolphpienaar commented 9 years ago

Yes. Closing...