cnr-ibf-pa / hbp-bsp-issues

Ticketing system for developers/testers and power users of the Brain Simulation Platform of the Human Brain Project

Installation of neuron7.7 on PizDaint, Jureca and Galileo #525

Closed clupascu closed 4 years ago

clupascu commented 4 years ago

Since the Synaptic fitting use cases must be updated to Python3, I need to know whether neuron7.7 (which works with Python3) can be installed on PizDaint, Jureca and Galileo. Please let me know how I can use the new modules. Thank you.

Related task: #533

pramodk commented 4 years ago

The order of deployment follows, but make sure the existing module names remain intact (e.g. the ones that are already built with Python2).

pramodk commented 4 years ago

@jorblancoa : this ticket requires python3 neuron module

jorblancoa commented 4 years ago

Hi @clupascu

We have installed the new modules on Jureca and Piz-Daint. Could you please test them and let us know if you encounter any problems?

**Jureca**

* For Booster:

module use /p/project/cvsk25/software-deployment/HBP/jureca-booster/26-02-2020/modules/tcl/linux-centos7-haswell

module load neuron/7.8.0b-serial-python3

* For Cluster:

module --force purge all
module use /usr/local/software/jureca/OtherStages
module load Architecture/Haswell
module load Stages/2019a
module load Intel ParaStationMPI/5.2.2-1-mt imkl
module load HDF5 Boost Python/3.6.8

module use /p/project/cvsk25/software-deployment/HBP/jureca-cluster/26-02-2020/modules/tcl/linux-centos7-haswell

module load neuron/7.8.0b-serial-python3


**Piz-Daint**

export MODULEPATH=/apps/hbp/ich002/hbp-spack-deployments/modules:$MODULEPATH
module use /apps/hbp/ich002/hbp-spack-deployments/softwares/27-02-2020/install/modules/tcl/cray-cnl7-haswell
module swap PrgEnv-cray PrgEnv-intel
module load daint-mc
module load cray-python/3.6.5.7

module load neuron/7.8.0b/intel-serial-python3



Thanks!

alex4200 commented 4 years ago

@clupascu Will test with python3

clupascu commented 4 years ago

@jorblancoa what about Galileo? Is there already an installation of neuron7.7 on Galileo?

jorblancoa commented 4 years ago

@clupascu @pramodk is going to take care of Galileo. But before deploying everywhere, we want to be sure that the modules work for you. If you could test them on Daint or Jureca, that would be great!

Thanks!

clupascu commented 4 years ago

@jorblancoa I tried on Jureca, but it seems that the order of the parameters is not working anymore.

I was using this before

srun ./x86_64/special 'config-exp1.txt' 'exp1.txt' 'ProbGABAAB_EMS_GEPH_g.mod' False True False 3 -mpi -python fitting.py

Any suggestion?

jorblancoa commented 4 years ago

Could you post the error message or log file so I can have a look at the issue?

clupascu commented 4 years ago

Traceback (most recent call last):
  File "fitting.py", line 251, in <module>
    fitting(sys.argv[1],sys.argv[2],sys.argv[3],sys.argv[4],sys.argv[5],sys.argv[6],sys.argv[7])
  File "fitting.py", line 53, in fitting
    singletrace_number = int(singletrace_number)
ValueError: invalid literal for int() with base 10: 'True'

This is the error message. It looks as if the argument 'True' is read instead of the argument 3.
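The traceback shows fitting.py reading its seven values from the fixed positions sys.argv[1] to sys.argv[7], so any extra entry placed in front of them shifts every value. The following is a minimal sketch of a position-independent lookup. It assumes, hypothetically, that the launcher may add entries before the script's own arguments, which this thread does not confirm. The helper name locate_user_args and the ".txt" anchor are illustrative and not part of fitting.py.

```python
def locate_user_args(argv, first_suffix=".txt", count=7):
    """Return `count` consecutive values starting at the first argv entry
    that ends with `first_suffix` (here: the config file name).

    Hypothetical helper: it avoids hard-coded indices such as sys.argv[7],
    which break as soon as extra entries precede the user values.
    """
    for index, value in enumerate(argv):
        if value.endswith(first_suffix):
            chunk = list(argv[index:index + count])
            if len(chunk) == count:
                return chunk
            break
    raise ValueError("could not find %d user arguments in %r" % (count, argv))


# Argument list as typed in the srun command above.
plain = ["special", "config-exp1.txt", "exp1.txt", "ProbGABAAB_EMS_GEPH_g.mod",
         "False", "True", "False", "3", "-mpi", "-python", "fitting.py"]

# Same list with two made-up entries in front (an assumption for illustration).
shifted = [plain[0], "extra-option", "extra-value"] + plain[1:]

# Fixed indexing picks different values; the lookup does not.
print(plain[7], shifted[7])
print(locate_user_args(plain)[6], locate_user_args(shifted)[6])
```

Printing the whole of sys.argv at the top of the script is an even simpler way to see which entry ends up at which index.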

clupascu commented 4 years ago

The folder I am working in Jureca is /p/home/jusers/lupascu1/jureca/testneuron+python3/. You can have a look there.

pramodk commented 4 years ago

@clupascu : I am pretty sure NEURON installation is working with Python3:

[kumbhar1@jrl05 ~]$ nrniv -python
NEURON -- VERSION 7.8.0-2-g92a208b+ HEAD (92a208b+) 2019-10-29
Duke, Yale, and the BlueBrain Project -- Copyright 1984-2018
See http://neuron.yale.edu/neuron/credits

loading membrane mechanisms from x86_64/.libs/libnrnmech.so
Additional mechanisms from files

>>>
>>> print "a"
  File "stdin", line 1
    print "a"
            ^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print("a")?
>>> print ("A")
A
>>>

We are not familiar with fitting.py, but I added:

    singletrace_number = int(singletrace_number)
    print("----->", singletrace_number)

and tested it as follows:

$ srun -p develbooster -A vsk25 -n 1 nrniv 'config-exp1.txt' 'exp1.txt' 'ProbGABAAB_EMS_GEPH_g.mod' False True False 3 -mpi -python fitting.py
srun: job 8109728 queued and waiting for resources
srun: job 8109728 has been allocated resources
NEURON -- VERSION 7.8.0-2-g92a208b+ HEAD (92a208b+) 2019-10-29
Duke, Yale, and the BlueBrain Project -- Copyright 1984-2018
See http://neuron.yale.edu/neuron/credits

Additional mechanisms from files
 netstims.mod ProbGABAAB_EMS_GEPH_g.mod
-----> 3

So I think the problem is somewhere in your environment or script.

clupascu commented 4 years ago

I made a little progress. As suggested, I replaced

srun ./x86_64/special 'config-exp1.txt' 'exp1.txt' 'ProbGABAAB_EMS_GEPH_g.mod' False True False 3 -mpi -python fitting.py

with

srun nrniv 'config-exp1.txt' 'exp1.txt' 'ProbGABAAB_EMS_GEPH_g.mod' False True False 3 -mpi -python fitting.py

and now I get this warning in the output file

Warning: detected user attempt to enable MPI, but MPI support was disabled at build time.

and in the error file

srun: error: jrc6369: tasks 24-31,33-47: Segmentation fault
srun: error: jrc6369: task 32: Terminated
srun: error: jrc6368: tasks 0-3,6-7,9-10,12,18,20-21,23: Terminated
srun: error: jrc6368: tasks 4-5,8,11,13-17,19,22: Segmentation fault
srun: Force Terminated job step 8109797.0

pramodk commented 4 years ago

@clupascu : I didn't mean that you should replace "srun ./x86_64/special" with "srun nrniv". I was just trying to run your example to see if NEURON works.

You still need to use "./x86_64/special" in order to get your local mod files included. nrniv only includes NEURON's default mod files.

clupascu commented 4 years ago

With "srun nrniv" I don't have the ValueError: invalid literal for int() with base 10: 'True' error anymore.

clupascu commented 4 years ago

@pramodk have you got some time to look into this issue?

clupascu commented 4 years ago

I have the same error also on PizDaint.

pramodk commented 4 years ago

@clupascu : I haven't looked into this yet. Can you point me to the directory on Piz-Daint? (It should be in scratch or another non-home directory that I can access.)

clupascu commented 4 years ago

@pramodk you can find the directory on Piz-Daint here /scratch/snx3000/bp000028/testneuron+python3

pramodk commented 4 years ago

@clupascu : I was looking at this with @jorblancoa and we couldn't easily find the issue. This is what we did:

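from neuron import h

pc = h.ParallelContext()


def f(arg):
    # Report which rank executes the job, then return the square.
    rank = int(pc.id())
    nhost = int(pc.nhost())
    print('I am %d of %d' % (rank, nhost))
    return arg * arg


# Workers wait here for submitted jobs; only the master continues.
pc.runworker()

s = 0
if pc.nhost() == 1:
    # Serial run: evaluate every job locally.
    for i in range(1, 21):
        s += f(i)
else:
    # Parallel run: submit the jobs, then collect the results.
    for i in range(1, 21):
        pc.submit(f, i)
    while pc.working():
        s += pc.pyret()
print(s)

# Tell the workers to stop, then exit NEURON.
pc.done()
h.quit()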

Running with NEURON and Python 3 just works fine:

module use /apps/hbp/ich002/hbp-spack-deployments/softwares/27-02-2020/install/modules/tcl/cray-cnl7-haswell
module load neuron/7.8.0b/intel-python3

bp000174@daint103:~/NRN-37> srun nrniv -python test_worker.py  -mpi
NEURON -- VERSION 7.8.0-2-g92a208b6+ HEAD (92a208b6+) 2019-10-29
Duke, Yale, and the BlueBrain Project -- Copyright 1984-2018
See http://neuron.yale.edu/neuron/credits

2870.0
I am 2 of 8
I am 2 of 8
I am 2 of 8
I am 2 of 8
I am 3 of 8
I am 3 of 8
I am 3 of 8
I am 3 of 8
I am 4 of 8
I am 4 of 8
I am 4 of 8
I am 5 of 8
I am 5 of 8
I am 6 of 8
I am 7 of 8
I am 1 of 8
I am 1 of 8
I am 1 of 8
I am 1 of 8
I am 1 of 8
numprocs=8

So we believe that there is no issue with neuron and python3 installation itself.

srun: error: nid00008: tasks 0,7: Segmentation fault (core dumped)

I added import fitness at the top of the fitting.py file:

import random
import csv
import math
....
import subprocess
import fitness

With this change the program runs a bit further, but I still see a segfault later during the execution. Could you check by adding import fitness near the top?

I suspect the issue is somewhere in the script (?) and only becomes visible with Python3.

Do you have neuron installed on your desktop where you can test this? Otherwise, as I am not entirely familiar with the code, I think one needs to test parts of the code and see which function is causing the segfault.

We can discuss this further tomorrow if required.

clupascu commented 4 years ago

@pramodk Can you please let me know which parallel version I should use instead of module load neuron/7.8.0b-serial-python3 on Jureca? It seems that if I use this parameter order

srun nrniv -python 'configA1.txt' 'expA1.txt' 'ProbGABAAB_EMS_GEPH_g.mod' False True False 3 fitting.py -mpi

Neuron is loaded (but, of course, it is loaded several times).

jorblancoa commented 4 years ago

Hi @clupascu If you load the latest modules (25/03) you can find a parallel neuron.

module use /p/project/cvsk25/software-deployment/HBP/jureca-booster/25-03-2020/modules/tcl/linux-centos7-haswell
module load neuron/7.8.0b

Let me know if you have any problems.

clupascu commented 4 years ago

@pramodk and @jorblancoa I modified my code (the segmentation fault was due to one function not available anymore in python3) and my code works perfectly now with the new modules from issue #533 on Jureca booster and PizDaint. Any news on the installation of the same modules on Galileo?

jorblancoa commented 4 years ago

Hi @clupascu Out of curiosity, what was the function not available in python3 causing the core dump? Regarding the deployment in Galileo, we are finishing latest validations in Daint and Jureca, and once everything is properly tested, we will deploy in Galileo.

clupascu commented 4 years ago

@jorblancoa the function that is no longer available in python3 and caused the core dump was file().
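For reference, the Python 2 built-in file() does not exist in Python 3; open() is the usual replacement. The actual call in fitting.py is not shown in this thread, so the sketch below only illustrates the general pattern. The helper names write_lines and read_lines are hypothetical.

```python
import builtins
import os
import tempfile


def write_lines(path, lines):
    """Write one string per line. Python 2 code might have used
    file(path, "w") here; in Python 3 use open() instead."""
    with open(path, "w") as handle:
        for line in lines:
            handle.write(line + "\n")


def read_lines(path):
    """Read the file back as a list of strings without trailing newlines."""
    with open(path) as handle:
        return [line.rstrip("\n") for line in handle]


# Python 3 no longer provides a built-in named "file".
print(hasattr(builtins, "file"))

# Round trip through a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "trace.txt")
    write_lines(target, ["0.1", "0.2"])
    restored = read_lines(target)
```

Using open() inside a with block also closes the file when the block ends, which matters when many MPI ranks write output at the same time.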

alex4200 commented 4 years ago

Hi, can you please tick the boxes in the top-most main comment to mark the cases that are done?

pramodk commented 4 years ago

As far as I know, all systems are up to date with neuron and python3. See #533.

clupascu commented 4 years ago

I tested neuron7.7 on PizDaint, Jureca and Galileo and everything works perfectly. I am closing this issue.

mmigliore commented 4 years ago

Loading the modules works from my account on ich002, but a user on the ich011 group (used by BSP use cases) sees:

bp000338@daint105:~/Heart_3D/WholeCellSimulation_gpurk30000_StimUp> module load daint-mc cray-python/3.6.5.7 PyExtensions/3.6.5.7-CrayGNU-19.10
bp000338@daint105:~/Heart_3D/WholeCellSimulation_gpurk30000_StimUp> module use /apps/hbp/ich002/hbp-spack-deployments/softwares/25-03-2020/install/modules/tcl/cray-cnl7-haswell
ModuleCmd_Use.c(240):ERROR:64: Directory '/apps/hbp/ich002/hbp-spack-deployments/softwares/25-03-2020/install/modules/tcl/cray-cnl7-haswell' not found
bp000338@daint105:~/Heart_3D/WholeCellSimulation_gpurk30000_StimUp>

Can you fix this?

On Tue, Apr 7, 2020 at 2:41 AM Pramod Kumbhar notifications@github.com wrote:

As far as I know, all systems are up to date with neuron and python3. See #533 https://github.com/cnr-ibf-pa/hbp-bsp-issues/issues/533.


pramodk commented 4 years ago

@mmigliore : Has user bp000338 used the modules in the past? i.e. can he access /apps/hbp/ich002?

$  ls -l /apps/hbp/ich002/

Note that we can change permissions on /apps/hbp/ich002/hbp-spack-deployments but not on the top-level /apps/hbp/ich002/.
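On POSIX file systems a user needs search (execute) permission on every parent directory to reach a nested path, so a restriction on a top-level directory hides everything below it, and a blocked path can then be reported as not found. A minimal sketch for finding the first component the current user cannot pass is shown below. The function name first_blocked_component is hypothetical, and the check relies on os.access, which reports permissions for the calling user only.

```python
import os
from pathlib import Path


def first_blocked_component(path):
    """Walk from the root down to `path` and return the first component
    that does not exist (or is hidden) or is a directory the current
    user cannot search. Return None if the whole path is reachable."""
    target = Path(os.path.abspath(path))
    for component in [*reversed(target.parents), target]:
        # Missing entry, or one hidden by a restricted parent.
        if not os.path.exists(component):
            return component
        # A directory without execute permission cannot be traversed.
        if os.path.isdir(component) and not os.access(component, os.X_OK):
            return component
    return None
```

Run as the affected user, first_blocked_component("/apps/hbp/ich002/hbp-spack-deployments") would show whether the top-level directory or a deeper one is the obstacle.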

mmigliore commented 4 years ago

@mmigliore : Has user bp000338 used the modules in the past?

No

i.e. can he access /apps/hbp/ich002?

No. This is going to be a general problem for many users not directly related to HBP. The simulation engines (the latest version of NEURON in this case) must be available for anybody.

~> ls -l /apps/hbp/ich002/
total 227834
drwxr-xr-x 3 bp000037 bp0 512 Oct 17 08:31 antonel
-rwxr-xr-x. 1 bp000030 bp0 2266 Jan 30 06:56 BlueConfig.test



pramodk commented 4 years ago

i.e. can he access /apps/hbp/ich002?

No. This is going to be a general problem for many users not directly related to HBP.

Ok, this is something new then.