nest / nest-simulator

The NEST simulator
http://www.nest-simulator.org
GNU General Public License v2.0

MyModule does not work on BlueGene #28

Closed heplesser closed 9 years ago

heplesser commented 9 years ago

User modules (MyModule) do not work on BlueGene at present. Loading them as dynamically linked modules fails, and so does building NEST with MyModule linked in. This is most likely because the MyModule build setup does not take BlueGene peculiarities into account.

heplesser commented 9 years ago

Workaround

Copy the code for the models from the user module into the models directory, add the files to models/Makefile.am, and add the corresponding #includes and register_model<>() and register_connector_model<>() lines to models/modelsmodule.cpp.

apeyser commented 9 years ago

Has this proposition been generally tested, or is it a generalization from one case?

heplesser commented 9 years ago

Only one user has tried to use NEST with his own module on BlueGene so far (to my knowledge), and has failed. The user could not load the module with Install, and when trying to compile NEST with that module, the build ended with the error message below. Note that libtool "translates" all .la libraries to .a files, while for libmymodule.la it injects libmymodule.so into the final compiler/linker call.

Note that the configure.ac for MyModule does not take into account BlueGene settings and static/shared library settings from the NEST configuration process. So it seems plausible to me that the problem lies with the all-too-simple configure.ac for MyModule.

gmake[1]: Entering directory `/homea/??????????/NEST/nest-2.6.0/bld/nest'
source='../../src/nest/main.cpp' object='main.o' libtool=no \
        DEPDIR=.deps depmode=xlc /bin/sh ../../src/depcomp \
        mpixlcxx_r -DHAVE_CONFIG_H -I. -I../../src/nest -I../libnestutil  -I../../src/libnestutil -I../../src/librandom -I../../src/sli -I../../src/nestkernel -I/bgsys/local/gsl/1.15_O3g/include     -O2 -I/bgsys/drivers/ppcfloor/arch/include -qsmp=omp  -c -o main.o ../../src/nest/main.cpp
source='../../src/nest/neststartup.cpp' object='neststartup.o' libtool=no \
        DEPDIR=.deps depmode=xlc /bin/sh ../../src/depcomp \
        mpixlcxx_r -DHAVE_CONFIG_H -I. -I../../src/nest -I../libnestutil  -I../../src/libnestutil -I../../src/librandom -I../../src/sli -I../../src/nestkernel -I/bgsys/local/gsl/1.15_O3g/include     -O2 -I/bgsys/drivers/ppcfloor/arch/include -qsmp=omp  -c -o neststartup.o ../../src/nest/neststartup.cpp
    1500-030: (I) INFORMATION: Processes::Processes(): Additional optimization may be attained by recompiling and specifying MAXMEM option with a value greater than 8192.
/bin/sh ../libtool  --tag=CXX   --mode=link mpixlcxx_r -O2 -I/bgsys/drivers/ppcfloor/arch/include -qsmp=omp  -export-dynamic   -o nest main.o neststartup.o /homea/??????????/NEST/nest-2.6.0/bld/install/lib/nest/libmymodule.la /homea/??????????/NEST/nest-2.6.0/bld/models/libmodelsmodule.la /homea/??????????/NEST/nest-2.6.0/bld/precise/libprecisemodule.la /homea/??????????/NEST/nest-2.6.0/bld/topology/libtopologymodule.la ../nestkernel/libnest.la ../librandom/librandom.la ../libnestutil/libnestutil.la ../sli/libsli.la -L/bgsys/local/gsl/1.15_O3g/lib -lgsl -lgslcblas -lm
libtool: link: mpixlcxx_r -O2 -I/bgsys/drivers/ppcfloor/arch/include -qsmp=omp -o nest main.o neststartup.o -Wl,--export-dynamic  /homea/??????????/NEST/nest-2.6.0/bld/install/lib/nest/libmymodule.so /homea/??????????/NEST/nest-2.6.0/bld/models/.libs/libmodelsmodule.a /homea/??????????/NEST/nest-2.6.0/bld/precise/.libs/libprecisemodule.a /homea/??????????/NEST/nest-2.6.0/bld/topology/.libs/libtopologymodule.a ../nestkernel/.libs/libnest.a ../librandom/.libs/librandom.a ../libnestutil/.libs/libnestutil.a ../sli/.libs/libsli.a -L/bgsys/local/gsl/1.15_O3g/lib /bgsys/local/gsl/1.15_O3g/lib/libgsl.a /bgsys/local/gsl/1.15_O3g/lib/libgslcblas.a -lm -qsmp=omp -Wl,-rpath -Wl,/homea/??????????/NEST/nest-2.6.0/bld/install/lib/nest -Wl,-rpath -Wl,/homea/??????????/NEST/nest-2.6.0/bld/install/lib/nest
/bgsys/drivers/ppcfloor/gnu-linux/powerpc64-bgq-linux/bin/ld: attempted static link of dynamic object `/homea/??????????/NEST/nest-2.6.0/bld/install/lib/nest/libmymodule.so'
gmake[1]: *** [nest] Error 1

I have obfuscated user-identifying information above.

heplesser commented 9 years ago

@apeyser I have worked on this for the better part of today and now need advice from someone who knows Juqueen better than me.

User modules can be added to NEST in two ways: as dynamically loaded modules ((mymodule) Install at runtime loads the module) or by linking in libmymodule.so at compile time. In the latter case, the module is ready for use as soon as NEST has started. On Linux and OSX, libtool handles this fine. By default, mymodule.{a,so} and libmymodule.{a,so} are built (dylib on OSX), installed to $prefix/install/lib/nest, and loaded or linked from there.

Trying to load the module at run time fails on Juqueen. The dynamic loader reports "file not found", although I am rather certain that I have set all paths correctly.

When trying to link the module (--with-modules=mymodule), linking the NEST executable fails, because libtool adds both libmymodule.so and libmymodule.a to the list of libraries to link, and the linker then complains that I am trying to link a dynamic object statically. Deleting the so file from the arguments allows one to link. PyNEST builds without problems thereafter---and it actually works; this is in all likelihood because the PyNEST link process manages to handle the so-file.

I then revised the configuration process so that only the static libmymodule.a would be built on BlueGene, and it links nicely (see the fix28... branch in my fork). Unfortunately, it does not work: the module is linked, but not registered with the NEST engine, and so NEST never knows that the module is there. The only way to initialise a static module without any dlopening is to turn it from a DynModule into a SLIModule (and I have come some way on this in my branch). Now, such modules could only be linked statically, and we don't want that in the general case. Thus, the only way out would be a good deal of IS_BLUEGENE or BUILD_STATIC_MODULE conditionals in the code. That is doable, but rather ugly.

So I hope I have overlooked some things about shared libraries on BlueGene machines so one could get dynamically loaded modules to work.

tammoippen commented 9 years ago

From my work on Juqueen I know that NEST compiles as a static application with no dynamic loading of libraries possible. Further, dynamically linking libraries on Juqueen is discouraged altogether, because every compute node has to load the library over the network, which increases the total execution time. So I think the way to go is to use --with-modules=mymodule and statically link the module.

apeyser commented 9 years ago

The Juqueen linkers and compilers will compile static rather than dynamic versions even with normal dynamic linker flags --- you have to really insist on compiling dynamically. I'll post the flags here after looking them up, since I had to do this to get PyNEST to work.

It's discouraged as a waste of cycles to load dynamic modules --- but since this isn't the cycle bottleneck, as long as it's not ridiculous (as it was with Python < 3.4), I'd really first worry about the more outrageous waste of cycles.

apeyser commented 9 years ago

-Xlinker -export-dynamic -dynamic

are needed for the linker to build dynamic libraries. How to convince NEST to use this without blowing up the configure script is an exercise for the student.

Also, go for gcc rather than the ibm compilers, given the extra pain there.

heplesser commented 9 years ago

@apeyser Thanks for your help. Actually, I found a rather painless way to build and link modules as proper static libraries on BlueGene in the end. I will file a PR shortly.