neperfepx / neper

Polycrystal generation and meshing
http://neper.info
GNU General Public License v3.0

Half of the testsuite and "make install" fail with neper 4.4.0 #402

Closed samfux84 closed 2 years ago

samfux84 commented 2 years ago

Describe the bug

I would like to report two problems.

1.) More than half of the testsuite fails:

45% tests passed, 170 tests failed out of 307

2.) make install fails:

-- Install configuration: "Release"
-- Installing: /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/bin/neper
CMake Error at cmake_install.cmake:50 (file):
  file RPATH_CHANGE could not write new RPATH:

  to the file:

    /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/bin/neper

  Error opening file for update.

make: *** [Makefile:77: install] Error 1

I am trying to compile neper 4.4.0 with:

Configuring the build with ccmake works fine and compilation does not give any errors. I use CMAKE_INSTALL_PREFIX=/cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64 to install the software in user space (nothing should go to /usr, because this would require root privileges to write there).

After running "make", I have to edit the cmake_install.cmake file to change the path where the neper-completion.bash file gets stored, as I don't want the installation to write anything to /usr (I explicitly specified CMAKE_INSTALL_PREFIX):

if("x${CMAKE_INSTALL_COMPONENT}x" STREQUAL "xUnspecifiedx" OR NOT CMAKE_INSTALL_COMPONENT)
  list(APPEND CMAKE_ABSOLUTE_DESTINATION_FILES
   "/cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/share/bash-completion/completions/neper")
  if(CMAKE_WARN_ON_ABSOLUTE_INSTALL_DESTINATION)
    message(WARNING "ABSOLUTE path INSTALL DESTINATION : ${CMAKE_ABSOLUTE_DESTINATION_FILES}")
  endif()
  if(CMAKE_ERROR_ON_ABSOLUTE_INSTALL_DESTINATION)
    message(FATAL_ERROR "ABSOLUTE path INSTALL DESTINATION forbidden (by caller): ${CMAKE_ABSOLUTE_DESTINATION_FILES}")
  endif()
file(INSTALL DESTINATION "/cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/share/bash-completion/completions" TYPE FILE RENAME "neper" FILES "/scratch/207525591.tmpdir/neper-4.4.0/src/contrib/bashcomp/neper-completion.bash")
endif()

Then I run the testsuite to check the installation. For tests 1-113, only 1 test fails:

65/307 Test #65: T/morpho_tocta ...................***Failed 0.03 sec

Command: "/cluster/apps/gcc-6.3.0/cmake-3.16.5-63ad2dwk4yik57xicg7lx5kraihw6ybh/bin/cmake" "-Dtest_prog=/scratch/207525591.tmpdir/neper-4.4.0/src/build/neper" "-Dtest_mode=Normal" "-P" "/scratch/207525591.tmpdir/neper-4.4.0/src/../tests/T/morpho_tocta/test.cmake"
Directory: /scratch/207525591.tmpdir/neper-4.4.0/src/../tests/T/morpho_tocta
"T/morpho_tocta" start time: Mar 08 09:15 CET
Output:
----------------------------------------------------------

========================    N   e   p   e   r    =======================
Info   : A software package for polycrystal generation and meshing.
Info   : Version 4.4.0
Info   : Built with: gsl|muparser|opengjk|openmp|nlopt|libscotch (full)
Info   : Running on 1 threads.
Info   : <https://neper.info>
Info   : Copyright (C) 2003-2022, and GNU GPL'd, by Romain Quey.
Info   : Ignoring initialization file.
Info   : ---------------------------------------------------------------
Info   : MODULE  -T loaded with arguments:
Info   : [ini file] (none)
Info   : [com line] -n from_morpho -morpho tocta(2) -o test
Info   : ---------------------------------------------------------------
Info   : Reading input data...
Info   : Creating domain...
Info   : Creating tessellation...
Info   :   - Setting seeds...
Info   :   - Generating crystal orientations...
Info   :   - Running tessellation...
Info   : Writing results...
Info   :     [o] Writing file `test.tess'...
Info   :     [o] Wrote file `test.tess'.
Info   : Elapsed time: 0.007 secs.
========================================================================

CMake Error at /scratch/207525591.tmpdir/neper-4.4.0/tests/test.cmake:32 (message):
  Test failed - files differ
Call Stack (most recent call first):
  test.cmake:6 (include)

<end of output>
Test time =   0.03 sec
----------------------------------------------------------
Test Failed.
"T/morpho_tocta" end time: Mar 08 09:15 CET
"T/morpho_tocta" time elapsed: 00:00:00
----------------------------------------------------------

But then the majority of tests 114-307 fail and there are two predominant error messages.

30 tests fail with

100% (0.38|0.75/100%| 0%| 0%)Info   :   - Fixing 2D-mesh pinches...
Info   :   - 3D meshing... Error  : Wrong mesh dimension: -1!
CMake Error at /scratch/207525591.tmpdir/neper-4.4.0/tests/test.cmake:8 (message):
  Test failed
Call Stack (most recent call first):
  test.cmake:6 (include)

and 134 tests fail with

Info   :     [o] Writing file `test.png'...
Error  :     > File `test.png' could not be generated!
CMake Error at /scratch/207525591.tmpdir/neper-4.4.0/tests/test.cmake:8 (message):
  Test failed
Call Stack (most recent call first):
  test.cmake:6 (include)

After running the testsuite, I still wanted to test if "make install" works, but it fails with

Install the project...
/cluster/apps/gcc-6.3.0/cmake-3.16.5-63ad2dwk4yik57xicg7lx5kraihw6ybh/bin/cmake -P cmake_install.cmake
-- Install configuration: "Release"
-- Installing: /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/bin/neper
CMake Error at cmake_install.cmake:50 (file):
  file RPATH_CHANGE could not write new RPATH:

  to the file:

    /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/bin/neper

  Error opening file for update.

make: *** [Makefile:77: install] Error 1

Are some of these issues known, and is there any workaround to resolve them?

Any help is appreciated.

Best regards

Sam

rquey commented 2 years ago

As for the failed tests, I think that you are missing POV-Ray and that, for some reason, your Gmsh isn't working properly. Neper does not use a parallel version of Gmsh, so this may be the problem?

Your report on the make install issue is interesting, but let's try to solve the dependency issue first.
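
A quick way to confirm the dependency diagnosis above before rerunning the testsuite is to check that the external programs are reachable from the shell. The following is a minimal sketch; the function name `check_tools` is hypothetical, and the executable names `povray` and `gmsh` are assumed to match how the programs are installed on your system.

```shell
# Report which of the given executables cannot be found on $PATH.
# Prints one "missing: NAME" line per absent tool and returns non-zero
# if at least one of them is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# POV-Ray and Gmsh are the two external programs named above.
check_tools povray gmsh || echo "fix PATH (e.g. load the modules) before running the tests"
```

Finding a tool on the path only shows that it can be located, not that it runs correctly on the current node.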

samfux84 commented 2 years ago

@rquey Thank you for your reply. You are right: in the first run of the testsuite, povray was not in the path. I will build Neper 4.4.0 again with the same dependencies, make sure that povray is in the $PATH, and then run the testsuite again and report the outcome here.

samfux84 commented 2 years ago

I have rerun the testsuite with povray in the $PATH and on the first try still most tests were failing. Then I noticed that there was an issue with my povray installation from source. Accidentally -march=native was used and the software was then run on a node that was older than the one I compiled povray on. After fixing this issue, I have rebuilt povray such that it now works fine. With this problem fixed I tried again to rerun the testsuite. Looks already much better:

88% tests passed, 36 tests failed out of 307

Test 65 is still failing, as are the 30 tests with the mesh dimension error. But the 134 tests that failed before with the png error have now all passed, except 5, which also fail with a "files differ" error.

As a next step, I will rebuild a serial gmsh 4.4.1 version and build Neper 4.4.0 again and then run the testsuite again and report the outcome here.

Thank you for your help, this is appreciated a lot.

Best regards

Sam

samfux84 commented 2 years ago

I rebuilt Neper 4.4.0 with a serial gmsh 4.4.1 version that I just built from source, and then reran the testsuite. Same result as before.

Gmsh has quite a few build options:

I built it with:

Is support for FLTK, PETSc, Tetgen, Netgen or SLEPc required for using GMSH with Neper?

The tests fail in the 3D meshing part:

118/307 Testing: M/clmin
118/307 Test: M/clmin
Command: "/cluster/apps/gcc-6.3.0/cmake-3.16.5-63ad2dwk4yik57xicg7lx5kraihw6ybh/bin/cmake" "-Dtest_prog=/scratch/207579636.tmpdir/neper-4.4.0/src/build/neper" "-Dtest_mode=Normal" "-P" "/scratch/207579636.tmpdir/neper-4.4.0/src/../tests/M/clmin/test.cmake"
Directory: /scratch/207579636.tmpdir/neper-4.4.0/src/../tests/M/clmin
"M/clmin" start time: Mar 08 14:57 CET
Output:
----------------------------------------------------------

========================    N   e   p   e   r    =======================
Info   : A software package for polycrystal generation and meshing.
Info   : Version 4.4.0
Info   : Built with: gsl|muparser|opengjk|openmp|nlopt|libscotch (full)
Info   : Running on 1 threads.
Info   : <https://neper.info>
Info   : Copyright (C) 2003-2022, and GNU GPL'd, by Romain Quey.
Info   : Ignoring initialization file.
Info   : ---------------------------------------------------------------
Info   : MODULE  -M loaded with arguments:
Info   : [ini file] (none)
Info   : [com line] n2-id1.tess -mesh3dclreps 1 -clmin 0.5 -o test
Info   : ---------------------------------------------------------------
Info   : Reading input data...
Info   :   - Reading arguments...
Info   : Loading input data...
Info   :   - Loading tessellation...
Info   :     [i] Parsing file `n2-id1.tess'...
Info   :     [i] Parsed file `n2-id1.tess'.
Info   : Meshing...
Info   :   - Preparing... (cl = 0.3969)
  0%
  8%
 17%
 25%
 33%
 42%
 50%
 58%
 67%
 75%
 83%
 92%
100%Info   :   - 0D meshing...
  0%
  8%
 15%
 23%
 31%
 38%
 46%
 54%
 62%
 69%
 77%
 85%
 92%
100%
Info   :   - 1D meshing...
  0%
  5%
  9%
 14%
 18%
 23%
 27%
 32%
 36%
 41%
 45%
 50%
 55%
 59%
 64%
 68%
 73%
 77%
 82%
 86%
 91%
 95%
100%Info   :   - 2D meshing...
  0% (0|0/ 0%| 0%| 0%)
  8% (0.89|0.89/100%| 0%| 0%)
 17% (0.89|0.89/100%| 0%| 0%)
 25% (0.83|0.87/100%| 0%| 0%)
 33% (0.83|0.88/100%| 0%| 0%)
 42% (0.83|0.88/100%| 0%| 0%)
 50% (0.83|0.89/100%| 0%| 0%)
 58% (0.83|0.89/100%| 0%| 0%)
 67% (0.83|0.89/100%| 0%| 0%)
 75% (0.83|0.88/89%| 0%|11%)
 83% (0.83|0.89/90%| 0%|10%)
 92% (0.44|0.84/91%| 0%| 9%)
100% (0.44|0.85/92%| 0%| 8%)Info   :   - Fixing 2D-mesh pinches...
Info   :   - 3D meshing... Error  : Wrong mesh dimension: -1!
CMake Error at /scratch/207579636.tmpdir/neper-4.4.0/tests/test.cmake:8 (message):
  Test failed
Call Stack (most recent call first):
  test.cmake:6 (include)

<end of output>
Test time =   2.59 sec
----------------------------------------------------------
Test Failed.
"M/clmin" end time: Mar 08 14:57 CET
"M/clmin" time elapsed: 00:00:02
----------------------------------------------------------

I am not a GMSH user and in the GMSH documentation it is difficult to find information about which 3D meshing algorithms need which particular dependency.

Best regards

Sam

rquey commented 2 years ago

Yes, you need ENABLE_FLTK and ENABLE_NETGEN set to ON.

(On second thought, ENABLE_FLTK is not strictly needed (it is only needed for the GUI), but ENABLE_NETGEN is.)

samfux84 commented 2 years ago

Thank you for this valuable information. I already have some FLTK 1.3.3 at hand and building netgen should not be too difficult. I will get back to you once I did all the compilation work and can run the testsuite again.

rquey commented 2 years ago

I am under the impression that Gmsh would use the Netgen that is in its contrib directory. What if you just switch ENABLE_NETGEN on / configure and compile again?
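
To see whether an existing Gmsh build already includes Netgen, one could search the build options it reports. This is only a sketch: the helper name `has_build_option` is hypothetical, and it assumes that `gmsh -info` prints the build options as text, which may differ between Gmsh versions.

```shell
# Succeeds if the text on stdin mentions the given build option
# (case-insensitive, fixed-string match).
has_build_option() {
    grep -qiF -- "$1"
}

# Assumption: `gmsh -info` lists the build options; adjust the command if
# your Gmsh version reports them differently.
if command -v gmsh >/dev/null 2>&1; then
    if gmsh -info 2>&1 | has_build_option netgen; then
        echo "Gmsh reports Netgen support"
    else
        echo "Gmsh does not list Netgen; consider rebuilding with -DENABLE_NETGEN=ON"
    fi
else
    echo "gmsh not found on PATH"
fi
```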

Please keep me posted: version 4.4.1 is in the starting blocks; I just need to know if I should do something related to this issue.

samfux84 commented 2 years ago

Just tested your suggestion with -DENABLE_NETGEN. Neper tests are still failing, but now the

Info : - 3D meshing... Error : Wrong mesh dimension: -1!

errors are gone and the same tests now only fail with "files differ", so your suggestions actually worked.

I need to stop for today, but I will continue to work on this tomorrow and keep you posted.

samfux84 commented 2 years ago

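Some things to consider regarding neper 4.4.1:

Please consider changing cmake_install.cmake which is created in the "build" such that the bash completion files are installed to CMAKE_INSTALL_PREFIX/share instead of /usr/share when CMAKE_INSTALL_PREFIX is defined

The problem with the RPATH change most likely stems from installing the neper binary with permission 555 instead of 755. This happened even though I have explicitly set the umask to 0022:

CMake Error at cmake_install.cmake:50 (file):
  file RPATH_CHANGE could not write new RPATH:

  to the file:

    /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/bin/neper

  Error opening file for update.

make: *** [Makefile:77: install] Error 1

[sfux@eu-ms-001-01 build]$ ls -ltr /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/bin/neper
-r-xr-xr-x 1 sfux ID-HPC-APPS 3071576 Mar  8 17:14 /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/bin/neper

Make install does not seem to install the scotch libraries from the contrib directory, but I will have to double check this again once I could somehow work around the RPATH_CHANGE problem

rquey commented 2 years ago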

Some things to consider regarding neper 4.4.1:

Please consider changing cmake_install.cmake which is created in the "build" such that the bash completion files are installed to CMAKE_INSTALL_PREFIX/share instead of /usr/share when CMAKE_INSTALL_PREFIX is defined

I cannot do this (because CMAKE_INSTALL_PREFIX points to /usr/local, not /usr), but I have created a new variable, CMAKE_INSTALL_PREFIX_SHARE, that you can modify (default /usr/share).

The problem with the RPATH change most likely stems from installing the neper binary with permission 555 instead of 755. This happened even though I have explicitly set the umask to 0022:

[...]

I can do this, but is setting to 755 a common practice?

Make install does not seem to install the scotch libraries from the contrib directory, but I will have to double check this again once I could somehow work around the RPATH_CHANGE problem

This doesn't need to be, afaik.

Can you give a try to CMakeLists.txt (to put in src/), which should fix all this?

samfux84 commented 2 years ago

Thank you very much for your excellent support. I have tested the CMakeLists.txt that you provided and it fixes the issue with the bash completion file (thank you for introducing the new CMake variable). This is important on shared systems like an HPC cluster, where software is often installed in user space by a technical user that does not have root privileges.

Now the make install step also works:

Install the project...
/cluster/apps/gcc-6.3.0/cmake-3.16.5-63ad2dwk4yik57xicg7lx5kraihw6ybh/bin/cmake -P cmake_install.cmake
-- Install configuration: "Release"
-- Installing: /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/bin/neper
-- Set runtime path of "/cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/bin/neper" to ""
-- Installing: /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/share/bash-completion/completions/neper
-- Installing: /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/lib64/pkgconfig/nlopt.pc
-- Installing: /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/include/nlopt.h
-- Installing: /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/include/nlopt.hpp
-- Installing: /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/include/nlopt.f
-- Installing: /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/share/man/man3/nlopt.3
-- Installing: /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/share/man/man3/nlopt_minimize.3
-- Installing: /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/share/man/man3/nlopt_minimize_constrained.3
[sfux@eu-ms-003-25 build]$

For single user systems, the permissions probably don't matter that much as people often install software as root. On shared systems permission 755 for executables is quite common. If you prefer to keep the permissions as before, then this is perfectly fine. With your new CMakeLists.txt I can check where this is done and manually add it to my build in case you don't include this in the new release.
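
The difference between the two permission modes can be shown on a scratch file without touching a real installation. This is a sketch assuming GNU coreutils (`stat -c`); the file named `neper` here is only a stand-in created in a temporary directory.

```shell
# Show on a scratch file how mode 555 differs from 755.
# No real installation is touched.
tmpdir=$(mktemp -d)
bin="$tmpdir/neper"          # stand-in for the installed binary
printf 'dummy\n' > "$bin"

chmod 555 "$bin"             # r-xr-xr-x: no write bit, not even for the owner
mode_before=$(stat -c '%a' "$bin")

chmod 755 "$bin"             # rwxr-xr-x: owner may update the file in place
mode_after=$(stat -c '%a' "$bin")

echo "before: $mode_before, after: $mode_after"
rm -rf "$tmpdir"
```

With mode 555 the owner has no write bit, which is consistent with the "Error opening file for update" message when the RPATH is rewritten.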

Regarding my last point: people often build software in a staged way. The source code is downloaded to a temporary storage location, and all the important parts are usually installed with the "make install" command to the location CMAKE_INSTALL_PREFIX, such that afterwards the source directory on the temporary storage location can be deleted.

As you can see from the excerpt of the logs for the "make install" step above, the neper binary is installed, some header files are copied to the include directory and some files are installed in the share directory, but almost nothing is installed in the lib64 directory.

When I now run an ldd command on the neper binary, then it won't find the scotch library and the nlopt library, as those are in the contrib directory and were not installed in the lib64 subdirectory of the installation directory:


[sfux@eu-ms-003-25 x86_64]$ pwd
/cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64
[sfux@eu-ms-003-25 x86_64]$ module list

Currently Loaded Modules:
  1) StdEnv   2) gcc/6.3.0   3) gsl/2.6   4) gmsh/4.4.1   5) povray/3.7.0.8   6) eth_proxy   7) cmake/3.16.5

[sfux@eu-ms-003-25 x86_64]$ ldd bin/neper
        linux-vdso.so.1 =>  (0x00007fff7ef4e000)
        libgsl.so.25 => /cluster/apps/gcc-6.3.0/gsl-2.6-3vydnc2j2ntjzjipu3hypkcioa7trgdt/lib/libgsl.so.25 (0x00002b0b54b13000)
        libgslcblas.so.0 => /cluster/apps/gcc-6.3.0/gsl-2.6-3vydnc2j2ntjzjipu3hypkcioa7trgdt/lib/libgslcblas.so.0 (0x00002b0b54ff8000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00002b0b55239000)
        libm.so.6 => /lib64/libm.so.6 (0x00002b0b55455000)
        libscotch.so => not found
        libscotcherr.so => not found
        libscotcherrexit.so => not found
        libnlopt.so.0 => not found
        libstdc++.so.6 => /cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64/libstdc++.so.6 (0x00002b0b55757000)
        libgomp.so.1 => /cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64/libgomp.so.1 (0x00002b0b55ad8000)
        libgcc_s.so.1 => /cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64/libgcc_s.so.1 (0x00002b0b55d06000)
        libc.so.6 => /lib64/libc.so.6 (0x00002b0b55f1d000)
        /lib64/ld-linux-x86-64.so.2 (0x00002b0b548ef000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00002b0b562eb000)
[sfux@eu-ms-003-25 x86_64]$

Since in a staged build, the source code directory is removed after building the software, it would be important to also copy the libraries that were built in the contrib directory to the lib64 subdirectory of the installation directory.

[sfux@eu-ms-003-25 x86_64]$ cd lib64/
[sfux@eu-ms-003-25 lib64]$ ls
pkgconfig
[sfux@eu-ms-003-25 lib64]$ cp $TMPDIR/neper-4.4.0/src/build/contrib/nlopt/*.so* .
[sfux@eu-ms-003-25 lib64]$ cp $TMPDIR/neper-4.4.0/src/build/contrib/scotch/*.so* .
[sfux@eu-ms-003-25 lib64]$ export LD_LIBRARY_PATH=/cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/lib64:$LD_LIBRARY_PATH
[sfux@eu-ms-003-25 lib64]$ ldd ../bin/neper
        linux-vdso.so.1 =>  (0x00007ffcda3c6000)
        libgsl.so.25 => /cluster/apps/gcc-6.3.0/gsl-2.6-3vydnc2j2ntjzjipu3hypkcioa7trgdt/lib/libgsl.so.25 (0x00002ad32a899000)
        libgslcblas.so.0 => /cluster/apps/gcc-6.3.0/gsl-2.6-3vydnc2j2ntjzjipu3hypkcioa7trgdt/lib/libgslcblas.so.0 (0x00002ad32ad7e000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00002ad32afbf000)
        libm.so.6 => /lib64/libm.so.6 (0x00002ad32b1db000)
        libscotch.so => /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/lib64/libscotch.so (0x00002ad32b4dd000)
        libscotcherr.so => /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/lib64/libscotcherr.so (0x00002ad32b77d000)
        libscotcherrexit.so => /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/lib64/libscotcherrexit.so (0x00002ad32b97f000)
        libnlopt.so.0 => /cluster/apps/nss/gcc-6.3.0/neper/4.4.0/x86_64/lib64/libnlopt.so.0 (0x00002ad32bb81000)
        libstdc++.so.6 => /cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64/libstdc++.so.6 (0x00002ad32be25000)
        libgomp.so.1 => /cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64/libgomp.so.1 (0x00002ad32c1a6000)
        libgcc_s.so.1 => /cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64/libgcc_s.so.1 (0x00002ad32c3d4000)
        libc.so.6 => /lib64/libc.so.6 (0x00002ad32c5eb000)
        /lib64/ld-linux-x86-64.so.2 (0x00002ad32a675000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00002ad32c9b9000)
[sfux@eu-ms-003-25 lib64]$

After copying the missing libraries and adding the lib64 directory to $LD_LIBRARY_PATH (this step will be done in the neper/4.4.0 module that I will create for the installation), the neper binary finds all required libraries.
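
The manual inspection of the ldd output above can be reduced to a small filter that lists only the unresolved libraries. This is a minimal sketch; the function name `missing_libs` is hypothetical, and it reads ldd output on stdin so that it can be tried without a real binary.

```shell
# Print the names of shared libraries that ldd reports as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Sample lines copied from the ldd output above.
missing_libs <<'EOF'
        libm.so.6 => /lib64/libm.so.6 (0x00002b0b55455000)
        libscotch.so => not found
        libnlopt.so.0 => not found
EOF
```

On an installed tree the same filter would be used as `ldd bin/neper | missing_libs`; empty output means every library was resolved.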

The neper 4.4.0 installation on the HPC cluster of our university will from now on be available to all of our 3200 cluster users.

Again thank you very much for being that responsive and for all the help with building neper. This is appreciated a lot.

Best regards

Sam

samfux84 commented 2 years ago

Just a last question regarding the 36 failed tests, which state

"files differ"

Is there a way to check how much the difference between the actual test and the reference result is?

rquey commented 2 years ago

See inserts below.

Thank you very much for your excellent support. I have tested the CMakeLists.txt that you provided and it fixes the issue with the bash completion file (thank you for introducing the new CMake variable). This is important on shared systems like an HPC cluster, where software is often installed in user space by a technical user that does not have root privileges.

Okay. I've just renamed it to CMAKE_INSTALL_PREFIX_COMPLETION.

Now the make install step also works:

[...]

For single user systems, the permissions probably don't matter that much as people often install software as root. On shared systems permission 755 for executables is quite common. If you prefer to keep the permissions as before, then this is perfectly fine. With your new CMakeLists.txt I can check where this is done and manually add it to my build in case you don't include this in the new release.

Thanks. I'll use 755.

Regarding my last point: people often build software in a staged way. The source code is downloaded to a temporary storage location, and all the important parts are usually installed with the "make install" command to the location CMAKE_INSTALL_PREFIX, such that afterwards the source directory on the temporary storage location can be deleted.

As you can see from the excerpt of the logs for the "make install" step above, the neper binary is installed, some header files are copied to the include directory and some files are installed in the share directory, but almost nothing is installed in the lib64 directory.

When I now run an ldd command on the neper binary, then it won't find the scotch library and the nlopt library, as those are in the contrib directory and were not installed in the lib64 subdirectory of the installation directory:


[...]

Since in a staged build, the source code directory is removed after building the software, it would be important to also copy the libraries that were built in the contrib directory to the lib64 subdirectory of the installation directory.

[...]

After copying the missing libraries and adding the lib64 directory to $LD_LIBRARY_PATH (this step will be done in the neper/4.4.0 module that I will create for the installation), the neper binary finds all required libraries.

Thanks. Different things here. Neper includes its own versions of nlopt and scotch, but will use the system versions if they are available. From your reports, I think that Neper is using the built-in versions. In that case, the intended behavior is that Neper compiles/links its built-in nlopt and scotch as static libraries (so ldd neper will not show nlopt and scotch), and that no header or library files are copied to system (or other) locations upon make install. If system versions are used, then Neper just uses the system versions and nothing needs to be copied either. (We did this to ease the installation process.)

I have modified the CMakeLists.txt files accordingly, as available in branch fix_cmake. It works as expected on my computer. Would you like to give it a try?

The neper 4.4.0 installation on the HPC cluster of our university will from now on be available to all of our 3200 cluster users.

Install 4.4.1 when it's out! :)

Again thank you very much for being that responsive and for all the help with building neper. This is appreciated a lot.

Best regards

Sam

rquey commented 2 years ago

Just a last question regarding the 36 failed tests, which state

"files differ"

Is there a way to check how much the difference between the actual test and the reference result is?

A test*.bak file is created when a test fails. You could compare it to the corresponding ref* file (in the test directory). It may simply be that you are using a Gmsh version lower than the one used to build the reference test files. You can also run the tests in Minimal mode (cmake -DBUILD_TESTING_MODE=Minimal ..), which will only check if Neper runs successfully (no file check).
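
The comparison suggested above could be scripted roughly as follows. This is a sketch under a stated assumption: the reference for a file named `test.EXT.bak` is taken to be `ref.EXT` in the same directory, which is a guess at the naming and should be adjusted to the actual layout of the tests tree. The function name `compare_bak` is hypothetical.

```shell
# For every test*.bak in a test directory, diff it against its reference file.
# Assumption: the reference for "test.EXT.bak" is "ref.EXT" in the same
# directory; change the mapping below if your tree is laid out differently.
compare_bak() {
    dir=$1
    for bak in "$dir"/test*.bak; do
        [ -e "$bak" ] || continue            # no .bak files: nothing to do
        name=$(basename "$bak" .bak)         # test.tess.bak -> test.tess
        ref="$dir/ref${name#test}"           # test.tess     -> ref.tess
        if [ -f "$ref" ]; then
            echo "== $ref vs $bak"
            diff "$ref" "$bak" | head -n 20  # show only the first differences
        else
            echo "no reference found for $bak"
        fi
    done
}
```

It would be called with a single test directory, for example `compare_bak tests/T/morpho_tocta`. Lines starting with `<` come from the reference and lines starting with `>` from the failed run.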

Which tests are failing?

samfux84 commented 2 years ago

I just reran the testsuite and the following tests are failing (I think it is due to the different GMSH version that I use):

88% tests passed, 37 tests failed out of 307

Total Test time (real) = 398.83 sec

The following tests FAILED:
         65 - T/morpho_tocta (Failed)
        114 - M/cl (Failed)
        115 - M/cl_expr (Failed)
        118 - M/clmin (Failed)
        119 - M/clratio (Failed)
        120 - M/dim (Failed)
        121 - M/dim2 (Failed)
        122 - M/dim_expr (Failed)
        123 - M/faset (Failed)
        126 - M/interface (Failed)
        127 - M/interface2 (Failed)
        128 - M/interface3 (Failed)
        129 - M/mesh2dalgo_dela (Failed)
        130 - M/mesh2dalgo_fron (Failed)
        131 - M/mesh2dalgo_mead (Failed)
        132 - M/mesh2dalgo_netg (Failed)
        133 - M/mesh3dalgo_netggmne (Failed)
        134 - M/mesh3dalgo_netggmsh (Failed)
        135 - M/mesh3dalgo_netgnetg (Failed)
        136 - M/meshing (Failed)
        138 - M/meshqualdisexpr (Failed)
        139 - M/meshqualexpr (Failed)
        140 - M/meshqualmin (Failed)
        141 - M/nset1 (Failed)
        142 - M/nset2 (Failed)
        143 - M/nset3 (Failed)
        144 - M/order (Failed)
        145 - M/order_dim2 (Failed)
        149 - M/part2 (Failed)
        150 - M/part_dim2 (Failed)
        151 - M/pl (Failed)
        152 - M/rcl (Failed)
        153 - M/rcl_expr (Failed)
        156 - M/remesh1 (Failed)
        157 - M/remesh2 (Failed)
        158 - M/remesh3 (Failed)
        170 - M/tesr_dim2 (Failed)
Errors while running CTest
make: *** [Makefile:155: test] Error 8

All those tests fail with "files differ":

[sfux@eu-ms-001-01 build]$ grep failed Testing/Temporary/LastTest.log
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
  Test failed - files differ
[sfux@eu-ms-001-01 build]$
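
Instead of reading through the repeated grep output, the failure reasons in the log can be tallied. This is a minimal sketch; the function name `failure_summary` is hypothetical, and it only looks for lines containing "Test failed", as in the log excerpts in this thread.

```shell
# Count how often each "Test failed..." reason occurs in a CTest log.
# Reads the log on stdin and prints "COUNT REASON", most frequent first.
failure_summary() {
    grep 'Test failed' | sed 's/^[[:space:]]*//' | sort | uniq -c | sort -rn |
        awk '{ n = $1; $1 = ""; sub(/^ +/, ""); print n, $0 }'
}

# Small sample in the format of the log lines above.
failure_summary <<'EOF'
  Test failed - files differ
  Test failed
  Test failed - files differ
EOF
```

On a real build directory it would be run as `failure_summary < Testing/Temporary/LastTest.log`.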

rquey commented 2 years ago

Thanks. I'm curious about T/morpho_tocta. Could you provide the *.bak file?

samfux84 commented 2 years ago

Please find attached the test.tess.bak file for the T/morpho_tocta test. As I had to download it first to my Windows laptop, I had to change the file ending to .txt so that I can upload it here (the file filter did not allow .bak files).

test.tess.bak.txt

rquey commented 2 years ago

This will be fixed in 4.4.1, to appear soon.

Thanks again for your feedback. It's much appreciated.