sourceryinstitute / OpenCoarrays

A parallel application binary interface for Fortran 2018 compilers.
http://www.opencoarrays.org
BSD 3-Clause "New" or "Revised" License

Regexp not found during unit tests #18

Closed milancurcic closed 8 years ago

milancurcic commented 8 years ago

Hi Damian et al.,

I built opencoarrays with gcc-5.2.0, openmpi-1.10.0 and cmake 3.4.0-rc2. Running ctest yields:

Test project /home/milan/opencoarrays/build
      Start  1: initialize_mpi
 1/22 Test  #1: initialize_mpi ...................   Passed    0.06 sec
      Start  2: register
 2/22 Test  #2: register .........................   Passed    0.05 sec
      Start  3: register_rename_me
 3/22 Test  #3: register_rename_me ...............   Passed    0.05 sec
      Start  4: register_rename_me_too
 4/22 Test  #4: register_rename_me_too ...........   Passed    0.05 sec
      Start  5: allocate_as_barrier
 5/22 Test  #5: allocate_as_barrier ..............   Passed    1.05 sec
      Start  6: allocate_as_barrier_proc
 6/22 Test  #6: allocate_as_barrier_proc .........   Passed    1.05 sec
      Start  7: get_array
 7/22 Test  #7: get_array ........................   Passed    0.11 sec
      Start  8: send_array
 8/22 Test  #8: send_array .......................   Passed    0.14 sec
      Start  9: get_with_offset_1d
 9/22 Test  #9: get_with_offset_1d ...............   Passed    0.05 sec
      Start 10: whole_get_array
10/22 Test #10: whole_get_array ..................   Passed    0.05 sec
      Start 11: strided_get
11/22 Test #11: strided_get ......................   Passed    0.05 sec
      Start 12: co_sum
12/22 Test #12: co_sum ...........................   Passed    0.07 sec
      Start 13: co_broadcast
13/22 Test #13: co_broadcast .....................   Passed    0.07 sec
      Start 14: co_min
14/22 Test #14: co_min ...........................   Passed    0.07 sec
      Start 15: co_max
15/22 Test #15: co_max ...........................   Passed    0.07 sec
      Start 16: syncall
16/22 Test #16: syncall ..........................   Passed    1.56 sec
      Start 17: syncimages
17/22 Test #17: syncimages .......................   Passed    0.56 sec
      Start 18: co_reduce
18/22 Test #18: co_reduce ........................***Failed  Required regular expression not found.Regex=[Test passed.
]  1.42 sec
      Start 19: hello_multiverse
19/22 Test #19: hello_multiverse .................   Passed    0.05 sec
      Start 20: coarray_burgers_pde
20/22 Test #20: coarray_burgers_pde ..............***Failed  Required regular expression not found.Regex=[Test passed.
]  0.03 sec
      Start 21: co_heat
21/22 Test #21: co_heat ..........................   Passed    3.57 sec
      Start 22: coarray_navier_stokes
22/22 Test #22: coarray_navier_stokes ............   Passed    4.25 sec

91% tests passed, 2 tests failed out of 22

Total Test time (real) =  14.43 sec

The following tests FAILED:
     18 - co_reduce (Failed)
     20 - coarray_burgers_pde (Failed)
Errors while running CTest

Please look at the messages for tests 18 and 20. This looks to me like the Fortran tests actually passed, but the testing utility cannot find some regexp library and reports the test as failed. Can you confirm this? Thanks!

rouson commented 8 years ago

co_reduce fails with gfortran 5.2.0 but succeeds with 6.0.0. I should disable co_reduce support, and its test, when building with 5.2.0.

I'll investigate the reason for the Burgers solver failure. The Burgers solver should produce results similar to what is shown in Chapter 4 of my book.

D


milancurcic commented 8 years ago

Damian, thanks. I will proceed with my own experimenting with CAF with this current build and let you know if I encounter any other problems.

sourceryinstitute commented 8 years ago

I haven't been able to reproduce the coarrayBurgers test failure with gfortran 5.2. It would be interesting to know if you see the same issue with the gcc trunk (pre-release 6.0.0), but I recognize that it could be burdensome to build GCC from source for this purpose alone. Nonetheless, if you do it, the OpenCoarrays build script might be useful: https://github.com/sourceryinstitute/opencoarrays/blob/master/install_prerequisites/buildgcc. Or you could install 6.0.0 via package management software such as MacPorts or Homebrew on OS X or apt-get/yum/aur on Linux.

Please also send me the output you obtain from running your copy of the coarrayBurgers solver. You should get a sine wave.


Damian Rouson, Ph.D., P.E. President, Sourcery Institute http://www.sourceryinstitute.org +1-510-600-2992 (mobile)


milancurcic commented 8 years ago

OK, thanks, will do this by the end of the week and let you know what I find.

milancurcic commented 8 years ago

Hi Damian, I had the chance to look into this in more detail. I followed the breadcrumbs from the ctest utility to get to the actual test:

[milan@localhost build]$ grep coarray_burgers_pde CTestTestfile.cmake 
add_test(coarray_burgers_pde "/opt/openmpi/gcc-5.2.0/bin/mpiexec" "-np" "2" "/home/milan/opencoarrays/build/src/tests/integration/pde_solvers/coarrayBurgers/coarray_burgers_pde")
set_tests_properties(coarray_burgers_pde PROPERTIES  PASS_REGULAR_EXPRESSION "Test passed.")
[milan@localhost build]$ cd /home/milan/opencoarrays/build/src/tests/integration/pde_solvers/coarrayBurgers
[milan@localhost coarrayBurgers]$ /opt/openmpi/gcc-5.2.0/bin/mpiexec -np 2 coarray_burgers_pde 
coarray_burgers_pde: /lib64/libgfortran.so.3: version `GFORTRAN_1.6' not found (required by coarray_burgers_pde)
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[48785,1],0]
  Exit code:    1
--------------------------------------------------------------------------
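
The transcript above explains the ctest message: a test with PASS_REGULAR_EXPRESSION set passes only if its output matches the pattern, and here the executable aborted before printing anything that matches. A minimal sketch of that check, assuming grep as a stand-in for CTest's own matcher; passes_regex is a hypothetical helper name, not part of CTest:

```shell
# Emulate the pass check: succeed only if the captured output matches the pattern.
# $1: captured program output, $2: regular expression to look for
passes_regex() {
  printf '%s\n' "$1" | grep -Eq -- "$2"
}

# A loader error contains no "Test passed." line, so the check reports failure.
if passes_regex "version GFORTRAN_1.6 not found" "Test passed."; then
  echo pass
else
  echo fail
fi
```

So the "regular expression not found" message does not point to a missing regexp library; it means the expected text never appeared in the test output.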

It looks like my executable couldn't find what it wanted from the linked libraries. Running ldd on the executable reveals that it was linked to my system gcc libraries (v4.8.2) instead of the ones I provided during the opencoarrays build process (v5.2.0).

[milan@localhost coarrayBurgers]$ ldd coarray_burgers_pde 
./coarray_burgers_pde: /lib64/libgfortran.so.3: version `GFORTRAN_1.6' not found (required by ./coarray_burgers_pde)
    linux-vdso.so.1 =>  (0x00007ffe2abb2000)
    libmpi_usempif08.so.11 => /opt/openmpi/gcc-5.2.0/lib/libmpi_usempif08.so.11 (0x00007f7136d77000)
    libmpi_usempi_ignore_tkr.so.6 => /opt/openmpi/gcc-5.2.0/lib/libmpi_usempi_ignore_tkr.so.6 (0x00007f7136b70000)
    libmpi_mpifh.so.12 => /opt/openmpi/gcc-5.2.0/lib/libmpi_mpifh.so.12 (0x00007f7136919000)
    libmpi.so.12 => /opt/openmpi/gcc-5.2.0/lib/libmpi.so.12 (0x00007f713663c000)
    libgfortran.so.3 => /lib64/libgfortran.so.3 (0x00007f713631a000)
    libm.so.6 => /lib64/libm.so.6 (0x00007f7136013000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f7135dfd000)
    libquadmath.so.0 => /lib64/libquadmath.so.0 (0x00007f7135bc1000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f71359a4000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f71355e6000)
    libopen-rte.so.12 => /opt/openmpi/gcc-5.2.0/lib/libopen-rte.so.12 (0x00007f713536b000)
    libopen-pal.so.13 => /opt/openmpi/gcc-5.2.0/lib/libopen-pal.so.13 (0x00007f7135090000)
    libXNVCtrl.so.0 => /lib64/libXNVCtrl.so.0 (0x00007f7134e8a000)
    libXext.so.6 => /lib64/libXext.so.6 (0x00007f7134c78000)
    libX11.so.6 => /lib64/libX11.so.6 (0x00007f713493a000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007f7134736000)
    librt.so.1 => /lib64/librt.so.1 (0x00007f713452e000)
    libutil.so.1 => /lib64/libutil.so.1 (0x00007f713432b000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f7136fa5000)
    libxcb.so.1 => /lib64/libxcb.so.1 (0x00007f713410a000)
    libXau.so.6 => /lib64/libXau.so.6 (0x00007f7133f06000)
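
To see at a glance where the GCC runtime libraries resolve, the ldd output can be filtered. A minimal sketch; gcc_runtime_libs is a hypothetical helper name, and the list of library names is an assumption based on the output above:

```shell
# Hypothetical helper: keep only the GCC runtime libraries from ldd output on stdin.
gcc_runtime_libs() {
  grep -E 'lib(gfortran|quadmath|gcc_s)\.so'
}

# Intended use, from the directory that holds the executable:
#   ldd ./coarray_burgers_pde | gcc_runtime_libs
```

With the output above, all three libraries (libgfortran, libgcc_s, libquadmath) resolve under /lib64 rather than under the gcc-5.2.0 installation.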

Setting the correct environment when running the tests yields expected output:

[milan@localhost coarrayBurgers]$ env LD_LIBRARY_PATH=/opt/openmpi/gcc-5.2.0/lib:/opt/gcc-5.2.0/lib64 mpiexec -n 2 coarray_burgers_pde 
 Time =  0.60065795534754485     
 On image            1 u =  -3.6062644769485318E-016  0.55703975723770405        1.1137114029330639        1.6695096165091936        2.2228610006371610        2.7616422370249762        3.1602423341809680        2.5585980904689771     
 On image            2 u =   1.9805929082814393E-015  -2.5585980904689749       -3.1602423341809680       -2.7616422370249771       -2.2228610006371614       -1.6695096165091943       -1.1137114029330646      -0.55703975723770482     
 Test passed.
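
The same environment can be exported once before running the whole test suite. A minimal sketch, assuming the tests that ctest launches inherit the calling shell's environment (which should hold for local mpiexec runs like the one above); prepend_ld_library_path is a hypothetical helper name:

```shell
# Prepend a directory to LD_LIBRARY_PATH without leaving a stray colon
# when the variable starts out empty or unset.
prepend_ld_library_path() {
  LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
  export LD_LIBRARY_PATH
}

# Paths taken from the mpiexec command above; adjust for your installation.
prepend_ld_library_path /opt/gcc-5.2.0/lib64
prepend_ld_library_path /opt/openmpi/gcc-5.2.0/lib

# Then, from the build directory:
#   ctest --output-on-failure
```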

This resolves the issue. However, I still wonder why the executable got linked against the older library files. I did not specify them explicitly during the opencoarrays build process. Let me know if you have any ideas.

rouson commented 8 years ago

I wonder if it's a problem in the caf compiler wrapper. That script gets generated by the OpenCoarrays CMake scripts. When OpenCoarrays gets built, library paths are hardwired into the script. For example, see lines 90 and 111 in the following file, which becomes the tail of the caf script:

https://github.com/sourceryinstitute/opencoarrays/blob/branch-1.0.0/src/extensions/caf-foot

The variables caf_lib_dir, caf_mod_dir, and link_args get written into the script at OpenCoarrays build time and never change afterward. This means there needs to be a separate copy of caf for each compiler with which it will be used. I usually install each version of OpenCoarrays in a path such as

/opt/opencoarrays/1.1.3/gnu/5.2.0/

to indicate that this is OpenCoarrays version 1.1.3 (the latest version) built by GCC 5.2.0. Then I modify my PATH to make sure I get the version of caf that corresponds to the compiler version that I'm using. For example, if I'm using gfortran 5.2.0, then I issue the following command:

export PATH=/opt/opencoarrays/1.1.3/gnu/5.2.0/bin:$PATH

Possibly I should edit the caf script to allow the user to override the aforementioned variables by setting environment variables. If you think that would help, it would be great if you could test it by editing the caf script, testing your edits, and submitting a pull request.
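
One way such an override could look, as a sketch only: the environment variable names CAF_LIB_DIR and CAF_MOD_DIR are hypothetical, the lib and mod subdirectories are placeholders rather than the values the build actually writes into caf, and resolve_caf_dirs is an illustrative wrapper that the real script does not contain.

```shell
# Use the environment variable when it is set and non-empty,
# otherwise fall back to the path hardwired at build time.
resolve_caf_dirs() {
  caf_lib_dir="${CAF_LIB_DIR:-/opt/opencoarrays/1.1.3/gnu/5.2.0/lib}"  # placeholder default
  caf_mod_dir="${CAF_MOD_DIR:-/opt/opencoarrays/1.1.3/gnu/5.2.0/mod}"  # placeholder default
}

resolve_caf_dirs
echo "caf_lib_dir=$caf_lib_dir"
echo "caf_mod_dir=$caf_mod_dir"
```

With this pattern a user could point one copy of caf at a different build for a single command, for example by prefixing the invocation with CAF_LIB_DIR=/some/other/lib, while the default behavior stays unchanged.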

szaghi commented 8 years ago

Hi all, I am off topic, but for handling different compiler environments I recently switched to desk, see here https://github.com/jamesob/desk

In my fork I show how to use it as a lightweight alternative to the module package.

P.S. I am coming back.

milancurcic commented 8 years ago

@rouson OK, I will try this and let you know.

@szaghi Cool, I saw desk the other day but didn't have a chance to try it yet!

zbeekman commented 8 years ago

One thing I would like to do is examine the caf wrapper scripts, and how they are written at configure/build time... The source of this problem still seems to be unresolved, so I'm not sure if it would be worth reopening or not.

rouson commented 8 years ago

I'll be glad to walk you through what's there. I just added you to the calendar appointment I have for a weekly 10 AM Tuesday teleconference. Hopefully you received an invitation.

zbeekman commented 8 years ago

I hope you guys don't mind that I'm adding some labels and assignees to closed (and some open) issues. Later on this can facilitate automatic post-hoc CHANGELOG generation and more useful issue tracking and sorting.

rouson commented 8 years ago

Got it. Thanks for explaining. You can ignore the related question I just submitted on a different issue.