Yinan-Scott-Shi / fds-smv

Automatically exported from code.google.com/p/fds-smv

Dynamic Link Error with OpenMPI on MacPro FDS 5.3 #633

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
Please complete the following lines...

Application Version: 5.3.0
SVN Revision Number:
Compile Date:
Operating System: 10.5.6

Describe details of the issue below: See the terminal output below.

Last login: Tue Jan 27 22:13:49 on console
SuperNova:~ jbrooker$ cd /Volumes/SuperNovaHD2/Data/Orion/RevDMac
SuperNova:RevDMac jbrooker$ mpiexec -np 4 /Applications/NIST/FDS/fds_5.3.0_mpi_osx_64 /Volumes/SuperNovaHD2/Data/Orion/RevDMac/RevD_test2.fds
dyld: lazy symbol binding failed: Symbol not found: __intel_fast_memcpy
  Referenced from: /usr/local/lib/libmpi.0.dylib
  Expected in: flat namespace

dyld: Symbol not found: __intel_fast_memcpy
  Referenced from: /usr/local/lib/libmpi.0.dylib
  Expected in: flat namespace

dyld: lazy symbol binding failed: Symbol not found: __intel_fast_memcpy
  Referenced from: /usr/local/lib/libmpi.0.dylib
  Expected in: flat namespace

dyld: Symbol not found: __intel_fast_memcpy
  Referenced from: /usr/local/lib/libmpi.0.dylib
  Expected in: flat namespace

dyld: lazy symbol binding failed: Symbol not found: __intel_fast_memcpy
  Referenced from: /usr/local/lib/libmpi.0.dylib
  Expected in: flat namespace

dyld: Symbol not found: __intel_fast_memcpy
  Referenced from: /usr/local/lib/libmpi.0.dylib
  Expected in: flat namespace

dyld: lazy symbol binding failed: Symbol not found: __intel_fast_memcpy
  Referenced from: /usr/local/lib/libmpi.0.dylib
  Expected in: flat namespace

dyld: Symbol not found: __intel_fast_memcpy
  Referenced from: /usr/local/lib/libmpi.0.dylib
  Expected in: flat namespace

[SuperNova.local:12324] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
[SuperNova.local:12324] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1158
[SuperNova.local:12324] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at line 90
mpiexec noticed that job rank 1 with PID 12328 on node SuperNova.local exited on signal 5 (Trace/BPT trap).
1 additional process aborted (not shown)
[SuperNova.local:12324] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 188
[SuperNova.local:12324] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1190
--------------------------------------------------------------------------
mpiexec was unable to cleanly terminate the daemons for this job. Returned value Timeout instead of ORTE_SUCCESS.
--------------------------------------------------------------------------
SuperNova:RevDMac jbrooker$ 

Original issue reported on code.google.com by john.e.b...@nasa.gov on 5 Feb 2009 at 4:56

GoogleCodeExporter commented 9 years ago

Original comment by bryan%so...@gtempaccount.com on 5 Feb 2009 at 4:57

GoogleCodeExporter commented 9 years ago
Do you know what version of the OpenMPI package you installed?

Intel support wanted to make sure we were using the same MPI libraries. Can you check which version is installed on your system now?
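If you're not sure, running ompi_info should report it, for example:

  /usr/local/openmpi/bin/ompi_info | grep "Open MPI:"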

So far they have not given me much to go on, but I replied to their initial response, and hopefully they now have enough information to help resolve this.

Below is their response followed by my reply...

Dear Bryan,
Thank you for submitting the issue. Regarding this from your issue tracker:
>>>dyld: Symbol not found: __intel_fast_memcpy
Referenced from: /usr/local/lib/libmpi.0.dylib

Perhaps the version of libmpi on the client is different from the version on the build machine. Can you check that and let me know? Is this Intel MPI on the client? What MPI is being used on the build machine?

On the surface this doesn't look like a compiler error per se, since you have no issue on the build machine. It appears that you are using m_cprof_p_11.0.056 on Intel64 architecture. Has that compiler's runtime been distributed to the client, since you are linking dynamically?

Perhaps a workaround is to build the application with -static-intel, but that's a long shot.

Other than the above, I don't have further suggestions. I did a search of the problem report database, and nothing like this has been reported.

Please give me some feedback.

Thank you,
Patrick
Intel Developer Support

**** My Reply ****

Thanks for the timely reply...

Both machines are using OpenMPI, but I am not sure if they are exactly the same version. I will look into that.

I am using -static-intel for the build. Here are the Fortran and C compiler flags I am using; perhaps you can see if I am missing something.

intel_osx_mpi_64 : FFLAGS = -O3 -m64 -heap-arrays -axSSSE3 -static-intel -L/opt/intel/Compiler/11.0/056/lib
intel_osx_mpi_64 : CFLAGS = -O3 -m64 -Dpp_noappend -Dpp_OSX

I think that I went through the process of helping this user install the runtime libraries, but I was having a difficult time finding the install package today on the Intel downloads site. Can you provide me with a path to get them?

I thought that by using the -static-intel flag, the Intel libraries should be linked into the binary. Also, why is the MPI library calling __intel_fast_memcpy? Where does this function live, and what does "flat namespace" mean? Shouldn't this function be in a particular library file? Or does "flat namespace" mean that the function should have been built into the binary because of the -static-intel flag?
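
For reference, one way to check this on the client would be something like:

  # list the dylibs the FDS binary links against
  otool -L /Applications/NIST/FDS/fds_5.3.0_mpi_osx_64
  # list the undefined symbols in the MPI library that mention intel
  nm -u /usr/local/lib/libmpi.0.dylib | grep -i intel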

Thank you for your help on this issue.
-Bryan

Original comment by bryan%so...@gtempaccount.com on 5 Feb 2009 at 10:16

GoogleCodeExporter commented 9 years ago
This is what I get from running ompi_info:

Last login: Thu Feb  5 11:20:03 on ttys000
/usr/local/openmpi/bin/ompi_info ; exit;
SuperNova:~ jbrooker$ /usr/local/openmpi/bin/ompi_info ; exit;
                Open MPI: 1.2.5
   Open MPI SVN revision: r16989
                Open RTE: 1.2.5
   Open RTE SVN revision: r16989
                    OPAL: 1.2.5
       OPAL SVN revision: r16989
                  Prefix: /usr/local/openmpi
 Configured architecture: i386-apple-darwin9.1.0
           Configured by: bwklein
           Configured on: Fri Jan 11 13:34:03 EST 2008
          Configure host: devi1.nist.gov
                Built by: bwklein
                Built on: Fri Jan 11 13:42:05 EST 2008
              Built host: devi1.nist.gov
              C bindings: yes
            C++ bindings: yes
      Fortran77 bindings: yes (single underscore)
      Fortran90 bindings: yes
 Fortran90 bindings size: small
              C compiler: gcc
     C compiler absolute: /usr/bin/gcc
            C++ compiler: g++
   C++ compiler absolute: /usr/bin/g++
      Fortran77 compiler: gfortran
  Fortran77 compiler abs: /usr/local/bin/gfortran
      Fortran90 compiler: gfortran
  Fortran90 compiler abs: /usr/local/bin/gfortran
             C profiling: yes
           C++ profiling: yes
     Fortran77 profiling: yes
     Fortran90 profiling: yes
          C++ exceptions: no
          Thread support: posix (mpi: no, progress: no)
  Internal debug support: no
     MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
         libltdl support: yes
   Heterogeneous support: yes
 mpirun default --prefix: no
           MCA backtrace: darwin (MCA v1.0, API v1.0, Component v1.2.5)
              MCA memory: darwin (MCA v1.0, API v1.0, Component v1.2.5)
           MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2.5)
               MCA timer: darwin (MCA v1.0, API v1.0, Component v1.2.5)
         MCA installdirs: env (MCA v1.0, API v1.0, Component v1.2.5)
         MCA installdirs: config (MCA v1.0, API v1.0, Component v1.2.5)
           MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
           MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
                MCA coll: basic (MCA v1.0, API v1.0, Component v1.2.5)
                MCA coll: self (MCA v1.0, API v1.0, Component v1.2.5)
                MCA coll: sm (MCA v1.0, API v1.0, Component v1.2.5)
                MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2.5)
                  MCA io: romio (MCA v1.0, API v1.0, Component v1.2.5)
               MCA mpool: rdma (MCA v1.0, API v1.0, Component v1.2.5)
               MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2.5)
                 MCA pml: cm (MCA v1.0, API v1.0, Component v1.2.5)
                 MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2.5)
                 MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2.5)
              MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2.5)
                 MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2.5)
                 MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2.5)
                 MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
                MCA topo: unity (MCA v1.0, API v1.0, Component v1.2.5)
                 MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2.5)
              MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2.5)
              MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2.5)
              MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2.5)
                 MCA gpr: null (MCA v1.0, API v1.0, Component v1.2.5)
                 MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2.5)
                 MCA gpr: replica (MCA v1.0, API v1.0, Component v1.2.5)
                 MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2.5)
                 MCA iof: svc (MCA v1.0, API v1.0, Component v1.2.5)
                  MCA ns: proxy (MCA v1.0, API v2.0, Component v1.2.5)
                  MCA ns: replica (MCA v1.0, API v2.0, Component v1.2.5)
                 MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
                 MCA ras: dash_host (MCA v1.0, API v1.3, Component v1.2.5)
                 MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2.5)
                 MCA ras: localhost (MCA v1.0, API v1.3, Component v1.2.5)
                 MCA ras: xgrid (MCA v1.0, API v1.3, Component v1.2.5)
                 MCA rds: hostfile (MCA v1.0, API v1.3, Component v1.2.5)
                 MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2.5)
                 MCA rds: resfile (MCA v1.0, API v1.3, Component v1.2.5)
               MCA rmaps: round_robin (MCA v1.0, API v1.3, Component v1.2.5)
                MCA rmgr: proxy (MCA v1.0, API v2.0, Component v1.2.5)
                MCA rmgr: urm (MCA v1.0, API v2.0, Component v1.2.5)
                 MCA rml: oob (MCA v1.0, API v1.0, Component v1.2.5)
                 MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.2.5)
                 MCA pls: proxy (MCA v1.0, API v1.3, Component v1.2.5)
                 MCA pls: rsh (MCA v1.0, API v1.3, Component v1.2.5)
                 MCA pls: xgrid (MCA v1.0, API v1.3, Component v1.2.5)
                 MCA sds: env (MCA v1.0, API v1.0, Component v1.2.5)
                 MCA sds: pipe (MCA v1.0, API v1.0, Component v1.2.5)
                 MCA sds: seed (MCA v1.0, API v1.0, Component v1.2.5)
                 MCA sds: singleton (MCA v1.0, API v1.0, Component v1.2.5)
logout

[Process completed]

Original comment by john.e.b...@nasa.gov on 7 Feb 2009 at 5:43

GoogleCodeExporter commented 9 years ago
What is the status of this thread?

Original comment by mcgra...@gmail.com on 13 Mar 2009 at 3:14

GoogleCodeExporter commented 9 years ago
Has this problem been resolved?

Original comment by mcgra...@gmail.com on 13 Apr 2009 at 8:36

GoogleCodeExporter commented 9 years ago
Please provide an updated binary (fds_5.3.1_mpi_osx64) and I will try again.

Original comment by john.e.b...@nasa.gov on 15 Apr 2009 at 2:33

GoogleCodeExporter commented 9 years ago
What is the status of this thread?

Original comment by mcgra...@gmail.com on 11 Jun 2009 at 4:06

GoogleCodeExporter commented 9 years ago
John -- were you ever able to get your case to run?

Original comment by mcgra...@gmail.com on 23 Jul 2009 at 7:44

GoogleCodeExporter commented 9 years ago
No, I have not. To my knowledge, an fds_mpi_osx64 binary does not exist. Please correct me if I am mistaken.

Original comment by john.e.b...@nasa.gov on 23 Jul 2009 at 8:34

GoogleCodeExporter commented 9 years ago
You may be right -- Bryan, is there such a thing?

Original comment by mcgra...@gmail.com on 23 Jul 2009 at 8:42

GoogleCodeExporter commented 9 years ago
Bryan, it doesn't look like the 64-bit MPI libraries are installed on devi1. If they are, the makefile needs a way to link against the right set of libraries (32- or 64-bit). The link errors I get when trying to build fds_mpi_osx_64 on devi1 are consistent with linking against the 32-bit instead of the 64-bit MPI libraries. The errors look similar to what I see when I build on Linux but link to the wrong libraries.

Original comment by gfor...@gmail.com on 23 Jul 2009 at 9:52

GoogleCodeExporter commented 9 years ago
I just built a version on devi1 without any link errors. I moved the dylib files aside so that the compiler would be forced to build a fully static version of the binary (a tip from the web). I am uploading a test version to the downloads page now for John to try on his machine. The problem has been that it runs on my machine but was not running on his.
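
If the static link worked, otool should show no OpenMPI dylib entries in the new binary, e.g.:

  # should print nothing if the MPI libraries were linked in statically
  otool -L fds_5.3.1_mpi_osx_64 | grep -i mpi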

Original comment by bryan%so...@gtempaccount.com on 23 Jul 2009 at 10:01

GoogleCodeExporter commented 9 years ago
Please try the binary found at:
http://fds-smv.googlecode.com/files/fds_5.3.1_mpi_osx_64.zip

Original comment by bryan%so...@gtempaccount.com on 23 Jul 2009 at 10:10

GoogleCodeExporter commented 9 years ago
I've never used them, but the Intel download site for the compilers has run-time libraries available. Maybe we need to be giving these out.

Original comment by gfor...@gmail.com on 23 Jul 2009 at 11:36

GoogleCodeExporter commented 9 years ago
We tried that back in Nov. of last year.
ftp://ftp.nist.gov/pub/bfrl/bwklein/

Hopefully the file I put up there will solve the problem.

Original comment by bryan%so...@gtempaccount.com on 24 Jul 2009 at 1:53

GoogleCodeExporter commented 9 years ago
John, when you have a chance, could you try the newly released version of FDS (5.4) on your Mac and let us know if it works?

Original comment by mcgra...@gmail.com on 3 Sep 2009 at 2:39

GoogleCodeExporter commented 9 years ago
Hi Bryan, Glenn and Kevin,

I'm experiencing the same problem on our Mac Pro system (OS X 10.5.8) here at CESARE with your current 5.4.3 64-bit binary. The 5.3.1 binary linked above works with the provided 1.3.3_64 libraries (after a bit of fiddling, which I've described below), but the 5.4.1 binary results in the following error:

dyld: lazy symbol binding failed: Symbol not found: __intel_fast_memcpy
  Referenced from: /Applications/FDS/FDS5/bin/fds_mpi_osx_64
  Expected in: /Applications/FDS/openmpi-1.3.3_64/lib/libopen-pal.0.dylib

While attempting to fix this I also came across a couple of things which may be of interest to you.

Firstly, I found that the DYLD_etc environment variable on its own caused what I can only describe as a long-winded segmentation fault, with lots of messages beginning with "Sorry!" because OpenMPI can't access its own help facility. To fix this, you need to add the OPAL_PREFIX variable, pointed at the openmpi-1.3.3_64/ directory, to your profile; after consulting some online help I also added the LD_LIBRARY_PATH variable pointing to openmpi-1.3.3_64/lib for good measure.
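
For our setup that amounted to adding something like the following to the shell profile (exact paths will differ on other systems):

  export OPAL_PREFIX=/Applications/FDS/openmpi-1.3.3_64
  export LD_LIBRARY_PATH=/Applications/FDS/openmpi-1.3.3_64/lib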

This might just be related to our system, but I also discovered that in order to use the commands specific to 1.3.3_64 (i.e. ompi_info or mpirun), you have to run them with ./ explicitly from the openmpi-1.3.3_64/bin directory, regardless of what environment variables are set up in your profile. For example, ./ompi_info from openmpi-1.3.3_64/bin results in:

EWB4212-68:bin cesare$ ./ompi_info
                 Package: Open MPI gforney@devi1.nist.gov Distribution
                Open MPI: 1.3.3
   Open MPI SVN revision: r21666
   Open MPI release date: Jul 14, 2009
                Open RTE: 1.3.3
   Open RTE SVN revision: r21666
   Open RTE release date: Jul 14, 2009
                    OPAL: 1.3.3
       OPAL SVN revision: r21666
       OPAL release date: Jul 14, 2009
            Ident string: 1.3.3
                  Prefix: /Applications/FDS/openmpi-1.3.3_64/
 Configured architecture: i386-apple-darwin9.8.0
          Configure host: devi1.nist.gov
           Configured by: gforney
           Configured on: Thu Sep 10 20:32:04 EDT 2009
          Configure host: devi1.nist.gov
                Built by: gforney
                Built on: Thu Sep 10 20:50:30 EDT 2009
              Built host: devi1.nist.gov
(etc)

Whereas "ompi_info" on it's own gives:

EWB4212-68:bin cesare$ ompi_info
                Open MPI: 1.2.5
   Open MPI SVN revision: r16989
                Open RTE: 1.2.5
   Open RTE SVN revision: r16989
                    OPAL: 1.2.5
       OPAL SVN revision: r16989
                  Prefix: /Applications/FDS/openmpi-1.3.3_64/
 Configured architecture: i386-apple-darwin9.1.0
           Configured by: cesare
           Configured on: Tue Feb  5 14:37:21 EST 2008
          Configure host: EWB4212-68.local
                Built by: root
                Built on: Tue  5 Feb 2008 14:45:04 EST
              Built host: EWB4212-68.local
(etc)

I'm not sure if this may be an issue for others as well whose Macs have 1.2.5 pre-installed, but as I've replaced the 1.2.5 $PATH etc. entries with new ones for 1.3.3_64, I am now unsure why or where these are being overridden.
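Running "which ompi_info" and checking $PATH should at least show which copy the shell is actually picking up:

  which ompi_info
  echo $PATH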

So, in order to get the FDS 5.3 executable to work I had to run:
./mpirun -np X fds5.3_mpi_osx_64 input_file.fds
from the openmpi-1.3.3_64/bin directory, with both the FDS5/bin directory and the input file location included in $PATH. Otherwise, without the ./, I get the following:

mpirun -np 1 fds5.3_mpi_intel_osx_64 activate_vents.fds 
[EWB4212-68.local:82932] [NO-NAME] 
ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
[EWB4212-68:82932] *** Process received signal ***
[EWB4212-68:82932] Signal: Segmentation fault (11)
[EWB4212-68:82932] Signal code: Address not mapped (1)
[EWB4212-68:82932] Failing at address: 0xfffffff0
[ 1] [0xbfffee18, 0xfffffff0] (-P-)
[ 2] (pthread_getspecific + 0x132) [0xbffff578, 0x900ab456] 
[ 3] (_dyld_get_image_header_containing_address + 0xc8) [0xbffff5a8, 
0x900e9702] 
[ 4] (opal_show_help + 0x3b3) [0xbffff618, 0x000c6303] 
[ 5] (orte_init_stage1 + 0xca) [0xbffff6f8, 0x00056eda] 
[ 6] (orte_system_init + 0x1e) [0xbffff718, 0x0005a74e] 
[ 7] (orte_init + 0x8d) [0xbffff748, 0x00056bfd] 
[ 8] (orterun + 0x181) [0xbffff7d8, 0x0000258d] 
[ 9] (main + 0x18) [0xbffff7f8, 0x0000240a] 
[10] (start + 0x36) [0xbffff814, 0x000023c6] 
[11] [0x00000000, 0x00000005] (FP-)
[EWB4212-68:82932] *** End of error message ***
Segmentation fault
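
Invoking the 1.3.3 mpirun by its full path should avoid the dependence on $PATH ordering altogether, e.g.:

  /Applications/FDS/openmpi-1.3.3_64/bin/mpirun -np 4 fds5.3_mpi_osx_64 activate_vents.fds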

Thanks for your help.

Regards,

Samara
CESARE, Victoria University, Melbourne, Australia

Original comment by samara.n...@vu.edu.au on 25 Jan 2010 at 4:19

GoogleCodeExporter commented 9 years ago
Samara, thanks for the info. You might want to consider buying the Intel Fortran compiler for Mac OSX. We spend a considerable amount of time and resources maintaining FDS and Smokeview on OSX, and yet it is a small fraction of our user base.

Original comment by mcgra...@gmail.com on 25 Jan 2010 at 1:02

GoogleCodeExporter commented 9 years ago
I've posted a new Mac bundle since the above comments were made in this issue. Do these problems still exist?

Original comment by gfor...@gmail.com on 3 May 2010 at 12:41

GoogleCodeExporter commented 9 years ago
Hi Glenn,

The latest version (5.5/64) gives me the following error:

dyld: lazy symbol binding failed: Symbol not found: ___intel_sse2_strlen
  Referenced from: /Applications/FDS/FDS5/bin/fds5.5_mpi_osx_64
  Expected in: /Applications/FDS/openmpi-1.3.3_64/lib/libopen-pal.0.dylib

dyld: Symbol not found: ___intel_sse2_strlen
  Referenced from: /Applications/FDS/FDS5/bin/fds5.5_mpi_osx_64
  Expected in: /Applications/FDS/openmpi-1.3.3_64/lib/libopen-pal.0.dylib

Version 5.3 is the only one that has worked for us so far. Version 5.4 came up with a similar error to the one above, but for __intel_fast_memcpy instead of ___intel_sse2_strlen, so perhaps something that changed between the 5.3 bundle and the 5.5 bundle is causing the problem?
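
For what it's worth, the Intel symbols that a given binary leaves undefined can be listed with nm, e.g.:

  nm -u /Applications/FDS/FDS5/bin/fds5.5_mpi_osx_64 | grep -i intel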

Samara 

Original comment by samara.n...@vu.edu.au on 3 May 2010 at 6:17

GoogleCodeExporter commented 9 years ago

Original comment by gfor...@gmail.com on 21 Jan 2011 at 7:20