firemodels / fds

Fire Dynamics Simulator
https://pages.nist.gov/fds-smv/

Missing MPI libraries on Linux #2034

Closed gforney closed 9 years ago

gforney commented 9 years ago
This issue was originally reported via the Discussion Group by Robert Peart.

With FDS 6.0.1 on Linux, the MPI version gives the following error:

[robert@fds2 ~]$ fds_mpi
fds_mpi: error while loading shared libraries: libirng.so: cannot open shared object file: No such file or directory

Original issue reported on code.google.com by koverholt on 2014-01-11 20:00:07

gforney commented 9 years ago
Glenn,

I wonder if this has to do with any compiler upgrades we've done recently? I reproduced
this problem on a clean Linux VM on my laptop. I do not have the file libirng.so on
my Linux system, and it seems to be part of the Intel compilers.

Original issue reported on code.google.com by koverholt on 2014-01-11 20:01:41
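
One way to confirm which shared libraries the dynamic loader cannot resolve is to filter the output of `ldd` for "not found" entries. The following is a minimal sketch, not part of the original report. The helper name `missing_libs` is hypothetical, and it assumes the usual glibc `ldd` output format (`libname => not found`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read ldd-style output on stdin and print the name of
# each library that the loader reports as "not found", one per line.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Typical use (assumes fds_mpi is on the PATH):
#   ldd "$(command -v fds_mpi)" | missing_libs
```

Keeping the parsing in a small filter makes it easy to reuse on saved `ldd` output from a machine that cannot be reached directly.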

gforney commented 9 years ago
Since we are using -static-intel on all of the builds, and this libirng.so is an Intel
random number generator library, I wonder if it's a bug with the compilers and not
something that we can fix.

Original issue reported on code.google.com by koverholt on 2014-01-11 20:06:19

gforney commented 9 years ago
Is this from an FDS build that we distributed? If not, which version of the compilers
was used to build it?

Original issue reported on code.google.com by gforney on 2014-01-11 21:57:50

gforney commented 9 years ago
Yes, this is FDS 6.0.1 from the bundle.

Original issue reported on code.google.com by koverholt on 2014-01-11 22:03:24

gforney commented 9 years ago
So then what did you do to reproduce the problem? And why does firebot work?

Original issue reported on code.google.com by gforney on 2014-01-11 22:05:14

gforney commented 9 years ago
Firebot works on blaze, which has the Intel compilers installed. I installed the bundle
on my laptop, which does not have the Intel compilers installed, and it does not work there.

Original issue reported on code.google.com by koverholt on 2014-01-11 22:13:23

gforney commented 9 years ago
Adding libirng.so to the /shared/openmpi_64/ directory produces these results:

[robert@fds2 ~]$ fds_mpi
[fds2:03023] mca: base: component_find: unable to open /shared/openmpi_64/lib/openmpi/mca_plm_tm: libtorque.so.2: cannot open shared object file: No such file or directory (ignored)
[fds2:03023] mca: base: component_find: unable to open /shared/openmpi_64/lib/openmpi/mca_ras_tm: libtorque.so.2: cannot open shared object file: No such file or directory (ignored)
Process   0 of   0 is running on fds2

Fire Dynamics Simulator

Version: FDS 6.0.1; MPI Enabled; OpenMP Disabled
SVN Revision Number: 17534
Compile Date: Tue, 26 Nov 2013

Consult FDS Users Guide Chapter, Running FDS, for further instructions.

Hit Enter to Escape...

[robert@fds2 ~]$ 

Original issue reported on code.google.com by rpeart@firecon.co.nz on 2014-01-12 17:49:25

gforney commented 9 years ago
Two files: mca_plm_tm.la and mca_plm_tm.so are present in the /shared/openmpi_64/lib/openmpi/
directory.
The file libtorque.so.2 does not appear to be anywhere on the file system:

[robert@fds2 ~]$ sudo find / -name libtorque.so.2
[sudo] password for robert: 
[robert@fds2 ~]$ 

Original issue reported on code.google.com by rpeart@firecon.co.nz on 2014-01-12 18:02:01

gforney commented 9 years ago
Two files: mca_ras_tm.la and mca_ras_tm.so are also present in the /shared/openmpi_64/lib/openmpi/
directory.

Original issue reported on code.google.com by rpeart@firecon.co.nz on 2014-01-12 18:07:27

gforney commented 9 years ago
Which distribution of Linux are you using?

Original issue reported on code.google.com by koverholt on 2014-01-12 18:32:38

gforney commented 9 years ago
CentOS 6.5

Original issue reported on code.google.com by rpeart@firecon.co.nz on 2014-01-12 18:41:19

gforney commented 9 years ago
Years ago I was using Ubuntu, but at that time the FDS development team was using RHEL,
so I changed to CentOS to make life easier.

Original issue reported on code.google.com by rpeart@firecon.co.nz on 2014-01-12 18:43:57

gforney commented 9 years ago
We build the Linux executables and bundles on CentOS 5.5. On Ubuntu, I was able to install
openmpi-bin using apt-get, which installed torque and the other required libraries.
I wonder if you could do the same with yum on CentOS.

I believe that this is a bug in the way that the Intel compilers are statically linking
libraries. We recently upgraded the version of the Intel compilers that we are using. The
-static-intel option is supposed to take care of including these libraries in the executable.

So, I am going to see if this problem existed with FDS 6.0.0, test with CentOS 6.5,
and we also need to try a different version of the Intel compilers for Linux.

Original issue reported on code.google.com by koverholt on 2014-01-12 18:51:23
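
To check whether a given executable still depends on Intel runtime libraries at run time, one option is to inspect the NEEDED entries of its dynamic section with `readelf -d`. The sketch below is an illustration rather than the team's actual procedure. The helper name `intel_runtime_deps` is hypothetical, and the list of Intel library name prefixes it matches is an assumption that may be incomplete.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `readelf -d` output on stdin and print any
# NEEDED entries whose names look like Intel compiler runtime libraries.
# The name list below is an assumption, not an exhaustive inventory.
intel_runtime_deps() {
    sed -n 's/.*(NEEDED).*\[\(.*\)].*/\1/p' |
        grep -E '^lib(irng|imf|svml|intlc|ifcore|ifport|iomp5)\.so' || true
}

# Typical use (assumes fds_mpi is on the PATH and binutils is installed):
#   readelf -d "$(command -v fds_mpi)" | intel_runtime_deps
```

An empty result would suggest that none of the listed Intel libraries are loaded dynamically. Any name printed is one that must be present on the target system or shipped with the bundle.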

gforney commented 9 years ago
Note: we did not build OpenMPI on blaze statically.

Original issue reported on code.google.com by gforney on 2014-01-12 19:53:08

gforney commented 9 years ago
Right, but even after downloading the /shared/openmpi_64 libraries, all of these issues
are coming up. I tried FDS 6.0.0, and the problem with libirng.so was not present,
but the warnings about libtorque.so.2 were still there.

Robert, even though the warning about libtorque comes up, it says (ignored). Are you
able to run an MPI case despite the warning?

Original issue reported on code.google.com by koverholt on 2014-01-12 20:20:55

gforney commented 9 years ago
Our solution is to include the missing libraries in the new FDS bundles. Please test
the 6.0.2 bundle and report back if that fixes things for you.

Original issue reported on code.google.com by koverholt on 2014-01-13 18:44:08

gforney commented 9 years ago
I successfully ran the Example room_fire.fds modified to use 2 meshes.  There were no
error messages.  Sorry about the delay getting back; I started back at work and there
was a major IT equipment rearrangement.

Original issue reported on code.google.com by rpeart@firecon.co.nz on 2014-01-14 19:46:12
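
A multi-mesh MPI run like the one described is commonly launched with one MPI process per mesh. The sketch below counts the `&MESH` lines in an input file and prints a matching `mpirun` command. It is an assumption about a typical invocation, not the exact command used in this test. `fds_mpi_command` is a hypothetical helper name, and the count ignores mesh replication through `MULT_ID`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an mpirun command that uses one MPI process
# per &MESH line found in the given FDS input file.
# Returns non-zero if the file has no &MESH lines or cannot be read.
fds_mpi_command() {
    local input="$1" n
    n="$(grep -c '^[[:space:]]*&MESH' "$input")" || return 1
    echo "mpirun -np $n fds_mpi $input"
}

# Typical use:
#   fds_mpi_command room_fire.fds
```

Printing the command rather than running it leaves room to review the process count before starting a long simulation.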

gforney commented 9 years ago
Thanks for reporting back. Marking as verified.

Original issue reported on code.google.com by koverholt on 2014-01-15 00:38:02

gforney commented 9 years ago
Is there any sense as to when 6.0.2 will be released with these missing libraries included,
or are there currently nightly builds that include these files? Thank you.

Original issue reported on code.google.com by sglink on 2014-01-29 18:08:10

gforney commented 9 years ago
We are currently planning for a minor release of FDS (6.1.0) due to other fixes, and
we are rerunning the entire validation suite as we do with minor releases. So, the
new release will be at least a few weeks out.

In the meantime, try placing the attached library file (libirng.so) in your ~/FDS/FDS6/bin/LIB64
(or ~/FDS/FDS6/bin/LIB32) folder and rerunning the MPI version of FDS. This library
is defined as "Redistributable" per the terms of the Intel commercial compiler license.

Original issue reported on code.google.com by koverholt on 2014-01-29 19:01:34
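
The workaround above might be scripted roughly as follows. This is a sketch under assumptions: the function name `install_fds_lib` is hypothetical, and the library is assumed to have been downloaded to the current directory. Whether the bundle already puts the LIB64 directory on `LD_LIBRARY_PATH` is not stated in this thread, so the usage comment sets it explicitly.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a shared library into a target directory,
# creating the directory if needed, and print the installed path.
install_fds_lib() {
    local src="$1" dest_dir="$2"
    [ -f "$src" ] || { echo "missing source file: $src" >&2; return 1; }
    mkdir -p "$dest_dir" && cp "$src" "$dest_dir/" || return 1
    echo "$dest_dir/$(basename "$src")"
}

# Typical use for the 64-bit bundle (directory taken from the comment above):
#   install_fds_lib ./libirng.so "$HOME/FDS/FDS6/bin/LIB64"
#   export LD_LIBRARY_PATH="$HOME/FDS/FDS6/bin/LIB64:$LD_LIBRARY_PATH"
#   fds_mpi
```

For the 32-bit bundle, the same call would target the LIB32 directory instead.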


gforney commented 9 years ago
Thank you very much. fds_mpi v.6.0.1 now runs as expected.

Original issue reported on code.google.com by sglink on 2014-01-29 19:24:03