metno / emep-ctm

Open Source EMEP/MSC-W model
GNU General Public License v3.0

Compile rv4.34 with openmpi in a docker container #76

Closed lucamarletta closed 3 years ago

lucamarletta commented 3 years ago

Hi all, I'm trying to build this model with OpenMPI on Debian Jessie.

The Debian APT packages install OpenMPI in

/usr/lib/openmpi/lib
/usr/lib/openmpi/include

I'm working in a Dockerfile. First, from my emep34 source dir, I ran a plain make, and mpi.mod was not found. So I symlinked /usr/lib/openmpi/lib/mpi.mod and all the files from /usr/lib/openmpi/include into my emep34 source dir.

RUN cd ${HOME}/emep34 \
     && ln -s /usr/lib/openmpi/lib/mpi.mod mpi.mod \
     && ln -s /usr/lib/openmpi/include/* ./ \
     && make

This way the sources compiled, but the build failed at the end with these errors:

......
Pollen_mod.f90:(.text+0x320c): undefined reference to `mpi_bcast_'
Pollen_mod.f90:(.text+0x3253): undefined reference to `mpi_bcast_'
collect2: error: ld returned 1 exit status
make: *** [emepctm_rv4_34] Error 1
Makefile:53: recipe for target 'emepctm_rv4_34' failed

I have no experience with Fortran, so I didn't modify the Makefile. I guess there are some ENV variables to set.

Could you suggest the right way to include mpi and successfully compile the model 34?
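Undefined references such as `mpi_bcast_` are reported by the linker (`ld`), which usually means the MPI Fortran library was not on the link line, even though `mpi.mod` was found at compile time. Open MPI's wrapper compilers can print the flags they would add. The following is a minimal sketch, assuming the Open MPI wrapper `mpif90` is installed; the helper name `show_mpi_flags` is made up for illustration and is not part of EMEP or Open MPI.

```shell
#!/bin/sh
# Print the flags the Open MPI Fortran wrapper adds when compiling and linking.
# show_mpi_flags is a hypothetical helper name used only in this sketch.
show_mpi_flags() {
    wrapper="${1:-mpif90}"
    if ! command -v "$wrapper" >/dev/null 2>&1; then
        echo "$wrapper not found on PATH"
        return 1
    fi
    # --showme:compile and --showme:link are Open MPI wrapper options.
    echo "compile flags: $("$wrapper" --showme:compile)"
    echo "link flags:    $("$wrapper" --showme:link)"
}

# Do not abort the calling script if the wrapper is missing.
show_mpi_flags mpif90 || true
```

If the link flags listed there are missing from the failing link command, building with the wrapper itself is usually simpler than symlinking `mpi.mod` by hand.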

avaldebe commented 3 years ago

Hi @lucamarletta

On Ubuntu 18.04 you need the libopenmpi-dev and libnetcdf-dev packages. Debian Jessie ships an older version, so you need more packages. My guess is that your system needs the following packages: libopenmpi1.6, libopenmpi-dev, libnetcdf-dev and netcdf-bin.

Instead of guessing, it would be easier for me to see your dockerfile and try to reproduce your build.

If you cannot show me your Dockerfile, or you are using a private base image, I would need to know which openmpi/netcdf versions you have installed and the output of the following commands:

# Fortran flags needed for NetCDF support
nc-config --fflags

# Fortran libraries needed to link the program
nc-config --flibs

I do not have much experience with Docker, but I'll be happy to help you.

Cheers, Álvaro

lucamarletta commented 3 years ago

Hi Álvaro, thanks a lot for your quick reply. Here is my Dockerfile. The commented-out part is an attempt to compile OpenMPI from source, but it didn't help: same error.

No problem sharing it.

You can see the versions installed here.

I also made some attempts at changing the Makefile, but with no success, so assume the standard Makefile.

FROM debian:jessie 
LABEL project="IAM-SUL" \
      author="Luca Marletta" \
      image_name="" \
      version="1.0" \
      released="2020-09-03" \
      software_versions="OpenMPI 3.0 NCDF 4 Fortran 95 Model emep 34" \
      description="EMEP air quality model, for training and validation of the SHERPA simplified model"

ENV DEBIAN_FRONTEND=noninteractive
ENV TERM xterm
ENV DISPLAY :1.0
ENV LC_ALL C.UTF-8

RUN apt-get update && apt-get -yq install gcc gfortran g++\
                      build-essential \
                      tar \
                      bzip2 \
                      m4 \
                      zlib1g-dev \
                      curl \
                      sudo \
                      unzip \
                      vim \
                      wget \
                      libc6

RUN apt-get install -y apt-utils libnetcdf-dev openssh-client openssh-server 

RUN apt-get install -y openmpi-bin openmpi-common libopenmpi1.6 libopenmpi1.6-dbg libopenmpi-dev

COPY hdf5-1.10.3.tar.bz2 hdf5-1.10.3.tar.bz2
COPY netcdf-c-4.6.2.tar.gz netcdf-c-4.6.2.tar.gz
COPY netcdf-fortran-4.4.4.tar.gz netcdf-fortran-4.4.4.tar.gz
#COPY openmpi-v4.0/ /tmp/openmpi-v4.0/

#Build HDF5
RUN tar xjvf hdf5-1.10.3.tar.bz2 && \
    cd hdf5-1.10.3 && \
    CC=mpicc ./configure --enable-parallel --prefix=/usr/local && \
    make -j4 && \
    make install && \
    cd .. && \
    rm -rf /hdf5-1.10.3 /hdf5-1.10.3.tar.bz2 

RUN apt-get install -y libcurl3 libcurl4-gnutls-dev 
#Build netcdf
RUN tar xzvf netcdf-c-4.6.2.tar.gz && \
    cd netcdf-c-4.6.2 && \
    ./configure --prefix=/usr \ 
                CC=mpicc \
                LDFLAGS=-L/usr/local/lib \
                CFLAGS=-I/usr/local/include && \
    make -j4 && \
    make install && \
    cd .. && \
    rm -rf netcdf-c-4.6.2 netcdf-c-4.6.2.tar.gz

RUN tar xzvf netcdf-fortran-4.4.4.tar.gz && \
    cd netcdf-fortran-4.4.4 && \
    ./configure --prefix=/usr/local CC=/usr/bin/mpicc FC=/usr/bin/gfortran LDFLAGS=-L/usr/local/lib CFLAGS=-I/usr/local/include && \
    make && make install && \
    cd .. && \
    rm -rf netcdf-fortran-4.4.4 netcdf-fortran-4.4.4.tar.gz

## Compile and install MPI
#RUN cd /tmp/openmpi-v4.0/build \
#    && ../configure \
#    && make -j 10 all \
#    && make install

ENV HOME=/home/iamsulproc
RUN export uid=10000 gid=1000 \
    && mkdir -p ${HOME} \
    && echo "iamsulproc:x:${uid}:${gid}:iamsulproc,,:${HOME}:/bin/bash" >> /etc/passwd \
    && echo "iamsulproc:x:${uid}:" >> /etc/group \
    && chown ${uid}:${gid} -R ${HOME} \
    && echo "iamsulproc ALL=(ALL) NOPASSWD: ALL" > /etc/sudoers.d/iamsulproc \
    && chmod 0440 /etc/sudoers.d/iamsulproc

#RUN apt-get update
RUN apt-get install -y --no-install-recommends \
    python-crypto \
    python-dateutil \
    python-dev \
    python-lxml \
    python-numpy \
    python-openssl \
    python-pip \
    python-psycopg2 \
    python-scipy \
    python-urllib3 \
    python-colorama \
    python-distlib \
    python-html5lib \
    python-pkg-resources \
    python-requests \
    python-scipy \
    python-setuptools \
    python-six \
    python-wheel \
    python-pip-whl \
    swig 

COPY  emep-ctm-rv4_34/ ${HOME}/emep34/

RUN cd ${HOME}/emep34 \
    && ln -s /usr/lib/openmpi/lib/mpi.mod mpi.mod \
    && ln -s /usr/lib/openmpi/include/* ./ \
    && make

USER iamsulproc
CMD /bin/bash

avaldebe commented 3 years ago

Hi @lucamarletta

I'm having problems building your image. Building netcdf-fortran with

RUN tar xzvf netcdf-fortran-4.4.4.tar.gz && \
    cd netcdf-fortran-4.4.4 && \
    ./configure --prefix=/usr/local CC=/usr/bin/mpicc FC=/usr/bin/gfortran LDFLAGS=-L/usr/local/lib CFLAGS=-I/usr/local/include && \
    make && make install && \
    cd .. && \
    rm -rf netcdf-fortran-4.4.4 netcdf-fortran-4.4.4.tar.gz

gives me the following error

checking size of off_t... configure: error: in `/netcdf-fortran-4.4.4':
configure: error: cannot compute sizeof (off_t)
See `config.log' for more details
The command '/bin/sh -c tar xzvf netcdf-fortran-4.4.4.tar.gz &&     cd netcdf-fortran-4.4.4 &&     ./configure --prefix=/usr/local CC=/usr/bin/mpicc FC=/usr/bin/gfortran LDFLAGS=-L/usr/local/lib CFLAGS=-I/usr/local/include &&     make && make install &&     cd .. &&     rm -rf netcdf-fortran-4.4.4 netcdf-fortran-4.4.4.tar.gz' returned a non-zero code: 77
The terminal process "/bin/bash '-c', 'docker build --pull --rm -f "Dockerfile" -t emepdocker:latest "."'" terminated with exit code: 77.

lucamarletta commented 3 years ago

Here is a link: https://www.unidata.ucar.edu/support/help/MailArchives/netcdf/msg11941.html

But this is quite strange: if you used the same source code and the same Dockerfile, how could it happen?

Do you want me to send you the tar file mentioned above?

avaldebe commented 3 years ago

I got the source code from

https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.10/hdf5-1.10.3/src/hdf5-1.10.3.tar.bz2
ftp://ftp.unidata.ucar.edu/pub/netcdf/netcdf-c-4.6.2.tar.gz
ftp://ftp.unidata.ucar.edu/pub/netcdf/netcdf-fortran-4.4.4.tar.gz

The Dockerfile is about the same: I updated the LABEL and removed the Python packages at the end of the Dockerfile.

lucamarletta commented 3 years ago

Sorry, but I have no idea for now. I just checked, and in my case it compiles and is in the Docker layer cache. I'll try again after removing the cache, but I expect the result will be the same...

lucamarletta commented 3 years ago

I repeated the build and had no problems.

Have you checked the file permissions? Try untarring the file into the build directory before the build and changing the COPY line to COPY netcdf-fortran-4.4.4/ netcdf-fortran-4.4.4/. It's just an idea.

avaldebe commented 3 years ago

Looks like I found a few issues in the Dockerfile.

You have 2 copies of the netcdf libraries: one from the package repository (libnetcdf-dev) and one compiled from netcdf-c-4.6.2.tar.gz.

I removed the apt-get install libnetcdf-dev and installed netcdf-c-4.6.2.tar.gz to /usr/local in order to avoid further confusion.
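One way to spot such duplicate installations is to list the netcdf shared libraries under both prefixes. The following is a rough sketch, assuming the libraries live under /usr/lib (including Debian's multiarch subdirectory) and /usr/local/lib; the helper name `list_libs` is hypothetical.

```shell
#!/bin/sh
# List library files matching a pattern in the given directories.
# list_libs is a hypothetical helper name used only in this sketch.
list_libs() {
    pattern="$1"
    shift
    for dir in "$@"; do
        # Skip prefixes that do not exist in this image.
        [ -d "$dir" ] || continue
        # Depth 2 also covers multiarch paths such as /usr/lib/x86_64-linux-gnu.
        find "$dir" -maxdepth 2 -name "$pattern" 2>/dev/null
    done
}

# Hits under both prefixes suggest two netcdf installations side by side.
list_libs 'libnetcdf*.so*' /usr/lib /usr/local/lib
```

Hits under more than one prefix are a hint that the compiler, the linker and the runtime loader may not all be picking the same copy.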

Now, the relevant section of the dockerfile is

## Build netcdf
COPY netcdf-c-4.6.2.tar.gz netcdf-c-4.6.2.tar.gz
RUN tar xzvf netcdf-c-4.6.2.tar.gz && \
    cd netcdf-c-4.6.2 && \
    ./configure --prefix=/usr/local CC=mpicc LDFLAGS=-L/usr/local/lib CFLAGS=-I/usr/local/include && \
    make -j4 && make install && \
    cd .. && \
    rm -rf netcdf-c-4.6.2 netcdf-c-4.6.2.tar.gz

COPY netcdf-fortran-4.4.4.tar.gz netcdf-fortran-4.4.4.tar.gz
RUN tar xzvf netcdf-fortran-4.4.4.tar.gz && \
    cd netcdf-fortran-4.4.4 && \
    ./configure --prefix=/usr/local CC=mpicc FC=gfortran LDFLAGS=-L/usr/local/lib CFLAGS=-I/usr/local/include LD_LIBRARY_PATH=/usr/local/lib && \
    make -j4 && make install && \
    cd .. && \
    rm -rf netcdf-fortran-4.4.4 netcdf-fortran-4.4.4.tar.gz

Note the LD_LIBRARY_PATH=/usr/local/lib on the netcdf-fortran configuration. It is needed to find the hdf5 and netcdf-c libraries installed in the previous stages.

That takes care of the libraries. I'll work now on building the model
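On the LD_LIBRARY_PATH setting: configure's sizeof checks compile and then run a small test program, so a shared library that cannot be found at run time is a common cause of "cannot compute sizeof" messages (config.log remains the authority for any particular build). The following is a small sketch of setting the variable for a whole shell session instead of a single command; the helper name `prepend_ld_library_path` is made up for illustration.

```shell
#!/bin/sh
# Prepend a directory to LD_LIBRARY_PATH unless it is already listed.
# prepend_ld_library_path is a hypothetical helper name used only in this sketch.
prepend_ld_library_path() {
    dir="$1"
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*) ;;  # already present, nothing to do
        *) LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# /usr/local/lib is the install prefix used in this thread.
prepend_ld_library_path /usr/local/lib
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```

In a Dockerfile, an `ENV LD_LIBRARY_PATH=/usr/local/lib` line placed before the build steps would have a similar effect for all later RUN instructions.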

avaldebe commented 3 years ago

I did not have problems finding any of the libraries, so building the model was relatively straightforward.

All I needed to do was comment out the ifort configuration in the Makefile, as follows

 # Intel ifort compiler (comment out if gfortran used)
-F90FLAGS = -shared-intel -r8 -recursive -O2
+#F90FLAGS = -shared-intel -r8 -recursive -O2

Please let me know if this solves your problem
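Inside a Dockerfile, the Makefile edit shown in the diff has to be scripted rather than done by hand. The following is a minimal sketch using sed, assuming the ifort line starts at column 0 exactly as in the diff; the helper name `comment_out_ifort_flags` is hypothetical.

```shell
#!/bin/sh
# Comment out the ifort F90FLAGS line in a Makefile, leaving other lines alone.
# comment_out_ifort_flags is a hypothetical helper name used only in this sketch.
comment_out_ifort_flags() {
    makefile="$1"
    tmp="$makefile.tmp.$$"
    # Prefix '#' only on an uncommented F90FLAGS line that carries -shared-intel.
    sed 's/^F90FLAGS *=.*-shared-intel/#&/' "$makefile" > "$tmp" && mv "$tmp" "$makefile"
}

# Usage, from the model source directory:
#   comment_out_ifort_flags Makefile && make
```

Because the pattern only matches a line that begins with `F90FLAGS`, running the helper a second time leaves the already commented line unchanged.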

lucamarletta commented 3 years ago

Alvaro, thanks a lot.

It works fine now. I don't know exactly which change made the difference, but all together they were successful.

I appreciate your effort and really quick support.

Luca

avaldebe commented 3 years ago

Glad to be of help