wrf-model / WRF

The official repository for the Weather Research and Forecasting (WRF) model

WRFDA PLUS V4.6.0 fails with intel llvm compilers. #2048

Closed: HathewayWill closed this issue 3 weeks ago

HathewayWill commented 1 month ago

@islas @mgduda

Describe the bug

During the installation of WRFPLUS v4.6.0, memory usage becomes massive. This happens only with the Intel LLVM compilers, not with GNU.

I have attached relevant log files.

configure.log configure.wrf.txt

wrfplus1.compile.log

To Reproduce

    cd "${WRF_FOLDER}"/Downloads
    mkdir "${WRF_FOLDER}"/WRFPLUS
    tar -xvzf WRF-${WRF_VERSION}.tar.gz -C "${WRF_FOLDER}"/WRFPLUS

    # If statement for changing folder name
    if [ -d "${WRF_FOLDER}/WRFPLUS/WRF" ]; then
        mv -f "${WRF_FOLDER}"/WRFPLUS/WRF "${WRF_FOLDER}"/WRFPLUS/WRFV${WRF_VERSION}
    fi

    cd "${WRF_FOLDER}"/WRFPLUS/WRFV${WRF_VERSION}
    mv * "${WRF_FOLDER}"/WRFPLUS
    cd "${WRF_FOLDER}"/WRFPLUS
    rm -rf WRFV${WRF_VERSION}/
    export NETCDF=$DIR/NETCDF
    export HDF5=$DIR/grib2
    export LD_LIBRARY_PATH=$DIR/grib2/lib:$LD_LIBRARY_PATH
    ./clean -a

    if [ ${auto_config} -eq 1 ]; then
        echo 40 | ./configure wrfplus 2>&1 | tee configure.log # Option 40 for Intel and distributed memory
    else
        ./configure wrfplus 2>&1 | tee configure.log # Option 40 for Intel and distributed memory
    fi
    echo " "

    sed -i '136s|mpif90 -f90=$(SFC)|mpiifx|g' "${WRF_FOLDER}"/WRFPLUS/configure.wrf
    sed -i '137s|mpicc -cc=$(SCC)|mpiicx|g' "${WRF_FOLDER}"/WRFPLUS/configure.wrf

    ./compile -j $CPU_HALF_EVEN wrfplus 2>&1 | tee wrfplus1.compile.log

Expected behavior

No memory leak.

Screenshots

[Two screenshots attached: 20240514_062846, 20240514_062854]

islas commented 1 month ago

I'd caution against labeling this a "memory leak", as that term carries technical implications about the underlying root cause. It is, however, a memory usage issue with large files when compiling with the new oneAPI compilers.

Consider compiling with a very limited number of threads (some files require upwards of 25 GB of memory individually) until a future fix is implemented.
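One way to act on this advice is to cap the job count passed to `./compile -j` by the memory that is actually available. The following is only a sketch under stated assumptions: `safe_jobs` is a hypothetical helper that is not part of WRF, the 25 GB per-job budget is simply the figure quoted above, and reading `MemAvailable` from `/proc/meminfo` assumes a Linux host.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of WRF): choose a job count so that
# jobs * per_job_gb fits into avail_gb, never exceeding the CPU count
# and never dropping below 1.
# Usage: safe_jobs <avail_gb> <per_job_gb> <cpus>
safe_jobs() {
    local avail_gb=$1 per_job_gb=$2 cpus=$3
    local jobs=$(( avail_gb / per_job_gb ))
    if [ "$jobs" -gt "$cpus" ]; then
        jobs=$cpus
    fi
    if [ "$jobs" -lt 1 ]; then
        jobs=1
    fi
    echo "$jobs"
}

# Assumption: Linux, where /proc/meminfo reports MemAvailable in kB.
if [ -r /proc/meminfo ]; then
    avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
    avail_gb=$(( ${avail_kb:-0} / 1024 / 1024 ))
    # 25 is the per-file figure quoted in this thread, used as a rough budget.
    JOBS=$(safe_jobs "$avail_gb" 25 "$(nproc)")
    echo "Suggested command: ./compile -j $JOBS wrfplus"
fi
```

On a machine with less than 50 GB available this suggests a single job, which matches the recommendation to keep the thread count very low.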

HathewayWill commented 1 month ago

@islas

I will change the title of the issue to reflect that.

HathewayWill commented 1 month ago

I have tried with 4 threads and it still faults the machine. Should I attempt with 1 or 2 threads to help you with this issue, @islas?

islas commented 1 month ago

Definitely try just one thread. It will be much slower but it should compile at the very least.
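A single-job build can also be used to record how much memory the compile actually needs. This is a minimal sketch, assuming GNU `time` is installed at `/usr/bin/time`; `peak_rss_kb` and `build_serial` are hypothetical helper names, and `compile.time.log` is an illustrative file name. The `./compile` invocation and the `wrfplus1.compile.log` name follow the reproduction steps in this issue.

```shell
#!/usr/bin/env bash
# Read a GNU "time -v" report on stdin and print the peak resident set
# size in kB (the "Maximum resident set size (kbytes): N" line).
peak_rss_kb() {
    awk -F': ' '/Maximum resident set size/ {print $2}'
}

# Hypothetical wrapper: build with one job, keep the compile log as in the
# reproduction steps, and write the memory report to a separate file.
build_serial() {
    /usr/bin/time -v -o compile.time.log \
        ./compile -j 1 wrfplus 2>&1 | tee wrfplus1.compile.log
    peak_rss_kb < compile.time.log
}
```

Calling `build_serial` from the WRFPLUS directory would run the slow single-job build and then print the peak figure that GNU `time` recorded, which could be attached to a report like this one.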

HathewayWill commented 1 month ago

@islas

I tried one thread and it worked. I opened four posts on the WRF forum about this and my tests. I'll leave it to you to let me know if you need me to do more testing or create pull requests.

Here are the links:

https://forum.mmm.ucar.edu/threads/wrf-chem-w-kpp-wrf-chemda-v4-6-0-intel-llvm-compilers.17266/

https://forum.mmm.ucar.edu/threads/wrf-arw-v4-6-0-intel-llvm-compilers.17265/

https://forum.mmm.ucar.edu/threads/wrf-4dvar-v4-6-0-intel-llvm-compilers.17264/

https://forum.mmm.ucar.edu/threads/wrf-plus-4dvar-v4-6-0-intel-llvm-compilers.17263/

HathewayWill commented 3 weeks ago

Closing this issue and opening it again with new information.