dtcenter / MET

Model Evaluation Tools
https://dtcenter.org/community-code/model-evaluation-tools-met
Apache License 2.0

Bug: MET beta4 fails to build with Intel LLVM #2894

Closed HathewayWill closed 1 month ago

HathewayWill commented 1 month ago

Describe the Problem

MET v12.0.0 fails to build with the Intel LLVM 2024 compilers. There appear to be several bugs related to this, so I will summarize them.

  1. TIFF and SQLITE are compiled even when no arguments are given for them.
  2. A -j argument greater than 8 sometimes, but not always, causes the GSL build to fail. I have successfully installed it with -j 32, so this may be an issue with my system.
  3. MET v12.0.0 appears to configure, but the make command has several errors involving unknown arguments.

Expected Behavior

  1. If arguments are not given, the packages should not be installed.
  2. Any -j value up to 64 threads should work. (64 threads assumes an HPC-level installation.)
  3. MET v12.0.0 should install with the Intel LLVM compilers the same way the current version does.

Environment

Describe your runtime environment:

  1. Linux Desktop
  2. OS: Ubuntu 22.04.4
  3. Software version number(s):

ifx (IFX) 2024.1.0 20240308
Copyright (C) 1985-2024 Intel Corporation. All rights reserved.
Intel(R) oneAPI DPC++/C++ Compiler 2024.1.0 (2024.1.0.20240308)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/intel/oneapi/compiler/2024.1/bin/compiler
Configuration file: /opt/intel/oneapi/compiler/2024.1/bin/compiler/../icpx.cfg
Intel(R) oneAPI DPC++/C++ Compiler 2024.1.0 (2024.1.0.20240308)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/intel/oneapi/compiler/2024.1/bin/compiler
Configuration file: /opt/intel/oneapi/compiler/2024.1/bin/compiler/../icx.cfg
echo $PATH
/opt/intel/oneapi/vtune/2024.1/bin64:/opt/intel/oneapi/mpi/2021.12/bin:/opt/intel/oneapi/mkl/2024.1/bin/:/opt/intel/oneapi/intelpython/python3.9/bin:/opt/intel/oneapi/intelpython/python3.9/condabin:/opt/intel/oneapi/dpcpp-ct/2024.1/bin:/opt/intel/oneapi/dev-utilities/2024.1/bin:/opt/intel/oneapi/debugger/2024.1/opt/debugger/bin:/opt/intel/oneapi/compiler/2024.1/opt/oclfpga/bin:/opt/intel/oneapi/compiler/2024.1/bin:/opt/intel/oneapi/advisor/2024.1/bin64:/home/workhorse/WRF_Intel/miniconda3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin
echo $LD_LIBRARY_PATH
/opt/intel/oneapi/tbb/2021.12/env/../lib/intel64/gcc4.8:/opt/intel/oneapi/mpi/2021.12/opt/mpi/libfabric/lib:/opt/intel/oneapi/mpi/2021.12/lib:/opt/intel/oneapi/mkl/2024.1/lib:/opt/intel/oneapi/ippcp/2021.11/lib/:/opt/intel/oneapi/ipp/2021.11/lib:/opt/intel/oneapi/dpl/2022.5/lib:/opt/intel/oneapi/dnnl/2024.1/lib:/opt/intel/oneapi/debugger/2024.1/opt/debugger/lib:/opt/intel/oneapi/dal/2024.2/lib:/opt/intel/oneapi/compiler/2024.1/opt/oclfpga/host/linux64/lib:/opt/intel/oneapi/compiler/2024.1/opt/compiler/lib:/opt/intel/oneapi/compiler/2024.1/lib:/opt/intel/oneapi/ccl/2021.12/lib/

To Reproduce

See the attached log files and scripts.

METv12_beta4.zip

tar files: https://dtcenter.ucar.edu/dfiles/code/METplus/MET/installation/tar_files.met-base-v3.2.tgz

jprestop commented 1 month ago

@HathewayWill

Regarding:

TIFF and SQLITE compile without arguments given.

If TIFF is not defined in the environment file with TIFF_INCLUDE_DIR and TIFF_LIB_DIR, its compilation is enabled. If SQLITE is not defined in the environment file with SQLITE_INCLUDE_DIR and SQLITE_LIB_DIR, its compilation is enabled.

if arguments are not given the packages shouldn't install

This is the opposite of how the script behaves. If an argument is given (i.e., the location of a package is provided), the package is not installed. If an argument is not given (i.e., the location of a package is not provided), the package is installed.
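The rule described above can be sketched in shell as follows. This is a minimal illustration, not the actual logic of the MET installation script: the function name `needs_build` and the flags `COMPILE_TIFF` and `COMPILE_SQLITE` are hypothetical, and the sketch assumes a library is built unless both its include and library directories are set.

```shell
# Hypothetical helper: succeeds (exit status 0) when a library should be
# built from source, i.e. when its include dir or lib dir is not provided.
needs_build() {
    [ -z "$1" ] || [ -z "$2" ]
}

# TIFF: build only if the environment does not point at an existing install.
if needs_build "${TIFF_INCLUDE_DIR:-}" "${TIFF_LIB_DIR:-}"; then
    COMPILE_TIFF=1
else
    COMPILE_TIFF=0
fi

# SQLITE: same rule as TIFF.
if needs_build "${SQLITE_INCLUDE_DIR:-}" "${SQLITE_LIB_DIR:-}"; then
    COMPILE_SQLITE=1
else
    COMPILE_SQLITE=0
fi

echo "COMPILE_TIFF=${COMPILE_TIFF} COMPILE_SQLITE=${COMPILE_SQLITE}"
```

Under this assumption, exporting both TIFF_INCLUDE_DIR and TIFF_LIB_DIR before running the installation would skip the TIFF build, and likewise for SQLITE.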

-j argument greater than 8 causes failure of GSL sometimes but not always. I have successfully installed it with -j 32. May be my system bug

This is not under our control. We suggest running with "-j 5".

any -j command up to 64 threads should be able to install. (64 threads is assuming hpc level installation)

We do not support how the 3rd party libraries operate. We suggest running with "-j 5". I install on many HPCs with "-j 5". Using a value of greater than 5 has caused failures for 3rd party library installations on some HPCs.
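One way to apply the "-j 5" suggestion in a wrapper script is to cap the requested job count before passing it to make. The sketch below is only an illustration: the helper `cap_jobs` and the variable `MAKE_JOBS` are hypothetical names, and the default limit of 5 is taken from the suggestion above rather than from any setting in MET itself.

```shell
# Hypothetical helper: print the requested job count, capped at a maximum.
# $1 = requested number of jobs, $2 = maximum allowed (defaults to 5).
cap_jobs() {
    requested="$1"
    max_jobs="${2:-5}"
    if [ "$requested" -gt "$max_jobs" ]; then
        echo "$max_jobs"
    else
        echo "$requested"
    fi
}

# Example: a request for 32 jobs is reduced to the conservative default.
MAKE_JOBS=$(cap_jobs 32)
echo "make -j ${MAKE_JOBS}"
```

A second argument raises the limit on systems where larger values are known to work, for example `cap_jobs 32 16`.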

MET v12.0.0 appears to configure but the make command has several errors involving unknown arguments.

In this case, setting MET_PYTHON_CC to "$(python3-config --cflags --embed)" gives too much information. Please set MET_PYTHON_CC to simply be "-I ${MET_PYTHON}/include/python${PYTHON_VERSION_COMBINED}" and try recompiling. This will get rid of many arguments and will likely resolve the errors involving the unknown arguments.
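As a concrete illustration of the suggested setting, the snippet below builds MET_PYTHON_CC from the two variables named above. The values assigned to MET_PYTHON and PYTHON_VERSION_COMBINED are placeholders chosen for this example (the version is assumed to take the form major.minor); substitute the values for your own Python installation.

```shell
# Placeholder values for illustration only; adjust to your installation.
MET_PYTHON="/opt/python"          # hypothetical Python install prefix
PYTHON_VERSION_COMBINED="3.9"     # assumed form: <major>.<minor>

# Pass only the Python header directory to the compiler instead of the
# full flag list printed by `python3-config --cflags --embed`.
export MET_PYTHON_CC="-I ${MET_PYTHON}/include/python${PYTHON_VERSION_COMBINED}"

echo "MET_PYTHON_CC=${MET_PYTHON_CC}"
```

Keeping the value to a single include flag means the compiler is not handed any of the extra options from python3-config, which is where the unknown arguments came from.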

MET v12.0.0 should install like current version with intel llvm compilers

METv11.1.0 does not officially support the Intel oneAPI LLVM-based compiler. Support was added in MET-12.0.0-beta2 with https://github.com/dtcenter/MET/issues/2611.

HathewayWill commented 1 month ago

If TIFF is not defined in the environment file with TIFF_INCLUDE_DIR and TIFF_LIB_DIR, its compilation is enabled. If SQLITE is not defined in the environment file with SQLITE_INCLUDE_DIR and SQLITE_LIB_DIR, its compilation is enabled.

I didn't realize they were new and were automatically compiled.

This is not under our control. We suggest running with "-j 5".

Will test with different -j numbers and provide feedback.

"-I ${MET_PYTHON}/include/python${PYTHON_VERSION_COMBINED}"

Will test this and see what happens.

METv11.1.0 does not officially support the Intel oneAPI llvm based compiler. Support was added in MET-12.0.0-beta2 with https://github.com/dtcenter/MET/issues/2611.

I was mistaken.

HathewayWill commented 1 month ago

@jprestop

With the change to MET_PYTHON_CC, the Intel LLVM compilers successfully installed MET and METplus.

05/20 01:24:21.803Z metplus.dac95d63 INFO: METplus has successfully finished running as user workhorse(1000).

Will test with different -j numbers and provide feedback.

I was able to get MET and METplus to install with up to -j 16; anything above that fails in GSL.

config.log configure.log met.make.log met.make_test.log compile_MET_all.log

jprestop commented 1 month ago

Great, @HathewayWill! Thanks for letting us know that you got a successful compilation and were able to run with -j equal to 16. I will go ahead and close this issue.