Open HathewayWill opened 10 months ago
ld: warning: directory not found for option '-L/Users/workhorse/WRF/MET-11.1.0/external_libs/lib/lib'
ld: warning: directory not found for option '-L=/Users/workhorse/WRF/MET-11.1.0/external_libs/lib:/Users/workhorse/WRF/MET-11.1.0/external_libs/lib'
These are lines 134 and 135 of the netcdf-cxx install log.
Hi @HathewayWill. Could you please try changing the following line in compile_MET_all.sh from:
configure_lib_args="-lhdf5_hl -lhdf5 -lz"
to
configure_lib_args="-lnetcdf -lhdf5_hl -lhdf5 -lz"
and see if you get a successful compilation? Please let us know how it goes. Thanks!
Sadly, that didn't work. Here are the log files again.
@jprestop
Found a solution for netcdf-cxx, but I don't know why it works.
Needed configure_lib_args="-lnetcdf -lm -lhdf5_hl -lhdf5 -lz"
but now we have a different error:
met.configure.log met.make.log met.make_install.log met.make_test.log
very confused @jprestop
Hi @HathewayWill.
I see in the met.make_test.log file:
*** Running Wavelet-Stat on APCP using a GRIB forecast and netCDF observation ***
../src/tools/core/wavelet_stat/wavelet_stat \
../data/sample_fcst/2005080700/wrfprs_ruc13_12.tm00_G212 \
../out/pcp_combine/sample_obs_2005080712V_12A.nc \
config/WaveletStatConfig_APCP_12 \
-outdir ../out/wavelet_stat -v 2
DEBUG 1: Start grid_stat by workhorse(501) at 2024-01-15 18:15:58Z cmd: ../src/tools/core/grid_stat/grid_stat ../out/pcp_combine/sample_fcs\
t_12L_2005080712V_12A.nc ../out/pcp_combine/sample_obs_2005080712V_12A.nc config/GridStatConfig_APCP_12 -outdir ../out/grid_stat -v 2
DEBUG 2: OMP_NUM_THREADS is not set. Defaulting to 1 thread. Recommend setting OMP_NUM_THREADS for faster runtimes.
DEBUG 2: OpenMP running on 1 thread(s).
DEBUG 1: Default Config File: /Users/workhorse/WRF/MET-11.1.0/share/met/config/GridStatConfig_default
DEBUG 1: User Config File: config/GridStatConfig_APCP_12
GSL_RNG_TYPE=mt19937
GSL_RNG_SEED=1
DEBUG 1: Forecast File: ../out/pcp_combine/sample_fcst_12L_2005080712V_12A.nc
DEBUG 1: Observation File: ../out/pcp_combine/sample_obs_2005080712V_12A.nc
DEBUG 2: Processing masking regions.
terminate called after throwing an instance of 'netCDF::exceptions::NcNotNCF'
what(): NetCDF: Unknown file format
file: ncFile.cpp line:88
FATAL: Received Signal Abort. Exiting 6
make[1]: *** [grid_stat] Error 6
make[1]: *** Waiting for unfinished jobs....
Let's check on your NetCDF installations. Can you please tell me if all of the following files exist in your /Users/workhorse/WRF/MET-11.1.0/external_libs/include and /Users/workhorse/WRF/MET-11.1.0/external_libs/lib directories?
Files for NetCDF4 C:
$MET_NETCDF/include/netcdf.h
$MET_NETCDF/lib/libnetcdf.a
$MET_NETCDF/lib/libnetcdf.so
Files for NetCDF4 C++:
$MET_NETCDF/include/netcdf
$MET_NETCDF/lib/libnetcdf_c++4.a
$MET_NETCDF/lib/libnetcdf_c++4.so
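A quick way to answer this from a terminal is a small loop over the expected paths. This is only a minimal sketch: check_netcdf_files is a hypothetical helper name, and its argument is assumed to be the same prefix that $MET_NETCDF points to. On macOS, shared libraries normally carry a .dylib suffix rather than .so, so the sketch accepts either.

```shell
# Hypothetical helper (not part of compile_MET_all.sh).
# $1: install prefix, assumed to be the directory $MET_NETCDF points to.
check_netcdf_files() {
    prefix="$1"
    missing=0
    # Headers and static libraries named in the list above.
    for f in include/netcdf.h include/netcdf lib/libnetcdf.a lib/libnetcdf_c++4.a; do
        if [ -e "$prefix/$f" ]; then
            echo "found:   $f"
        else
            echo "missing: $f"
            missing=$((missing + 1))
        fi
    done
    # Shared libraries: accept .so (Linux) or .dylib (macOS).
    for lib in libnetcdf libnetcdf_c++4; do
        if [ -e "$prefix/lib/$lib.so" ] || [ -e "$prefix/lib/$lib.dylib" ]; then
            echo "found:   lib/$lib (shared)"
        else
            echo "missing: lib/$lib (shared)"
            missing=$((missing + 1))
        fi
    done
    # Succeed only if nothing is missing.
    [ "$missing" -eq 0 ]
}
```

For the setup in this thread it could be called as check_netcdf_files /Users/workhorse/WRF/MET-11.1.0/external_libs.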
@jprestop
They appear to be in there.
For netcdf-c++ I had to add -lm and -lnetcdf.
@HathewayWill Ah yes, very confusing indeed. I think @georgemccabe figured out the problem. The compile_MET_all.sh script was running "make test" using MAKE_ARGS. Since some tests rely on the output of other tests to succeed, running "make test" in parallel won't work and explains the confusing information in the log file where it says it is running wavelet_stat, but then the log information refers to grid_stat. I have modified the compile_MET_all.sh script and have added "-lnetcdf -lm" to configure_lib_args for the compilation of NetCDF-CXX. Please download the new script and try again.
@jprestop @georgemccabe
So does make test need to have the make args removed for each one if running in parallel processing?
I'm running a WRF run right now but give me tonight and I'll test it later.
Sounds like that was the error. Everyone using dtc-mosit and WRF-mosit will be happy once it's fixed.
Hi @HathewayWill.
So does make test need to have the make args removed for each one if running in parallel processing?
I don't think I understand your question. I'm not sure what you mean by "each one".
To help clarify, we changed:
run_cmd "make ${MAKE_ARGS} test > $(pwd)/met.make_test.log 2>&1"
to
run_cmd "make test > $(pwd)/met.make_test.log 2>&1"
Maybe you mean: does ${MAKE_ARGS} need to be removed in calls to the external libraries' "make test" commands? If so, the answer, unfortunately, is that I don't know whether the external libraries' "make test" commands rely on the output of other tests to succeed. All I can say is that I haven't experienced this problem previously in installations on various machines, so I think that, until we encounter a problem, it is likely OK to leave as-is.
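The ordering problem behind this change can be illustrated with a toy example. This is only an analogy built from plain shell background jobs, not MET's actual Makefile: producer and consumer are hypothetical stand-ins for a test that writes a file (as pcp_combine does) and a test that reads it (as grid_stat does).

```shell
# Toy stand-ins for two dependent tests (hypothetical, not MET's real tests).
workdir=$(mktemp -d)

producer() {            # plays the role of pcp_combine: writes an output file
    sleep 1
    echo "data" > "$workdir/sample.nc"
}

consumer() {            # plays the role of grid_stat: needs the producer's file
    cat "$workdir/sample.nc" > /dev/null 2>&1
}

run_serial() {          # like a serial "make test": one test after another
    rm -f "$workdir/sample.nc"
    producer
    consumer
}

run_parallel() {        # like a parallel "make test": both start at once
    rm -f "$workdir/sample.nc"
    producer &
    producer_pid=$!
    consumer            # runs before the file exists, so it fails
    rc=$?
    wait "$producer_pid"
    return "$rc"
}
```

Here run_serial succeeds, while run_parallel fails because the consumer starts before its input exists. In a real parallel run the reader could also pick up a missing or half-written file, which may explain an "Unknown file format" error on an otherwise valid file.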
@jprestop
That was what I was getting at.
You answered my question about the removal.
Sorry for the confusion
@jprestop
Testing it now.
I was reading the new compile_MET_all.sh script and noticed that the make install step for MET doesn't use MAKE_ARGS. Can MET not be installed in parallel?
run_cmd "make install > met.make_install.log 2>&1"
@jprestop
Tested it and it got worse.
Before, it failed at the test step; now it fails at met.make.
Here are the relevant log files: met.make.log configure.log
@jprestop @georgemccabe untitled folder.zip
different error now.
I'm wondering if these files are corrupted:
DEBUG 1: Forecast File: ../out/pcp_combine/sample_fcst_12L_2005080712V_12A.nc
DEBUG 1: Observation File: ../out/pcp_combine/sample_obs_2005080712V_12A.nc
Could you please send them to us following the directions here?
@jprestop I'm having issues with Ubuntu and the FTP protocol.
You could also try attaching the files here, @HathewayWill.
@jprestop @georgemccabe
Not sure if you can get this; otherwise, email me directly.
Hi @HathewayWill. Well, the NetCDF files do not seem to be corrupted. I copied them over and ran "ncdump" on them, and that worked fine. I also copied them to our project machine and ran the command that is causing you problems:
/nrit/ral/met-11.1.0/bin/grid_stat sample_fcst_12L_2005080712V_12A.nc sample_obs_2005080712V_12A.nc GridStatConfig_APCP_12 -outdir ./out/grid_stat -v 2
but I got a successful run. I did not have the problem you are experiencing:
*** Running Grid-Stat on APCP using netCDF input for both forecast and observation ***
../src/tools/core/grid_stat/grid_stat \
../out/pcp_combine/sample_fcst_12L_2005080712V_12A.nc \
../out/pcp_combine/sample_obs_2005080712V_12A.nc \
config/GridStatConfig_APCP_12 \
-outdir ../out/grid_stat -v 2
DEBUG 1: Start grid_stat by workhorse(501) at 2024-01-19 01:39:36Z cmd: ../src/tools/core/grid_stat/grid_stat ../out/pcp_combine/sample_fcst_12L_2005080\
712V_12A.nc ../out/pcp_combine/sample_obs_2005080712V_12A.nc config/GridStatConfig_APCP_12 -outdir ../out/grid_stat -v 2
DEBUG 2: OMP_NUM_THREADS is not set. Defaulting to 1 thread. Recommend setting OMP_NUM_THREADS for faster runtimes.
DEBUG 2: OpenMP running on 1 thread(s).
DEBUG 1: Default Config File: /Users/workhorse/WRF/MET-11.1.0/share/met/config/GridStatConfig_default
DEBUG 1: User Config File: config/GridStatConfig_APCP_12
GSL_RNG_TYPE=mt19937
GSL_RNG_SEED=1
DEBUG 1: Forecast File: ../out/pcp_combine/sample_fcst_12L_2005080712V_12A.nc
DEBUG 1: Observation File: ../out/pcp_combine/sample_obs_2005080712V_12A.nc
DEBUG 2: Processing masking regions.
terminate called after throwing an instance of 'netCDF::exceptions::NcNotNCF'
what(): NetCDF: Unknown file format
file: ncFile.cpp line:88
FATAL: Received Signal Abort. Exiting 6
make[1]: *** [grid_stat] Error 6
make: *** [test] Error 2
Let's have you try running the same command outside of "make test". In the directory /Users/workhorse/WRF/MET-11.1.0/MET-11.1.0/scripts, could you please run the following:
export TEST_OUT_DIR=/Users/workhorse/WRF/MET-11.1.0/MET-11.1.0
/Users/workhorse/WRF/MET-11.1.0/bin/grid_stat \
../out/pcp_combine/sample_fcst_12L_2005080712V_12A.nc \
../out/pcp_combine/sample_obs_2005080712V_12A.nc \
config/GridStatConfig_APCP_12 \
-outdir ../out/grid_stat -v 2
Please give that a try and post the output here. Please let me know if you have any questions.
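Since the error is "NetCDF: Unknown file format", it may also help to look at the first bytes of each input file. The sketch below assumes the usual signatures: classic netCDF files begin with the ASCII bytes "CDF", and netCDF-4 files are HDF5 containers that begin with the byte 0x89 followed by "HDF". nc_signature is a hypothetical helper name, and it only inspects offset 0.

```shell
# Hypothetical helper: classify a file by its first four bytes.
nc_signature() {
    # Dump the first 4 bytes as hex with all whitespace removed, e.g. "43444601".
    sig=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    case "$sig" in
        434446*)  echo "classic" ;;   # "CDF" + version byte
        89484446) echo "hdf5" ;;      # 0x89 "HDF": netCDF-4 / HDF5
        *)        echo "unknown" ;;
    esac
}
```

For example, nc_signature ../out/pcp_combine/sample_obs_2005080712V_12A.nc. If a file reports "hdf5" while the netCDF-C library in use was built without netCDF-4 support, that could produce this error; nc-config --has-nc4 should report whether that support is present.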
@jprestop
I'm going to rebuild the Mac and test it again. I can't even reproduce the error on my own machine; it now stops before the previous error.
Do you have a Mac available there?
Hi @HathewayWill. We have two developers who have successfully installed MET-11.1 on their Macs. One was using 13.6.2 (Ventura) and the other was using 12.6.2 (Monterey). Both compiled using the GNU compilers.
Morning @jprestop
Okay, that's good to know. Let me rebuild my Mac and double-check on my side. I wonder if it's the shell script.
Are they using Homebrew GNU compilers or something else?
@HathewayWill. They both used GNU compilers that were installed via MacPorts.
@jprestop
That might be the answer: I'm using Homebrew.
Can you ask them which version of the GNU compilers from MacPorts they are using?
I would think that Homebrew and MacPorts would be similar, but it could be an issue. One of the developers was using MacPorts 12.3.0 and also used the compile_MET_all.sh script successfully.
@jprestop @georgemccabe
So, for grins, I tried to compile MET v11.0.0 using the same installation structure as I did with v11.1.0.
v11.0.0 didn't install, so I am going to check my structure.
Got it to work on macOS Sonoma, but not 100% on macOS Ventura. On Ventura the MET tests have errors, but METplus still runs successfully.
The fixes you implemented worked on Sonoma, but I have attached the logs for Ventura. MET_logs.zip
And now Sonoma doesn't work. This is very confusing to me.
@jprestop @georgemccabe @JohnHalleyGotway
With your permission, I'm going to close this issue and open two new ones, one for each macOS version. I think there are two different issues going on, one per OS, and I want to keep them separate.
I will reference this issue in the new ones, if that is okay with you.
Hi @HathewayWill.
This situation is certainly very strange, particularly considering our developers have had successful compilations on various macOS versions. This could be something in your environment, although it's not clear yet.
Even though MET's configure ran successfully for you, I do see the following error:
ld: Undefined symbols:
_H5Pset_all_coll_metadata_ops, referenced from:
_main in ccnwjJ9B.o
collect2: error: ld returned 1 exit status
configure:18015: $? = 1
The other developers did not receive that error. I have their config.log files and would like to step through them to see the differences, but I haven't had a chance to look into the above error or to compare the log files yet.
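While comparing logs, one thing that could be checked locally is whether the installed HDF5 library exports the symbol named in that linker message. The sketch below uses nm for this. has_symbol is a hypothetical helper name and the libhdf5.a path in the usage note is an assumption. As far as I know, H5Pset_all_coll_metadata_ops belongs to HDF5's parallel (MPI) API, so its absence from a serial build may simply mean a configure feature probe failed harmlessly.

```shell
# Hypothetical helper: does library $1 export a symbol matching $2?
# Exit status: 0 = found, 1 = not found, 2 = library not readable.
has_symbol() {
    lib="$1"
    sym="$2"
    if [ ! -r "$lib" ]; then
        echo "cannot read $lib" >&2
        return 2
    fi
    # nm -g lists external symbols; macOS prefixes C symbols with "_",
    # so match on a substring rather than the whole name.
    nm -g "$lib" 2>/dev/null | grep -q "$sym"
}
```

For example: has_symbol /Users/workhorse/WRF/MET-11.1.0/external_libs/lib/libhdf5.a H5Pset_all_coll_metadata_ops && echo "symbol present".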
@jprestop
I will retest and see what I can find and attach log files here.
Describe the Problem
MET 11.1.0 fails to build NETCDF-CXX
Expected Behavior
MET would compile like 11.0.0
Environment
Describe your runtime environment:
1. Machine: Virtual Machine
2. OS: MacOS 13
3. Software version number(s): 13.4 beta
To Reproduce
See attached zip file with logs and compile.sh script
MET_FAIL_MACOS.zip