NOAA-EMC / NCEPLIBS-bufr

The NCEPLIBS-bufr library contains routines and utilites for working with the WMO BUFR format.
Other
44 stars 19 forks source link

set default location of master BUFR tables during CMake #123

Closed jbathegit closed 3 years ago

jbathegit commented 3 years ago

Currently, the location of the master BUFR tables is hardcoded to '/gpfs/dell1/nco/ops/nwprod/decoders/decod_shared/fix' in subroutine bfrini. This path is where these tables live on the WCOSS, and which is why anyone who wants to point to them elsewhere needs to use subroutine mtinfo to do so. But what if we instead replaced that string with a macro, e.g. '@MASTER_TABLE_DIR@' and then renamed bfrini.f as bfrini.f.in, then used configure_file during CMake to replace the macro with ${CMAKE_INSTALL_PREFIX}/tables? That way, whatever gets specified as CMAKE_INSTALL_PREFIX when running cmake during the build process would become the default master tables directory for any and all users of that build, and those users would only need to call subroutine mtinfo if they wanted to point somewhere else.

jbathegit commented 3 years ago

There may be other/better ways to do this, but the possible solution mentioned above is the best idea I could come up with.

aerorahul commented 3 years ago

A macro is an improvement over hard-wiring in the source code. Note however that CMAKE_INSTALL_PREFIX is only a valid location after make install. So, if a user builds but does not install, then the tables will not be present in the location prescribed by CMAKE_INSTALL_PREFIX.

This also does not solve the problem when the installed package is relocated. This is still hard-wiring a path into the library, although it is not as egregious as a totally arbitrary path.

A macro such as this also does not give the user a flexibility of providing a path to user defined location of the tables.

Having relative paths is a solution, but that is also not without flaws. Using environment variables will either require a clean interface to the library and/or code changes in bfrini and elsewhere.

edwardhartnett commented 3 years ago

Are these tables installed with NCEPLIBS-bufr? Or is each installer supposed to provide their own tables?

jbathegit commented 3 years ago

@edwardhartnett, yes these tables are installed with the package, in the tables subdirectory of CMAKE_INSTALL_PREFIX. They are supplied as part of the distribution and are not provided by each user.

@aerorahul, the user always has flexibility by being able to call subroutine mtinfo to override the default path and point to any alternative path they choose. But there needs to be some default path set in the library itself (e.g. in subroutine bfrini) in case mtinfo is never called. The issue I see with using an environment variable is that every user would then need to make sure to have that variable set in their environment when running their application. That's the reason for having a default path set in the software itself, and I don't see that as necessarily being a problem from a design standpoint, b/c at the software level it's the same as initializing any other variable to a default value.

All I'm trying to do here is to stop having this path always be initialized to the current default location on the WCOSS. I realize CMAKE_INSTALL_PREFIX doesn't get created until make install is run, but if the user fails to do the install, then it seems to me that all bets should be off at that point. Or am I missing something here?

edwardhartnett commented 3 years ago

OK, if the files are installed with the package, is there a reason you can't use a relative path?

Please do not use or even consider an environment variable. That's not the way we do things.

If we need to set a value of a directory in the code, we can do that in the cmake stage by creating a .F90.in file, and then using variable substitution in that file to produce the .F90 file we need.

jbathegit commented 3 years ago

OK, if the files are installed with the package, is there a reason you can't use a relative path?

That's exactly what I'm doing at the moment. In all of the test codes a call to subroutine mtinfo is issued with path ../tables and it works fine. But in the library itself, what I'd prefer is to have the default directory path in subroutine bfrini be something other than the default path on the WCOSS, so that non-WCOSS users don't always have to use subroutine mtinfo to specify the location of the tables on their own systems. @rmclaren had asked about this as well, so my idea was to set the default path to @CMAKE_INSTALL_PREFIX@/tables, so that way the default path is always set to the tables subdirectory of wherever the person who builds the package picks as their install location. Subsequent users would still be able to choose an alternate path via a call to subroutine mtinfo, but at least this way there's more of a chance they won't have to, unless it ends up being moved from wherever the person who built the package originally installed it.

Please do not use or even consider an environment variable. That's not the way we do things.

If we need to set a value of a directory in the code, we can do that in the cmake stage by creating a .F90.in file, and then using variable substitution in that file to produce the .F90 file we need.

That's exactly the original idea I had, to replace the hardcoded path with a macro, e.g. @MASTER_TABLE_DIR@/tables and then rename bfrini.f as bfrini.f.in, then use configure_file during CMake to replace the macro with ${CMAKE_INSTALL_PREFIX}, or even just use @CMAKE_INSTALL_PREFIX@/tables directly . But @aerorahul didn't seem happy with that approach, re: our earlier discussion in #122, so I opened this up as a separate issue.

aerorahul commented 3 years ago

relative paths means one will have to link the tables into the directory relative to the relative path "hard-wired" in the code in the application. How is that different than hard-wiring absolute paths.

Using a macro means the code is no longer relocatable. How is that a good solution?

aerorahul commented 3 years ago

OK, if the files are installed with the package, is there a reason you can't use a relative path?

Please do not use or even consider an environment variable. That's not the way we do things.

If we need to set a value of a directory in the code, we can do that in the cmake stage by creating a .F90.in file, and then using variable substitution in that file to produce the .F90 file we need.

@edwardhartnett can you give an example where "we" do things with environment variables in our code in any other system?

edwardhartnett commented 3 years ago

@aerorahul I'm not sure what your question is.

We should not be using environment vars, other than the standard ones (CPPFLAGS, FCFLAGS, etc.) We don't want to build systems that depend on environment vars being set, like ESMFMKFILE, or NETCDF. It's not necessary and it's an extra burden on the end user for both documentation (i.e. they have to read it, which they don't usually), and implementation (i.e. they have to set it, which they should not have to).

aerorahul commented 3 years ago

The env. variable will not be used in the build system. Please give an example where an env. variable is used in the compiled code in any of our system.
ESMFMKFILE is not "our" code and it is not used at runtime. Neither is NETCDF environment variable.

edwardhartnett commented 3 years ago

I do not know of any such example. I just don't want to introduce any.

aerorahul commented 3 years ago

you don't want to introduce any because ...

jbathegit commented 3 years ago

So..., what's the endgame here? ;-) I'd like to push up a change to replace the WCOSS path in subroutine bfrini with a macro, as discussed earlier. There will still be a default path in this subroutine, but at least it won't be the WCOSS path but instead will be whatever is defined as ${CMAKE_INSTALL_PREFIX} by the person who builds the package. That way, it should greatly reduce the number of end users who need to call subroutine mtinfo.

So in my mind this would be an improvement over the current functionality, and I still think it's a better approach than the user always having to remember to set an environment variable whenever they run the software. The user can still use subroutine mtinfo if the tables end up getting moved from their original install location.

edwardhartnett commented 3 years ago

I think the proposed change is fine.

jbathegit commented 3 years ago

So in looking at this some more, I think the approach I'd like to take is to add a new optional variable MASTER_TABLE_DIR that can be set using -D when running the CMake command. This would allow the builder to install the tables somewhere other than CMAKE_INSTALL_PREFIX/tables, but I would still have it default to that value if cmake is run without -DMASTER_TABLE_DIR=<somepath>. So this would be a bit more flexible by allowing the builder to easily install the tables somewhere different from the compiled libraries and utilities, but it would still result in the exact same behavior as before if the new optional cmake variable is never specified. I would of course document this option in the README file as well.

However, and no matter whether I use MASTER_TABLE_DIR or CMAKE_INSTALL_PREFIX for this macro, this is still turning out to be a bit more complicated than I expected. The reason is because the macro is set within bfrini.f, which is a fixed-form Fortran routine, meaning a source code line can't go beyond column 72. But the value of the master table directory path (and therefore the value of the macro) can be up to 90 characters, and since I don't know the value of the macro ahead of time, I can't reliably break it across continuation lines in the bfrini.f source file.

One solution I tried that I thought would be fairly straightforward is to change the bfrini.f to bfrini.f90 so it's compiled as free-form Fortran and the 72-character limit goes away. That's simple enough, and I can also easily change all comments from 'C>' to '!>' as well as use an ampersand to continue to the next line (i.e. the free-form way) vs. use a character in column 6 of the continuation line (i.e. the fixed-form way). All good, except then it tries to compile the include file 'bufrlib.inc' as free-form as well, and I can't change that to be actual free-form without similarly changing all of the other 152 library routines which include that same file.

I could just stick the new macro in its own new .f90 subroutine and then call that subroutine from bfrini.f at run time to pass the value back to bfrini.f as a memory array. But in thinking about this further, I've actually wanted to get rid of the bufrlib.inc file for a while now, by moving the few remaining declarations in that file to modules for better overall encapsulation. Doing so would also resolve open issue #43 by making it moot, plus it would also resolve a problem I've been having trying to figure out how to get Doxygen to correctly interpret this bufrlib.inc file as Fortran (for some reason EXTENSION_MAPPING = inc=FortranFixed still doesn't do the trick!?)

Bottom line is that I could do this, but it would end up being a much bigger commit because I would have to make a simple modification to all 152 of the other routines which INCLUDE 'bufrlib.inc', so that they no longer include that file. Would that be acceptable to everyone, or should I just create a new .f90 subroutine to contain the new macro and then call that from bfrini.f at run time? I would especially like to hear if @jack-woollen has any thoughts on this? ;-)

jack-woollen commented 3 years ago

I'd get rid of the inc file.

jbathegit commented 3 years ago

A question for the CMake experts, including @kgerheiser as well:

As noted above, what I'd like to do is allow the builder to optionally specify -DMASTER_TABLE_DIR=<somepath> when calling cmake. If specified, then ${MASTER_TABLE_DIR} would have the value , but if unspecified it would default to a value of ${CMAKE_INSTALL_PREFIX}/tables which is the current practice.

I've tried the following in the main CMakeLists.txt:

# Set the directory where master tables will be installed.
if("${MASTER_TABLE_DIR}" STREQUAL "")
  set(MASTER_TABLE_DIR "${CMAKE_INSTALL_PREFIX}/tables")
endif()

and I've also tried:

# Set the directory where master tables will be installed.
if(NOT DEFINED MASTER_TABLE_DIR)
  set(MASTER_TABLE_DIR "${CMAKE_INSTALL_PREFIX}/tables")
endif()

but neither is doing the trick if the -DMASTER_TABLE_DIR=<somepath> option is unspecified when calling cmake. I would greatly appreciate any help to figure out what I'm missing or doing wrong here(?)

aerorahul commented 3 years ago

A default path variable will need to be defined and can be over-written by the user.

E.g.

set(MASTER_TABLE_DIR "${CMAKE_INSTALL_PREFIX/tables" CACHE STRING "Location of Master tables")

Add this line in the top-level CMakeLists.txt.

The user can then provide -DMASTER_TABLE_DIR=/path/to/install/tables as an argument to cmake.

jbathegit commented 3 years ago

Thanks Rahul - much appreciated!

jbathegit commented 3 years ago

Hey Rahul, as I was coding this up I almost forgot about the recent update you made to the cmake/PackageConfig.cmake.in file. As part of my forthcoming PR, would you like me to go ahead and change:

set(@PROJECT_NAME@_TABLES_DIR "${PACKAGE_PREFIX_DIR}/tables")

to:

set(@PROJECT_NAME@_TABLES_DIR "@MASTER_TABLE_DIR@")

to keep everything consistent? Or, if there's a reason not to do this (or a better way to do this?), please let me know.