edwardhartnett closed this issue 1 year ago
@jbathegit when do we think this issue will be addressed?
@edwardhartnett I honestly have no idea. I haven't had any time to work on NCEPLIBS-bufr updates in the past couple of months, and I'm not sure how soon I'll be able to get back to it. I'd certainly welcome any help if someone else wants to work on this in the meantime.
@edwardhartnett @aerorahul @rmclaren @kgerheiser @jack-woollen
I've spent a bit of time looking into this. I've come up with a prototype that will work, but there are a couple of issues as noted below. For example, let's say I have an existing Fortran-77 library routine named writsa which has the following call signature:
call writsa( lunxx, lmsgt, msgt, msgl, xval, yval)
and where all of the arguments are currently implicitly typed (so the first four are integers and the last two are reals), and the third argument (msgt) is an array of size lmsgt. The goal here is to develop a single interface that can automatically handle any of 3 cases that might be passed in by a user:
1) all integers 4 bytes and all reals 4 bytes (the existing _4 build)
2) all integers 8 bytes and all reals 8 bytes (the existing _8 build)
3) all integers 4 bytes and all reals 8 bytes (the existing _d build)
So I could pick, say, the second case as the new default (meaning that, going forward, the only compiled version of the library would be equivalent to the existing _8 build) and have something like the following:
module modi_writsa

  interface writsa
    module procedure writsa_4, writsa_8, writsa_d
  end interface

contains

  subroutine writsa_4( lunxx, lmsgt, msgt, msgl, xval, yval )
    implicit none
    integer(kind=4), intent(in) :: lunxx
    integer(kind=4), intent(in) :: lmsgt
    integer(kind=4), intent(out) :: msgt(:)
    integer(kind=4), intent(out) :: msgl
    real(kind=4), intent(in) :: xval
    real(kind=4), intent(out) :: yval

    integer my_lunxx, my_lmsgt, my_msgl, ierr
    integer, allocatable :: my_msgt(:)
    real :: my_xval, my_yval

    my_lunxx = lunxx
    my_lmsgt = lmsgt
    my_xval = xval
    allocate(my_msgt(lmsgt), stat=ierr)
    if (ierr .ne. 0) then
      print *, 'allocate error in writsa_4'
      return
    endif
    call writsa_f( my_lunxx, my_lmsgt, my_msgt, my_msgl, my_xval, my_yval )
    msgt(1:lmsgt) = my_msgt(1:lmsgt)
    msgl = my_msgl
    yval = my_yval
    return
  end subroutine

  subroutine writsa_8( lunxx, lmsgt, msgt, msgl, xval, yval )
    implicit none
    integer(kind=8), intent(in) :: lunxx
    integer(kind=8), intent(in) :: lmsgt
    integer(kind=8), intent(out) :: msgt(:)
    integer(kind=8), intent(out) :: msgl
    real(kind=8), intent(in) :: xval
    real(kind=8), intent(out) :: yval

    call writsa_f( lunxx, lmsgt, msgt, msgl, xval, yval )
    return
  end subroutine

  subroutine writsa_d( lunxx, lmsgt, msgt, msgl, xval, yval )
    implicit none
    integer(kind=4), intent(in) :: lunxx
    integer(kind=4), intent(in) :: lmsgt
    integer(kind=4), intent(out) :: msgt(:)
    integer(kind=4), intent(out) :: msgl
    real(kind=8), intent(in) :: xval
    real(kind=8), intent(out) :: yval

    integer my_lunxx, my_lmsgt, my_msgl, ierr
    integer, allocatable :: my_msgt(:)

    my_lunxx = lunxx
    my_lmsgt = lmsgt
    allocate(my_msgt(lmsgt), stat=ierr)
    if (ierr .ne. 0) then
      print *, 'allocate error in writsa_d'
      return
    endif
    call writsa_f( my_lunxx, my_lmsgt, my_msgt, my_msgl, xval, yval )
    msgt(1:lmsgt) = my_msgt(1:lmsgt)
    msgl = my_msgl
    return
  end subroutine

end module
I would then rename the existing subroutine writsa as writsa_f. That way users can still call the same name writsa as before, but now the call will automatically be re-directed through the proper interface procedure (either writsa_4, writsa_8, or writsa_d, depending on how they have the variables declared within their own application code). Each of these then acts as a conduit into the existing subroutine (now renamed as writsa_f), whose call signature is now clearly defined, with all of the arguments being 8-byte integers and 8-byte reals as the default build of the library. I've tested this out and it works;
HOWEVER...
the main issue I see here is one of efficiency; specifically, in the writsa_4 and writsa_d cases, I have to copy all of the 4-byte values input by the user into corresponding 8-byte local variables, then call the single writsa_f routine with those 8-byte variables, then copy the output back into the original 4-byte values. This is even more complicated by the fact that one of the variables (msgt) is an array, which means I need to dynamically allocate the correct amount of space at run time in order to do the transfers. This seems like a lot of extra overhead, especially if we have to dynamically allocate temporary memory every time the subroutine is called. Is there maybe a cleaner way to do this that I'm missing here?
A secondary issue is that this, or whatever approach we end up using, will need to be replicated across hundreds of existing subroutines and functions in order to end up with a single build of the library that handles all of the existing _4, _8, and _d cases, plus all of the accompanying documentation, etc. Again, unless I'm missing something here that would make this process a lot more straightforward, this will be a lengthy task.
I'd greatly appreciate any comments, thoughts, advice, suggestions, etc. - thanks!
As you note, there's no silver bullet when doing this in Fortran. Hopefully a native method will be included in the next Fortran standard.
But for now, another way would be to overload writsa_f directly, eliminating the need to copy data and then call the function. To do this directly in Fortran you would have to copy and paste all the code in three different places.
A method I've used before to reduce the code duplication is to use the pre-processor to #include the body of the function and the kinds:
#define INT_T int32
#define REAL_T real32
subroutine writsa_4( lunxx, lmsgt, msgt, msgl, xval, yval )
  use, intrinsic :: iso_fortran_env  ! provides the int32/real32/real64 kind constants
  implicit none
  integer(INT_T), intent(in) :: lunxx
  integer(INT_T), intent(in) :: lmsgt
  integer(INT_T), intent(out) :: msgt(:)
  integer(INT_T), intent(out) :: msgl
  real(REAL_T), intent(in) :: xval
  real(REAL_T), intent(out) :: yval
#include "writsa_body.f"
end subroutine
#undef INT_T
#undef REAL_T

#define INT_T int32
#define REAL_T real64
subroutine writsa_d( lunxx, lmsgt, msgt, msgl, xval, yval )
  use, intrinsic :: iso_fortran_env
  implicit none
  integer(INT_T), intent(in) :: lunxx
  integer(INT_T), intent(in) :: lmsgt
  integer(INT_T), intent(out) :: msgt(:)
  integer(INT_T), intent(out) :: msgl
  real(REAL_T), intent(in) :: xval
  real(REAL_T), intent(out) :: yval
#include "writsa_body.f"
end subroutine
#undef INT_T
#undef REAL_T
But then you split up the function body and interface definition and rely on the pre-processor. Or it could be automated with the build system.
With bufr you would have to start at the lowest level functions and work your way up.
As @kgerheiser notes, Fortran does not support templates. Another possible approach would be to write the main code in a language that supports templates e.g. C++ and provide Fortran interfaces.
I've never used it, but fypp (https://fypp.readthedocs.io/en/stable/fypp.html) seems to be a popular way to address these kinds of issues. I noticed that some JCSDA code uses it. It's a little ugly in my opinion, but anyways...
I'm not in favor of bringing in any new tools like fypp. In fact, I strongly oppose the idea.
There is always a strong urge to attempt to change Fortran in some way to overcome its many shortcomings. (For example, the UFS apparently has a home-rolled python-to-Fortran translator.) In my experience, none of these band-aids on Fortran bring long-term benefit to the organization.
For example, we would not be very delighted to still be using one of the many pre-processors I remember from Fortran 77 days, which gave F77 programmers new and wonderful innovations, now long since forgotten. Will programmers at EMC in 25 years want to deal with fypp, our home-rolled python translator, or any other hack we make to Fortran to try and make it easier to program?
Fortran is a long-term standard, but these other tools are local, temporary, and not really in it for the long haul. As the programmers that work on them retire or move on to new projects, these tools will be abandoned. But standard Fortran will march on, and NOAA will still want to run the millions of lines of existing Fortran code it has developed at great cost.
Future NOAA programmers, cursing our shortsightedness, will simply be removing these band-aids and returning to standard Fortran (or C/C++).
I would be in favor of moving the core of this library (in a gradual way) to C. I think C++ is not really needed, and is much harder for Fortran programmers to learn and consequently harder to staff.
(It should not be assumed that any Fortran programmer can instantly learn either C or C++. There is a steep learning curve, especially when it comes to pointers and object oriented programming, which can be real stumbling blocks. Training and help is required. Also the programmer must be willing to put in the effort.)
I also think this problem could be solved in pure Fortran by making the functions all accept double precision, and then having single-precision calls which cast/copy the data to double precision. Since no calculations are done by the library (i.e., no hairy physics equations being solved), having single vs. double precision is not going to affect performance.
@jbathegit do you know C or C++?
The way to solve this in C would be first to write a function using parameters and void pointers, which could take any of the data sizes needed, then write simple C wrappers for each actual case needed (i.e. one for single precision one for double), then write Fortran APIs to call those C wrappers. This way the void pointer is hidden from the Fortran layer, and it just sees C functions with nice, well-defined types. All of this is 100% standard C and Fortran.
Thanks everyone for your feedback and suggestions!
First of all, and to answer the question from @edwardhartnett, yes I'm proficient in C, but not so much in C++.
Secondly, a question for @rmclaren - in your example writsa_d, how does any actual space for the local variable my_msgt get allocated? I see where you're declaring it as an array, but I don't see where you've actually allocated it to be a certain size, so how can writsa_f safely write into that space without risking a segfault? Unless I'm missing something(?), this ends up being the same problem I had in my original writsa_d code (see above), where I needed to dynamically allocate my_msgt every time the function was called.
I was able to get around that by using the suggestion from @kgerheiser to incorporate an #include directive in a source file modi_writsa.F90:
module modi_writsa

  interface writsa
    module procedure writsa_4, writsa_8, writsa_d
  end interface

contains

  subroutine writsa_4( lunxx, lmsgt, msgt, msgl, xval, yval )
    integer(kind=4), intent(in) :: lunxx
    integer(kind=4), intent(in) :: lmsgt
    integer(kind=4), intent(out) :: msgt(:)
    integer(kind=4), intent(out) :: msgl
    real(kind=4), intent(in) :: xval
    real(kind=4), intent(out) :: yval
#include "writsa_body"
  end subroutine

  subroutine writsa_8( lunxx, lmsgt, msgt, msgl, xval, yval )
    integer(kind=8), intent(in) :: lunxx
    integer(kind=8), intent(in) :: lmsgt
    integer(kind=8), intent(out) :: msgt(:)
    integer(kind=8), intent(out) :: msgl
    real(kind=8), intent(in) :: xval
    real(kind=8), intent(out) :: yval
#include "writsa_body"
  end subroutine

  subroutine writsa_d( lunxx, lmsgt, msgt, msgl, xval, yval )
    integer(kind=4), intent(in) :: lunxx
    integer(kind=4), intent(in) :: lmsgt
    integer(kind=4), intent(out) :: msgt(:)
    integer(kind=4), intent(out) :: msgl
    real(kind=8), intent(in) :: xval
    real(kind=8), intent(out) :: yval
#include "writsa_body"
  end subroutine

end module
But, as he noted, this separates the function body from the interface definition and relies on pre-processing, which in turn becomes a hairy mess for Doxygen to make sense of. Furthermore, while I was able to get a simple test code to compile and run via:
ifort -r8 -i8 modi_writsa.F90 test.f90
from the UNIX command line, I couldn't then get the same modi_writsa.F90 code to work within the existing CMake build procedure for NCEPLIBS-bufr; instead, it squawked with the following diagnostic:
undefined reference to writsa_
So I'm still kind of stuck here.
Please note that, whatever solution we come up with, it has to work for any user and existing application code simply by calling the name writsa; otherwise, if we change that interface in any way, then every existing user has to correspondingly change his or her own application code, which is something we definitely want to avoid ;-)
@edwardhartnett @kgerheiser @rmclaren @aerorahul @jack-woollen
Digging further, the undefined reference is due to the writsa reference being defined in the interface statement, which in turn is defined inside of a module, so I just have a use modi_writsa to bring it into scope within my simple test program. But when I'm building the NCEPLIBS-bufr library via CMake, there's no corresponding main program, and therefore no way I can think of to make the writsa reference visible within the library itself, aside from requiring every existing user to add use modi_writsa within their application code.
Am I missing something here? In other words, is there some straightforward way that I can make the writsa reference in the above interface statement visible throughout the compiled library, so that an end user can see it by just linking the compiled library and without requiring them to specify use modi_writsa in their application code?
There has to be a use <module> somewhere. For users there would be a single public-facing module, which re-exports all the internal modules, like:
module bufr_mod
  use modi_writsa
  use ...
  use ...
end module bufr_mod
And internally, each piece of code would use the individual modules that it needs.
Thanks @kgerheiser, and yeah I did think of something like that, but I was hoping there might be some way for every user to not have to add any such new directive to their application code, or even worse within every individual source file of their code where any NCEPLIBS-bufr routine ever gets called.
I guess I was hoping there might be some sort of generic Fortran structure that I was overlooking and that would automatically get compiled and be in scope within the library without users having to modify their own application code in order to have visibility to that interface(?)
Anyway, thanks again for the confirmation, and if anyone else can think of a separate way around this, please let me know.
I guess my other question is a more general one for @edwardhartnett, in terms of how important this is. I get that it would be nice to have one version of the compiled library that all application codes can link to (vs. separate _4, _8 and _d builds), but this is quite a bit of restructuring just for one subprogram out of many, and we haven't even addressed the issue of how Doxygen is going to behave with all of these #include "writsa_body" directives linking in code snippets from all over the place. And converting any of these interfaces to C would be just as time-consuming, so in the long run is it really worth doing all of this? In other words, are we getting enough proverbial bang for the buck from this time/resource investment?
Another issue I see with a single public-facing Fortran module such as bufr_mod is that any user trying to access NCEPLIBS-bufr from within application code written in C or Python would be out of luck. In other words, how are they going to be able to add a use bufr_mod to their application code if their code isn't Fortran?
@jbathegit you can still interface with C by using bind(c) and iso_c_binding types.
I'm confused. Given the above example module modi_writsa, how would I reference the name writsa from within C code? As far as I know, I can't add the bind(c) attribute to a name defined in a Fortran interface statement. In other words, interface writsa bind(c) doesn't work, so how would I do this?
How important is this? I would say quite important.
Right now we are doing this 4/8/d thing, which is a NOAA special - I've never seen anyone else do this or even try it. So we are well outside the norm with this.
Heading further down the road with 4/8/d is just putting off the problem. I have outlined a solution in C - can this work?
Before this library is refactored/rewritten, what does the future use of BUFR look like in our systems and in general with the development around us? Who are the primary users of this and the new library? What are their needs? Is this library needed in the future or can it be combined with other existing solutions?
I have heard the desire to phase out GRIB1 (although this still has not been achieved). There has been no talk of getting rid of BUFR, has there?
Our goal is to have world-class software so scientists can do world-class science. We will do whatever programming work we need to do, in order to achieve that.
If that means extra programming... well, we are programmers. It's what we do all day, every day.
BUFR is not going anywhere. I am talking about use of bufr, not about its existence.
OK, so this is our bufr solution for NOAA for the next several decades.
At the moment I am working hard on the GRIB libraries, and that will take some time. @jbathegit if you want to defer this entire conversation until I have free time to come help NCEPLIBS-bufr with this conversion away from 4/8/d, that's fine.
We are doing this for other libraries, and will learn as we go. So let's table this issue for a while until we have more experience.
Jeff, if you want to try out the C solution, I think that's where we will end up heading.
I'm wondering how many codes use the 8 or the d version. And why.
@aerorahul raises a good question that I've raised myself in the past, which is what the future of BUFR looks like in our systems. He's correct that BUFR isn't going anywhere, and in fact we'll definitely continue receiving BUFR observational data from the outside world for many years or even decades into the future. So we're always going to need software that can read and write BUFR when interacting with the rest of the world, because it is and will remain a WMO standard for a long time, and there's been no discussion whatsoever at the WMO level about changing that.
But as I've mentioned previously, and just because we'll have a continued need to be able to read/decode and write/encode BUFR in our interactions with the rest of the world, that doesn't mean we necessarily need to forever continue to use BUFR as our internal NCEP storage and archive format. In other words, there's no reason many of our internal codes couldn't switch to something else (direct JEDI/IODA array vectors?) if that were deemed preferable, and in that case it would greatly reduce the number of NCEP codes that we have to support with our BUFR software. But of course that's an enterprise-level decision that would have far-reaching implications into a lot of existing software and other internal practices. So, again, just throwing that out there as food-for-thought.
For my part, and to answer @jack-woollen's question, I have no idea how many existing codes or what percentage use the 8 or the d versions. I would imagine it's a fairly small number, because most BUFR values fit just fine into 32 bit fields. That said, there are a handful of newer BUFR descriptors whose range of values don't fit into 32 bits and therefore do require 8 bytes, and which is why I've suggested making that the new default and just tailoring all newly-developed interfaces to that build.
The bufrlib inputs and outputs are all already 8 bytes. You don't need different libraries for that.
The bufrlib inputs and outputs are all already 8 bytes. You don't need different libraries for that.
@jack-woollen please clarify what you mean by this. Specifically, when a user application calls, e.g., SUBROUTINE WRITSB(LUNIT), how does the subroutine know whether LUNIT was declared (or compiled) as a 4-byte or 8-byte integer in the calling program? This isn't mandated in the library itself, which just lets it be implicitly declared as an integer. Similarly, when an application code calls FUNCTION IUPBS01(MBAY,S01MNEM), the array MBAY is just implicitly declared internally as DIMENSION MBAY(*), so how does the function know whether each array member is 4 or 8 bytes long in the calling program? Since we never defined these interfaces explicitly in the past, that's part of the reason we now have this ambiguity.
Unless I'm missing something, the only things that are explicitly declared as 8 bytes within the library itself are the REAL values arrays that get passed into and out of subroutines such as UFBINT, UFBREP, etc. Pretty much every other integer or real call parameter to the library doesn't have an explicit length specified.
Okay, two different things.
1) When you say " there are a handful of newer BUFR descriptors whose range of values don't fit into 32 bits and therefore do require 8 bytes, and which is why I've suggested making that the new default and just tailoring all newly-developed interfaces to that build", I'm just pointing out that the bufr range of values which need 8 bytes to fit in are already taken care of by the 8byte arrays which go into or out of the bufrlib calls in each of the libraries now, unless I'm missing something. And by the way, if you are suggesting only maintaining bufrlib_d or bufrlib_8, it seems to me that makes more trouble than it saves. I think bufrlib_4 is the one to focus on.
2) As for the unit numbers and dimensions, return codes, etc, they can be explicitly defined as 4 bytes in "pure" 8 byte app codes. They will probably never need 8 bytes themselves. The mbay array does have an allocated dimension, which I assume could be used instead of '*'. Bottom line, I think that reduction to just the 4 byte bufrlib would require a smaller amount of changes, in a smaller number of codes, than other alternatives which have been discussed.
just because we'll have a continued need to be able to read/decode and write/encode BUFR in our interactions with the rest of the world, that doesn't mean we necessarily need to forever continue to use BUFR as our internal NCEP storage and archive format. In other words, there's no reason many of our internal codes couldn't switch to something else (direct JEDI/IODA array vectors?) if that were deemed preferable, and in that case it would greatly reduce the number of NCEP codes that we have to support with our BUFR software. But of course that's an enterprise-level decision that would have far-reaching implications into a lot of existing software and other internal practices.
Even if we still use BUFR files for archival/direct input to JEDI/IODA, that should still greatly simplify the codes that use BUFR libraries, no? We are already in the process of removing preprocessing (prepobs, etc.) code to put everything in JEDI UFO, so I think it's important to look at this with the lens of one unified software system (written in C++) will handle observation ingest once legacy systems are retired. The only other use that I could think of would be verification/MET+. Tagging @ADCollard @emilyhcliu for awareness/input.
Also to comment on @edwardhartnett's quote "OK, so this is our bufr solution for NOAA for the next several decades" - I think that is a chilling thought. This current BUFR library has pieces old enough to run for Senate. This warrants thought on what the BUFR library at NCEP will look like in 2040.
1) When you say " there are a handful of newer BUFR descriptors whose range of values don't fit into 32 bits and therefore do require 8 bytes, and which is why I've suggested making that the new default and just tailoring all newly-developed interfaces to that build", I'm just pointing out that the bufr range of values which need 8 bytes to fit in are already taken care of by the 8byte arrays which go into or out of the bufrlib calls in each of the libraries now, unless I'm missing something. And by the way, if you are suggesting only maintaining bufrlib_d or bufrlib_8, it seems to me that makes more trouble than it saves. I think bufrlib_4 is the one to focus on.
One issue is that, even though the REAL*8 USR arrays are explicitly declared as 8-byte, a lot of the behind-the-scenes stuff is still just implicitly-defined integers. For example, in subroutines RDTREE and WRTREE, any such REAL*8 values are unpacked and packed internally via the implicitly-defined IVAL array, so in 4-byte integer builds we still can't handle any value that needs more than 32 bits, even if the USR arrays themselves are REAL*8. In this case, I could easily change moda_ival.F to explicitly declare IVAL as INTEGER*8 or INTEGER(kind=8), but there are probably numerous other such bugaboos sprinkled throughout the library as well, which is why I really think bufrlib_8 is the build to focus on, not bufrlib_4.
2) As for the unit numbers and dimensions, return codes, etc, they can be explicitly defined as 4 bytes in "pure" 8 byte app codes. They will probably never need 8 bytes themselves. The mbay array does have an allocated dimension, which I assume could be used instead of '*'. Bottom line, I think that reduction to just the 4 byte bufrlib would require a smaller amount of changes, in a smaller number of codes, than other alternatives which have been discussed.
Maybe I'm not understanding you correctly, but it sounds like you're suggesting the onus in the future should be on all application codes to explicitly declare whether or not all of their integers are 4 or 8 bytes. I think the goal should be to not force every user to have to make wholesale changes to his/her application codes, but rather to handle such variances internally within a single build of the library itself. In my mind that's a smaller overall impact, at least in terms of overall amount of source code that would potentially need to be modified.
Also to comment on @edwardhartnett 's quote "OK, so this is our bufr solution for NOAA for the next several decades." I think that is a chilling thought. This current BUFR library has pieces old enough to run for senate. This warrants thought on what the BUFR library at NCEP will look like in 2040.
I hear you @CoryMartin-NOAA, and a completely new library could certainly be developed in a more modern language and with a more modern design, so I'll reply by making the same point I made several months ago when someone else suggested the same thing. Simply put, before we collectively go down that road, please don't underestimate the amount of effort involved in developing a brand-new BUFR library from scratch, one that has the same features and capabilities that our users rely on within the current library. BUFR is a very involved format with a lot of nuances, so we all (and NCEP upper management as well! ;-) would need to be on board with an undertaking of that magnitude.
After some dinner and a little more thought I did realize the real numbers in the bufrlib internals should all be 8 bytes. But, I don't think that's equally true for integers, so I'd change my focus to bufrlib_d. In terms of numbers of codes that wouldn't need to change, bufrlib_d would likely work as is with all current bufrlib_4 users as well as all current bufrlib_d users. That has to be the largest number of app codes by far.
My point 2, rephrased for a bufrlib_d focus, is just that any codes that now use bufrlib_8 would only have to specify integer arguments to bufrlib routines as integer(4), which doesn't seem to be too great a burden. As you say there are some integers internal to bufrlib that should be fixed to have 8 bytes. I'm happy to look for all bugaboos involving 4 byte integers which should be specified as 8 byte integers, because I don't think there are many of them, and they should all be identified in any case.
Or, there may even be a way to make the user callable routines insensitive to the integer argument sizes. I can think of some possibilities on that, which if it worked, could mean a modified version of bufrlib_8 could replace all current versions without any app code changes at all. That would be the best.
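Such kind-insensitive entry points could be sketched roughly as follows (a hypothetical routine name, not actual bufrlib code): a generic interface resolves on the caller's integer kind, and both specifics forward to one internal body, so existing _4 and _8 application codes could in principle link against a single build unchanged.

```fortran
! Sketch only; demo_status/get_status are made-up names, not bufrlib code.
module demo_status
  implicit none
  private
  public :: get_status
  interface get_status
    module procedure get_status_4, get_status_8
  end interface
contains
  subroutine get_status_4(lunit, iret)
    ! specific for callers passing 4-byte integers
    integer(kind=4), intent(in)  :: lunit
    integer(kind=4), intent(out) :: iret
    integer(kind=8) :: my_iret
    call get_status_body(int(lunit, kind=8), my_iret)
    iret = int(my_iret, kind=4)
  end subroutine get_status_4
  subroutine get_status_8(lunit, iret)
    ! specific for callers passing 8-byte integers
    integer(kind=8), intent(in)  :: lunit
    integer(kind=8), intent(out) :: iret
    call get_status_body(lunit, iret)
  end subroutine get_status_8
  subroutine get_status_body(lunit, iret)
    ! single internal implementation, using one fixed integer kind
    integer(kind=8), intent(in)  :: lunit
    integer(kind=8), intent(out) :: iret
    iret = 0  ! real work would go here
  end subroutine get_status_body
end module demo_status
```

The compiler disambiguates the two specifics by the kind of the actual arguments, so the application source does not need to change at all, only to reference the generic name.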
On Mon, Jan 24, 2022 at 5:56 PM Jeff Ator @.***> wrote:
- When you say "there are a handful of newer BUFR descriptors whose range of values don't fit into 32 bits and therefore do require 8 bytes, and which is why I've suggested making that the new default and just tailoring all newly-developed interfaces to that build", I'm just pointing out that the BUFR ranges of values which need 8 bytes are already taken care of by the 8-byte arrays which go into or out of the bufrlib calls in each of the libraries now, unless I'm missing something. And by the way, if you are suggesting only maintaining bufrlib_d or bufrlib_8, it seems to me that makes more trouble than it saves. I think bufrlib_4 is the one to focus on.
One issue is that, even though the REAL*8 USR arrays are explicitly declared as 8-byte, a lot of the behind-the-scenes stuff is still just implicitly-defined integers. For example, in subroutines RDTREE and WRTREE, any such REAL*8 values are unpacked and packed internally via the implicitly-defined IVAL array, so in 4-byte integer builds we still can't handle any value that needs more than 32 bits, even if the USR arrays themselves are REAL*8. In this case, I could easily change moda_ival.F to explicitly declare IVAL as INTEGER*8 or INTEGER(kind=8), but there are probably numerous other such bugaboos sprinkled throughout the library as well, which is why I really think bufrlib_8 is the build to focus on, not bufrlib_4.
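For illustration, the explicit declaration in question might look like this (a sketch only; the real moda_ival.F has its own layout and dimensions):

```fortran
! Sketch of an explicitly 8-byte IVAL declaration (the real moda_ival.F
! differs).  With kind=8, packed BUFR values wider than 32 bits survive
! the internal unpack/pack path even in an otherwise 4-byte build.
module moda_ival
  integer(kind=8), allocatable :: ival(:)
end module moda_ival
```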
Wow, lots of great discussion here. To address a few strategic points:
There’s a subtle reason that programmers always want to throw away the code and start over. The reason is that they think the old code is a mess. And here is the interesting observation: they are probably wrong. The reason that they think the old code is a mess is because of a cardinal, fundamental law of programming:
It’s harder to read code than to write it.
I urge everyone to read the entire blog post and try to understand it. Although this blog post is 20 years old, it is just as valid today.
This is the way many libraries are managed, and it will work for us too. The NetCDF library contains a lot of code from the 1980s, but that's OK because it still works, is well-tested and well-documented, and is structured to work well in the future.
That is what we will achieve with NCEPLIBS-bufr.
As I reread all the discussion here, it's starting to seem like the C solution is the way forward. If this cannot be done easily in native Fortran, it certainly can be done in C with Fortran wrappers...
I think that would be a mistake and a huge waste of time and effort. But, what do I know, right.
@aerorahul can you please describe which idea you think would be a huge waste of time and effort?
@edwardhartnett JEDI IODA 'format' is either netCDF4/HDF5 or "ODB" (developed by ECMWF) with a particular group/file structure. It can be read/written by any HDF5/netCDF4 tool as far as I am aware.
We 1000% need BUFR for the foreseeable future. It is the WMO standard for observation storage/transmission.
Regarding 'backwards-compatibility', I ask, who are the users of NCEPLIBS-bufr? Are there others besides GSI and obsproc_prep? Both of these things are being completely replaced/overhauled in the next couple of years. Backwards compatibility is a good (great?) idea overall, but it introduces limitations and is not always practical. For GFSv16, @edwardhartnett as you may recall, we switched to netCDF I/O for the model and data assimilation and post processing. That required more changes in GFS than a new BUFR library would, so this is not without precedent. @rmclaren has been trying to generalize BUFR read in JEDI and has encountered many limitations in the current library. I learned on Fortran, and only know a little C++, but I see where both have their strengths and disadvantages. JEDI uses both C++ and Fortran 2003, which allows low-level routines to be forklifted in unmodified form while higher-level things can be more modular. Perhaps that is an approach to consider.
OK, we all agree that BUFR format is here to stay.
The NCEPLIBS group has no mandate or interest in trying to rewrite NCEPLIBS-bufr from scratch, so if anyone wants to attempt that, we wish them luck but cannot offer much programming help. And any such replacement library would have to reach a high state of readiness before we could switch to it, which is not likely to happen soon, or at all. So NCEPLIBS-bufr is also here to stay.
@CoryMartin-NOAA thanks for the clarification of the JEDI IODA format. I would be interested in seeing what exactly it means, since it is using netCDF/HDF5. Is there some documentation of IODA on-line?
I would agree that any application that is using BUFR for an internal format would do well to switch to netCDF/HDF5. It's much easier for everyone else to deal with NetCDF files! But not all uses of NCEPLIBS-bufr are for internal data systems. We must read and write BUFR in at least some places, and that requirement seems firm.
Since NCEPLIBS-bufr is therefore going to be our BUFR solution for the coming decades, it is worth a great deal of time and attention, and we will incrementally make it everything we need it to be, in order to have the world-class software we desire. @aerorahul I certainly don't consider this a waste of time and effort - I consider it our job. ;-) And improving a necessary operational software library for the UFS and other important applications is certainly worthy of our efforts.
Which means we must, to return to the issue, fix the 4/8/d problem. Even if that takes a lot of work. I am open to a fully-Fortran solution if it can be found.
WRT current users and backward compatibility - from my past experience with netCDF and other libraries, I am confident that we can manage our incremental improvements to preserve full backward compatibility. If that has to be violated on some occasions, we can discuss on a case-by-case basis. But generally, I have found that new ideas and code can be accommodated, while still allowing existing users to seamlessly upgrade to new releases.
@CoryMartin-NOAA and @rmclaren we must cope with the limitations you encounter one-by-one. Whatever can be achieved by a complete re-write can also be achieved by incremental improvements.
@edwardhartnett see here: https://jointcenterforsatellitedataassimilation-jedi-docs.readthedocs-hosted.com/en/latest/inside/jedi-components/ioda/index.html and https://jointcenterforsatellitedataassimilation-jedi-docs.readthedocs-hosted.com/en/latest/inside/conventions/files_and_components.html
My $0.02 is that we need a library that can be flexible enough to read any BUFR file with minimal duplication of code for different file types and the user need only provide mnemonics and it returns arrays. I don't really care to dictate the solution, but I think we should probably think about the requirements further before any reengineering happens. The biggest advantage to JEDI over GSI (in my opinion) is removing all the redundant, sensor-specific code. There should only be one read routine, not N routines where N=the number of observation types. However we can get to that is the right path forward.
In my opinion, the biggest disadvantage to JEDI over GSI is that users have to carry a huge load of hard-to-manage yaml settings for every single run.
@jamiebresch I don't disagree. Individually, YAML files are great. But the 50k line concatenated monster that goes into the executables is less than ideal. For @rmclaren 's converter, the YAML is straightforward, but the UFO QC YAML is often unwieldy. But that's a discussion for a different venue.
@jbathegit @jack-woollen What are the currently used applications (that you know of) that leverage the NCEPLibs-bufr library. I am aware of the following:
As @CoryMartin-NOAA says, JEDI is replacing GSI and a significant (if not all, but surely a lot) of Observation Processing. So who is this refactored library going to serve and in what capacity?
Regardless of choice of language, what does the future library for encoding and decoding of BUFR messages look like in this universe? Who are your customers and what are their needs?
@aerorahul I'm not sure what refactored library you're referring to. There have been so many of them imagined. The current bufrlib seems mostly sufficient to handle foreseeable future needs for BUFR processing within NCEP, especially if the future looks anything like the picture being painted in this ongoing conversation. Incremental improvements to any future version of bufrlib will be necessary even so, to deal with WMO BUFR evolution and associated new BUFR dataset structures, at least. I believe in looking ahead, and stitching in time, but I don't believe in fixing things that aren't broken, at least not without a good reason. Optimizing operation and economizing maintenance both count as good reasons. Case in point, consolidating 3 versions into 1 is an example of something worthwhile.
@jack-woollen I agree this conversation has derailed from consolidating 3 versions 8|4|d to a restructuring/refactoring of the library and its use in the post GSI/obsproc and into the JEDI world.
Hi All,
Cory added me to this thread earlier today.
It seems to me there are a number of unanswered questions about how we should proceed with NCEPBUFR support and, speaking as an end-user working on the JEDI/IODA conversion, I think it would be helpful to me to understand the big picture better.
Would people be amenable to a meeting sometime in the next couple of weeks to discuss this?
Andrew.
I agree that a lot of fascinating discussions have been happening in this thread, and it will be useful to have a meeting to discuss these. However, before that it would be useful to have a few requirements laid out. One requirement is obvious (at least to me), and that is to handle the 4/8/d builds with one source. However, there are probably a lot of requirements from the JEDI side that would be useful to document, and then we can develop a roadmap.
On the main topic of consolidating the 4/8/d builds into one, I've put some more effort into this, and I've figured out a Fortran solution that's based on the example shown above, but which doesn't require pre-processing and instead just keeps the main body encapsulated in the module as a separate, private subroutine. See https://github.com/NOAA-EMC/NCEPLIBS-bufr/blob/onebuild/src/writsa.f90 for more details of what this looks like, and note that this does still require every application program to include a new `use <module>` statement as noted earlier, since the subroutine name (in this case, `writsa`) can now only be referenced via a Fortran module. But it does work and seems like a fairly clean solution. And note that if we end up using _8 as the eventual single build of the library, then lines 154-158 and 162-163 can be removed as well.
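From the application side, the only change under this approach would be one new line per program, e.g. (module name hypothetical):

```fortran
! Sketch of the application-side change (module name hypothetical):
! since writsa now lives inside a module, the program must "use" that
! module before the call will resolve.
program example_app
  use subroutine_writsa   ! hypothetical module name exposing writsa
  implicit none
  integer :: lunxx, lmsgt, msgl
  integer :: msgt(10000)
  real    :: xval, yval
  ! ... open the BUFR file, set lunxx, lmsgt, xval, etc. ...
  call writsa(lunxx, lmsgt, msgt, msgl, xval, yval)  ! call itself is unchanged
end program example_app
```

The call statement itself stays exactly as before; only the `use` line is new.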
For C interoperability, my thinking was to expand on the work that @rmclaren did previously with `bufr_interface.f90`, meaning that for all of the routines that need to be callable from C, we'd define the C<->Fortran interface within this file, then add a `use <module>` statement as noted above to expose all of the Fortran subprogram names that can now only be referenced via a module. This would have the added benefit of also addressing the long-standing UNDERSCORE issue noted in #20 and #79. I tried creating a prototype by coding up a ufbint.f90 routine similar to the above writsa.f90 model; however, despite numerous attempts I couldn't get the syntax quite right so that `bufr_interface.f90` could call the `ufbint` reference in this new module. I think I'm close, but I may need some help from @rmclaren or others to resolve this.
In summary, I think this approach is doable, though it will take quite a bit of time to extend to all of the different subprograms in the library, and there's also the question of Python interoperability, if all of the Fortran subprogram names can now only be referenced via a Fortran module going forward. I'll likely need some help with that as well.
Some more details on the aforementioned `bufr_interface.f90` issue. Here's how it tries to call `ufbint` via a wrapper:
!> @author Ronald McLaren
!> @date 2020-07-29
!>
!> @brief Wraps BUFRLIB "ufbint" function.
!>
!> @param[in] bufr_unit - c_int: the fortran file unit number to read from
!> @param[inout] c_data - c_ptr: c style pointer to a pre-allocated buffer
!> @param[in] dim_1, dim_2 - c_int: dimensionality of data to read or write
!> @param[out] iret - c_int: return value, length of data read
!> @param[in] table_b_mnemonic - c_char: string of mnemonics
!>
subroutine ufbint_c(bufr_unit, c_data, dim_1, dim_2, iret, table_b_mnemonic) bind(C, name='ufbint_f')
integer(c_int), value, intent(in) :: bufr_unit
type(c_ptr), intent(inout) :: c_data
integer(c_int), value, intent(in) :: dim_1, dim_2
integer(c_int), intent(out) :: iret
character(kind=c_char, len=1), intent(in) :: table_b_mnemonic
real, pointer :: f_data
call c_f_pointer(c_data, f_data)
call ufbint(bufr_unit, f_data, dim_1, dim_2, iret, c_f_string(table_b_mnemonic))
end subroutine ufbint_c
so I initially set up the new ufbint.f90 with an internal ufbint_4_d interface similar to the above writsa.f90 (since c_int is equivalent to a 4-byte integer) and added the corresponding `use <module>` statement, but it still couldn't access `ufbint` that way. I then tried adding a new ISO C interface to ufbint.f90 to more directly match up the types for this particular call:
subroutine ufbint_isoc( lunin, usr, i1, i2, iret, str )
!> used when call arguments to ufbint are ISO C types
use iso_c_binding
implicit none
integer(c_int), value, intent(in) :: lunin
integer(c_int), value, intent(in) :: i1, i2
integer(c_int), intent(out) :: iret
real, pointer :: usr
character*(*) str
integer :: my_lunin, my_i1, my_i2, my_iret
my_lunin = lunin
my_i1 = i1
my_i2 = i2
call ufbint_body( my_lunin, usr, my_i1, my_i2, my_iret, str )
iret = my_iret
end subroutine ufbint_isoc
but that didn't work either and complained of a type mismatch in the dummy arguments. So that's where I'm stuck at the moment, in case anyone sees anything obvious that I'm missing here? ;-)
Is it an issue with calling the generic interface (`ufbint`)? I have found that calling the specific function manually will give a better hint at what the actual mismatch is, because the compiler knows what the arguments are supposed to be, rather than just reporting "it doesn't match one of these preset lists exactly".
In your second example, why is `str` not `character(kind=c_char, len=1), intent(in) :: str`? You need to be careful because C strings and Fortran strings aren't represented the same way in memory (they aren't compatible). This is why I made the `c_f_string` function (to convert between them). Also, do you need to do something like `use subroutine_ufbint` since the interface is defined within a module (both examples)?
Thanks for the replies @kgerheiser and @rmclaren. Yes, I had been including `use subroutine_ufbint` inside of `bufr_interface.f90`, but now I'm wondering if that may not have really been needed. Either way, I started trying out some of the suggestions, but WCOSS2 access has been down for the past couple of hours so progress is halted for now. Once it's back up I'll run some more tests, including calls to the specific function rather than the interface, and share the results.
So here's where I'm at right now...
in bufr_interface.f90:
use iso_c_binding
use bufr_procedures
!> @author Ronald McLaren
!> @date 2020-07-29
!>
!> @brief This function turns a c string into a fortran string.
!>
!> @param[in] c_str - c_char: pointer to a \0 (null) terminated c string
!> @param[out] f_str - character(:): fortran string
!>
function c_f_string(c_str) result(f_str)
character(kind=c_char,len=1), intent(in) :: c_str(*)
character(len=:), allocatable :: f_str
integer :: nchars
nchars = 1
do while (c_str(nchars) /= c_null_char)
nchars = nchars + 1
end do
nchars = nchars - 1
allocate(character(len=nchars) :: f_str)
f_str = transfer(c_str(1:nchars), f_str)
end function c_f_string
!> @author Ronald McLaren
!> @date 2020-07-29
!>
!> @brief Wraps BUFRLIB "ufbint" function.
!>
!> @param[in] bufr_unit - c_int: the fortran file unit number to read from
!> @param[inout] c_data - c_ptr: c style pointer to a pre-allocated buffer
!> @param[in] dim_1, dim_2 - c_int: dimensionality of data to read or write
!> @param[out] iret - c_int: return value, length of data read
!> @param[in] table_b_mnemonic - c_char: string of mnemonics
!>
subroutine ufbint_c(bufr_unit, c_data, dim_1, dim_2, iret, table_b_mnemonic) bind(C, name='ufbint_f')
integer(c_int), value, intent(in) :: bufr_unit
type(c_ptr), intent(inout) :: c_data
integer(c_int), value, intent(in) :: dim_1, dim_2
integer(c_int), intent(out) :: iret
character(kind=c_char, len=1), intent(in) :: table_b_mnemonic
real(c_double), pointer :: f_data
call c_f_pointer(c_data, f_data)
call ufbint_isoc(bufr_unit, f_data, dim_1, dim_2, iret, c_f_string(table_b_mnemonic))
end subroutine ufbint_c
where right now I'm just trying to get it to connect directly with the ufbint_isoc subroutine, never mind trying to go through the generic `ufbint` interface reference.
in ufbint.f90:
subroutine ufbint_isoc( lunin, usr, i1, i2, iret, str )
! used when call arguments to ufbint are ISO C types
use iso_c_binding
implicit none
integer(c_int), value, intent(in) :: lunin
integer(c_int), value, intent(in) :: i1, i2
integer(c_int), intent(out) :: iret
real(c_double), pointer, intent(inout) :: usr
! real(c_double), intent(inout) :: usr(i1,i2)
! character(c_char, len=1), intent(in) :: str
character(len=*), intent(in) :: str
integer :: my_lunin, my_i1, my_i2, my_iret
my_lunin = lunin
my_i1 = i1
my_i2 = i2
call ufbint_body( my_lunin, usr, my_i1, my_i2, my_iret, str )
iret = my_iret
end subroutine ufbint_isoc
where ufbint_body is the same code as the existing ufbint.f.
But here's the compiler error I'm seeing:
/lfs/h2/emc/obsproc/noscrub/Jeff.Ator/NCEPLIBS-bufr-GitHub/nceplibs-bufr/src/ufbint.f90(242): error #8284: If the actual argument is scalar, the dummy argument shall be scalar unless the actual argument is of type character or is an element of an array that is not assumed shape, pointer, or polymorphic. [USR]
call ufbint_body( my_lunin, usr, my_i1, my_i2, my_iret, str )
-------------^
compilation aborted for /lfs/h2/emc/obsproc/noscrub/Jeff.Ator/NCEPLIBS-bufr-GitHub/nceplibs-bufr/src/ufbint.f90 (code 1)
make[2]: *** [src/CMakeFiles/bufr_8_f.dir/build.make:933: src/CMakeFiles/bufr_8_f.dir/ufbint.f90.o] Error 1
and I see this regardless of whether I use the declaration `real(c_double), pointer, intent(inout) :: usr` or `real(c_double), intent(inout) :: usr(i1,i2)` in ufbint.f90, and regardless of whether I use `character(kind=c_char, len=1), intent(in) :: str` or `character(len=*), intent(in) :: str`
Can anyone spot anything obvious that I'm doing wrong here? I know we'd all like to find a way to get this to work so we can just have one build rather than three separate 4/8/d builds. I feel like I'm really close, but I've tried all kinds of different options and I just can't quite get this to work. Note from above that I was able to get this same sort of thing to work for writsa.f90, so the concept does work, but there's something in this call to ufbint from the bufr_interface.f90 module that's not quite right.
(Thanks! ;-)
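For what it's worth, the compiler message reads as a rank mismatch rather than a type mismatch: `usr` reaches the `ufbint_body` call as a scalar, while the body expects an array. One possible resolution (a sketch, untested) is to make the pointer in the wrapper rank-2 and pass the shape to `c_f_pointer`, so that what flows down to `ufbint_body` is an array; note that `table_b_mnemonic` is also declared here as an assumed-size array, matching `c_f_string`'s dummy argument:

```fortran
! Sketch of a possible fix (untested): declare the pointer rank-2 and
! pass the shape to c_f_pointer, so the actual argument ultimately
! reaching ufbint_body is an array rather than a scalar.
subroutine ufbint_c(bufr_unit, c_data, dim_1, dim_2, iret, table_b_mnemonic) &
      bind(C, name='ufbint_f')
  use iso_c_binding
  integer(c_int), value, intent(in) :: bufr_unit
  type(c_ptr), intent(inout) :: c_data
  integer(c_int), value, intent(in) :: dim_1, dim_2
  integer(c_int), intent(out) :: iret
  character(kind=c_char, len=1), intent(in) :: table_b_mnemonic(*)
  real(c_double), pointer :: f_data(:,:)   ! rank-2, not a scalar pointer
  ! associate the C buffer with a Fortran array pointer of the right shape
  call c_f_pointer(c_data, f_data, [dim_1, dim_2])
  call ufbint_isoc(bufr_unit, f_data, dim_1, dim_2, iret, &
                   c_f_string(table_b_mnemonic))
end subroutine ufbint_c
```

Correspondingly, `usr` in ufbint_isoc would be declared `real(c_double), intent(inout) :: usr(i1,i2)` without the scalar `pointer` attribute, so the subsequent call to ufbint_body passes an array as that routine presumably expects.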
This will come after #77
Currently we build 3 flavors of the library, 4, 8, and D. Instead, use fortran interfaces and kinds to provide one library that handles all three cases.
Not sure when this should happen, we want to try this out on a simpler project first.