NOAA-EMC / NCEPLIBS-g2

Utilities for coding/decoding GRIB2 messages.

organize g2 subroutines in a module #624

Open edwardhartnett opened 7 months ago

edwardhartnett commented 7 months ago

With recent changes to functions due to handling > 2 GB files, many users will have to make code changes to take advantage of the new features. Unfortunately, not all features can be introduced without API changes. (In this case, the API has to change to handle an index version, so that users can continue to produce version 1 index files for full backward compatibility, but switch to index version 2 when they may have files > 2 GB.)

Given that code changes are going to be required, they can be minimized by putting the g2 subroutines in a g2 module.

This will require users to add a `use g2` statement to the top of their programs and subprograms which use the g2 library. They will also (probably) have to remove any interface statements that they have written for g2 subroutines (and many g2 subroutines require interface statements).

I regret that users must make changes for the next release of g2, but introducing a module now seems like the easiest path forward for users. With a module I can hide many of the index version parameters as optional parameters, which means existing code continues to work fine, and only an optional parameter has to be added to user code to take advantage of the new features. That's about the lowest impact we can get away with here, and maintain full compatibility for existing workflows that may use the version 1 format directly.
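
A minimal sketch of how an optional argument in a module could hide the index version. The module `g2_sketch`, the routine `make_index`, and all argument names are hypothetical and chosen for illustration; they are not the real g2 API.

```fortran
! Hypothetical sketch: names are illustrative, not the real g2 API.
module g2_sketch
  implicit none
  private
  public :: make_index

contains

  ! Select an index version; callers may omit idxver to get version 1.
  subroutine make_index(lugb, idxver_used, idxver)
    integer, intent(in) :: lugb              ! file unit (not used in this sketch)
    integer, intent(out) :: idxver_used      ! version actually selected
    integer, intent(in), optional :: idxver  ! 1 (default) or 2 for files > 2 GB

    idxver_used = 1
    if (present(idxver)) idxver_used = idxver
  end subroutine make_index

end module g2_sketch

program demo_optional
  use g2_sketch
  implicit none
  integer :: v

  ! Call with no version argument: falls back to version 1.
  call make_index(11, v)
  if (v /= 1) error stop "expected default index version 1"

  ! Opt in to version 2 by adding only the optional argument.
  call make_index(11, v, idxver=2)
  if (v /= 2) error stop "expected index version 2"

  print *, "ok"
end program demo_optional
```

Because the procedure lives in a module, the caller gets an explicit interface from the `use` statement, which is what Fortran requires before optional arguments can be omitted or passed by keyword.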

The module is also useful for many other reasons, including the fact that it provides interfaces for all g2 subroutines, saving the user the trouble in many cases, and providing better compile-time checking of parameters.

@AlexanderRichert-NOAA @Hang-Lei-NOAA @GeorgeVandenberghe-NOAA @GeorgeGayno-NOAA any thoughts on adding modules to our NCEPLIBS libraries?

edwardhartnett commented 6 months ago

This is hard.

The reason is that many subroutine calls in the library generate type mismatches when they have access to the subroutine interface in the module. Very common are calls where a scalar is used, instead of an array of size 1. In other cases, an array(2) is passed, meaning that the first element of the array should be skipped, but this is a violation of the interface as well.
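
To illustrate the kind of mismatch described here, a small sketch assuming an assumed-shape dummy argument (the thread does not say how the g2 dummies are declared; `sum_sketch` and `total` are hypothetical names). With the interface visible from the module, a bare scalar or a single element such as `a(2)` would be rejected at compile time, while an array constructor or an array section conforms:

```fortran
! Hypothetical sketch: not taken from the g2 source.
module sum_sketch
  implicit none
contains

  ! Assumed-shape dummy: the actual argument must be a rank-1 array.
  function total(vals) result(s)
    integer, intent(in) :: vals(:)
    integer :: s
    s = sum(vals)
  end function total

end module sum_sketch

program demo_mismatch
  use sum_sketch
  implicit none
  integer :: x, a(4)

  x = 7
  a = [10, 20, 30, 40]

  ! total(x) would not compile: scalar actual, rank-1 dummy.
  ! Wrapping the scalar in an array constructor gives an array of size 1.
  if (total([x]) /= 7) error stop "size-1 array case failed"

  ! total(a(2)) would not compile either: a(2) is a scalar element.
  ! An array section skips the first element and still has rank 1.
  if (total(a(2:)) /= 90) error stop "array section case failed"

  print *, "ok"
end program demo_mismatch
```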

Fortran is a very strongly typed language. Programmers that wish to program in C should do so. Programmers who are programming in Fortran should follow the rules of Fortran, even if it is incredibly inconvenient, or else they should just write wrappers around C functions which do the heavy lifting. (And this is the ultimate answer.)

I will defer this idea until the next release, but much progress was made in reducing warnings...

GeorgeVandenberghe-NOAA commented 6 months ago

I guess to stir the pot, why should we program with an API that encourages type ambiguity (through some F90 workarounds, and generally in F77) or is not typed (straight C)? What are the advantages other than momentary convenience to get something out the door? And I'm asking this as someone who admits abusing alligator clips.

edwardhartnett commented 6 months ago

Excellent questions @GeorgeVandenberghe-NOAA !

Fortran is good for math, not for bit-fiddling. C is good for bit-fiddling, but harder for scientists to learn. (These days, it does math just as fast as Fortran, I believe.)

So the answer is to have (most) scientists do the math in Fortran, but have C programmers do the bit-fiddling. In practice, this is what we do with netCDF and HDF5, but the programmers are not within NOAA.

For GRIB2, we have similar implementations in C and Fortran. I added index version 2 to both of them to support files > 2 GB. For C, it took a day and a half. For Fortran, it's been 5 weeks and counting. In C, very few changes are required. In Fortran, many, many are required.

In the next release of g2 (3.5.0), the duplicate functionality will be removed, and g2 will depend on g2c.

edwardhartnett commented 4 months ago

I think my next move for this issue is to put the gbyte subroutines in a module, and start using that everywhere within the code. This will allow me to remove all the gbyte interface statements and replace them with a `use g2bytes` statement.
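
A rough sketch of what a bit-extraction routine in a module could look like, so that callers get the interface from a `use` statement instead of writing interface blocks by hand. The module `bytes_sketch` and the function `get_bits` are hypothetical stand-ins, not the actual gbyte routines or their signatures; big-endian bit order is an assumption of this sketch.

```fortran
! Hypothetical sketch: not the real gbyte routines or their signatures.
module bytes_sketch
  use iso_fortran_env, only: int8
  implicit none
  private
  public :: get_bits

contains

  ! Return nbits (1..31) starting iskip bits into buf, most significant bit first.
  function get_bits(buf, iskip, nbits) result(val)
    character(len=1), intent(in) :: buf(:)
    integer, intent(in) :: iskip, nbits
    integer :: val
    integer :: i, ibyte, ibit, byteval

    val = 0
    do i = iskip, iskip + nbits - 1
      ibyte = i / 8 + 1          ! 1-based index of the byte holding bit i
      ibit = 7 - mod(i, 8)       ! bit position inside that byte, 7 = MSB
      ! Reinterpret the character as an unsigned value in 0..255.
      byteval = iand(int(transfer(buf(ibyte), 0_int8)), 255)
      val = ior(ishft(val, 1), ibits(byteval, ibit, 1))
    end do
  end function get_bits

end module bytes_sketch

program demo_bits
  use bytes_sketch   ! supplies the explicit interface; no interface block needed
  implicit none
  character(len=1) :: buf(2)

  buf = [achar(18), achar(52)]   ! bytes 0x12 0x34

  if (get_bits(buf, 0, 16) /= 4660) error stop "expected 0x1234"
  if (get_bits(buf, 4, 8) /= 35) error stop "expected 0x23"
  if (get_bits(buf, 3, 1) /= 1) error stop "expected bit 3 set"

  print *, "ok"
end program demo_bits
```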