add NaN check to wrtree

jbathegit commented 6 months ago

With NCEPLIBS-bufr versions <= 11.7.0, any NaN values that were passed to any of the writing/encoding subroutines would automatically get encoded into BUFR as "missing", which is what we'd expect.

However, we've now discovered that, with the changes in versions >= 11.7.1, any NaN values are now getting encoded into BUFR as 0.0. This PR fixes that bug.

jbathegit commented 6 months ago

BTW, since the fix is in subroutine wrtree, it will work for any value stored in the internal val(*,*) arrays by any of the subroutines or functions which write BUFR data values, and it will work for both compressed and uncompressed data subsets.

jbathegit commented 6 months ago

Good question Jack, and I did give that a lot of thought, especially since ibfms() is called from within that same block in wrtree(). And it also would have been a lot easier to test with a direct call to ibfms() in any one of the test codes, rather than an indirect test via ufbint() and writsb() in outtest6.

But the more I thought about it, and while ibfms() is called directly in some places from inside the library, its primary use is by countless application programs for checking data values that get returned by any of the ufbint(), ufbrep(), ufbseq(), getvalnb(), etc. reader routines. In other words, it's intended mostly for use when reading/decoding data, rather than when writing/encoding data. And I couldn't think of a single situation where a decoded data value returned by the library could ever possibly be NaN, so why include it in ibfms() where now every application program would have to cycle through that additional IF test every time they called the function? Instead, the real risk is that some application program might pass a NaN value into one of the above routines when trying to write/encode BUFR output, so why not just put this new NaN check directly inside the library, in the one place where every such value always has to pass through no matter which routine was called to pass it in, and no matter what the requested compression status is for the output subsets.

In any case, that was my thought process, but if you strongly disagree or can think of anything I might have overlooked, then please chime in and let me know - thanks!

jack-woollen commented 6 months ago

Fair enough, although I think the use of ifbms isn't as widespread as to make it a potential cpu hog. That could be tested fairly easily. Regardless, it occurs to me that users might like to know if they are passing NaNs around in their data arrays. Maybe a little message should be given to alert them to the fact?

jbathegit commented 6 months ago

The only issue with adding a message would be that it wouldn't get diagnosed until after the user subsequently calls writsb() or writsa(), whereas they would have actually stored the value back during some prior call to ufbint(), ufbrep(), ufbseq(), etc. So it would be a bit disjointed. We'd also have to separate the following new test:

    IF( (IBFMS(VAL(N,LUN)).EQ.1) .OR.
.        (VAL(N,LUN).NE.VAL(N,LUN)) ) THEN
       IVAL(N) = -1

into two IF separate tests:

    IF (IBFMS(VAL(N,LUN)).EQ.1) THEN
       IVAL(N) = -1
    ELSEIF (VAL(N,LUN).NE.VAL(N,LUN)) THEN
       IF (IPRT.EQ.2) THEN
           (print some diagnostic)
       ENDIF
       IVAL(N) = -1

because otherwise we'd have no way to know if it was really a NaN or just a regular "missing" value. I'm not saying it couldn't be done, but just not sure if it's worthwhile or how useful it would necessarily be. But if you really see some value in that, then sure I could modify wrtree() with another commit to do that. Or if you believe that sort of thing should really be in ibfms() instead of wrtree(), then I could do that as well. Just let me know.

jack-woollen commented 6 months ago

The more I thought about it the more I wonder how 'not a number' values are getting into writing routines to begin with. I think that's a bug that should be addressed. Is it from uninitialized memory, or what? Anything that goes into the val array is either read in from rdtree, or read/written from ufbint or another user interface. Is that were the NaNs come from? I hope not. Seems to me any NaN detected by any means should be alerted without any print filter.

I guess I wonder where you noticed that NaN values were being converted to zero or to missing, as you mentioned earlier. I would want to track them down to a source. Can you show a case where that happened?

jbathegit commented 6 months ago

We have decoders that read netCDF data files from MADIS and convert them to BUFR. Some of those netCDF files contain NaN values in them, and which then end up getting passed directly into NCEPLIBS-bufr. In the past we hadn't even noticed this because NCEPLIBS-bufr would automatically convert them to "missing", at least back prior to v11.7.1. But with the newer versions it now converts them to 0.0 instead.

We only just noticed this for the mesonet data because we just made an update to that decoder, including to start using NCEPLIBS-bufr v12.0.0. I can point you to a sample netCDF file if you want to ncdump it yourself to see, but this is likely happening in several other MADIS data types as well and so would potentially affect a number of our decoders. And so bottom line, I figured this is something that it made sense to take care of inside of NCEPLIBS-bufr anyway, in case it ever sees NaN values similarly passed in via other data sources.

jbathegit commented 6 months ago

Thanks Jack, and I've saved a sample file aside for you in /lfs/h2/emc/obsproc/noscrub/jeff.ator/for_JWoollen on cactus and dogwood. It contains a copy of the original MADIS netcdf file as 20230531_1600.1620, along with another file containing the output obtained by running the ncdump utility on the first file. In the latter file you can see the values in question if you just do a search for "NaN".

NOAA-EMC / NCEPLIBS-bufr

add NaN check to wrtree #550