NOAA-EMC / NCEPLIBS-bufr

The NCEPLIBS-bufr library contains routines and utilites for working with the WMO BUFR format.
Other
44 stars 19 forks source link

more bort tests #399

Closed edwardhartnett closed 1 year ago

edwardhartnett commented 1 year ago

Part of #381 Part of #401

More bort tests.

Also streamline shell script and now test against all three kinds of library.

edwardhartnett commented 1 year ago

OK, there's more bort testing to be had, but let's get this PR reviewed and merged, and I'll continue...

edwardhartnett commented 1 year ago

THis PR is ready to merge I believe...

jbathegit commented 1 year ago

Hey Ed, I just explained in #401 why your nemtbax() test wasn't calling bort() like you expected. And getting it to call bort() for the two particular cases where it would actually bort() (i.e. for an invalid message type or subtype) isn't something you'd be able to do within a simple one-line call from your run_test_bort.sh script, because it would involve having already read in a bad internal Table A as well.

So do you want to just remove that nemtbax() test altogether now? Again, what you have now in test_bort.F90 (calling nemtbax() with the 'DUMB' argument) will never trigger a bort() call, so I don't see any reason to leave that in there.

edwardhartnett commented 1 year ago

Yes, I will remove nembax() from test_bort.F90...

edwardhartnett commented 1 year ago

OK, take another look at see if we are ready to merge...

jbathegit commented 1 year ago

OK, cool, and I'll go ahead and approve and merge this now.

Following that, and just for more direct communication/coordination purposes, how about I:

  1. make the changes you requested to #406, including the direct iostat= check for the read in strnum.F90 and removing the tests from intest7.F90, and we get that merged,
  2. then, we merge in your #409 which has the more thorough testing for strnum.F90

Please let me know if that works for you, or else if you still have some remaining concerns?

jbathegit commented 1 year ago

Hey Ed, I think I know why we're seeing the "STOP 300" now for the misc_8 tests in the MacOS and developer runs. In fact, in thinking about it some more, I'm surprised we're not seeing it for the other runs as well.

Simply put, not all of the routines in the library have an internal wrapper in them to seamlessly handle any type of integer. Jack and I only put these wrappers into the routines which are expected to be called by users from application codes, and nemtbax() isn't one of those. In other words, nemtbax() is only expected to ever be called internally from another routine (and where everything is always guaranteed to be 4-byte integers since that's the only way we build the library now ;-), so I'm not surprised that it's failing with an unexpected return value for inod when you now try to call it directly in the misc_8 test using 8-byte integers.

I think the way to fix this going forward is that you should only ever do the test_bort_8 tests for routines which have the aforementioned wrappers in them. Basically, the wrapper will look something like this in the routine if it's there:

C  CHECK FOR I8 INTEGERS
C  ---------------------

      IF(IM8B) THEN
         IM8B=.FALSE.

         (zero or more calls to X84)
         (recursive call to the same routine)
         (zero or more calls to X48)

         IM8B=.TRUE.
         RETURN
      ENDIF

And if it's not there, then you should only ever run the test_bort_4 case for that particular routine.

I think this may(?) also explain at least some of the SIGSEGV and SIGFPE errors that are also popping up in all of the runs (even the runs that are "passing" with the green check mark!). These are indeed "failing" inside of test_bort.F90 code; however, they're not failing like we want them to (by calling bort()) but rather because they're either hitting a SIGSEGV or SIGFPE. You have to dig a bit to find them, but they are there. We should probably chase all of those down as well, so that they fail the way we really want them to (i.e. by calling bort() ;-)

edwardhartnett commented 1 year ago

OK, that makes sense about the _8.

I will deal with that in this PR, stand by...

edwardhartnett commented 1 year ago

OK, if this looks OK, let's get it merged...

jbathegit commented 1 year ago

Hey Ed, this looks better, but I'm still seeing SIGSEGV and SIGFPE errors in the test output. I also noticed at least one address sanitizer fault in the developer run, which I'm guessing only "passed" b/c we want it to fail. But again, we actually want it to fail inside of bort, and not because of a SIGSEGV, SIGFPE, or a memory violation. I'm sure you're probably tired of working on this PR, but I really think these need to be cleaned up before we merge it.

Also, as a reminder, please don't run _8 tests on any routines which have at least one integer argument and which don't have the aforementioned "Check for I8 integers" wrapper inside of them. I see where you now have an ifdef KIND_4 switch in your test_misc code, but I think you also need one in your test_bort code as well; specifically around routines such as adn30, idn30, nemtba, nemtbb, nemtbd, numbck, pkb, pkb8, etc. which also don't have those 8-byte wrappers inside of them.

Again, the reason those routines don't have those 8-byte wrappers is because they aren't really intended to ever be called directly from an application code. Instead, they're expected to only ever be called internally inside of the library, and in which case they aren't needed because the library itself is now only built using 4-byte integers. So if we do want to test any of those routines directly, then we should only ever do so under the KIND_4 switch.

edwardhartnett commented 1 year ago

OK, let me track down those segfaults. I'll have another attempt for tomorrow morning. ;-)

edwardhartnett commented 1 year ago

OK, I have dealt with all segfaults.

To deal with the problems of testing functions that were not written to be directly called when linked to the _8 version of the library, I have reduced the testing to the _4 and _d versions. This will allow us to test the borts without worrying about the _8 calls for functions that don't cope with _8.

jbathegit commented 1 year ago

Good work!