NOAA-EMC / NCEPLIBS-bufr

The NCEPLIBS-bufr library contains routines and utilites for working with the WMO BUFR format.
Other
42 stars 19 forks source link

more testing #535

Closed jbathegit closed 8 months ago

jbathegit commented 9 months ago

Part of #445

Note that the additional debufr test required a change to the source code for the utility, to ensure that allocated memory was freed following the execution of the newly-tested code. Otherwise, the address sanitizer would complain about memory leakage and segfault during the CI tests.

jbathegit commented 8 months ago

Now all of a sudden the MacOS runner is failing with a missing python module:

Capture

Just in case this was some temporary glitch, I tried rerunning it several times and on different days, but to no avail.

@AlexanderRichert-NOAA @aerorahul @climbfuji @edwardhartnett any suggestions?

jbathegit commented 8 months ago

BTW, I realize I could just change the -DENABLE_PYTHON=ON flag to -DENABLE_PYTHON=OFF in the cmake step of the build-bufr job in the MacOS.yml file, but I'm assuming somebody at some point had a good reason for enabling python testing on this platform(?)

Currently, we have python testing enabled in the develop, Linux, and MacOS runners, but it's currently disabled in the Intel runner.

jbathegit commented 8 months ago

And now all of a sudden the MacOs build is working again!?

jbathegit commented 8 months ago

In this latest commit I added more bort testing for nemtbb() and nemtbd(). But I'm struggling to write test cases for several of the bort options in those routines using dummied-up DX tables containing obvious errors. For example:

In summary, it seems to me we could just get rid of a number of these bort tests in the nemtbb() and nemtbd() routines, because unless I'm missing something they don't really seem to be serving any practical purpose. @jack-woollen please chime in if you have any thoughts on this - thanks!

jbathegit commented 8 months ago

Thanks @jack-woollen, but before I merge this PR (and which in turn will close it :-) do you have any thoughts on my above suggestions to eliminate some of those bort tests in nemtbb() and nemtbd(). Do you agree with my thought process, or can you think of something I'm missing in my analysis?

jack-woollen commented 8 months ago

Jeff, a few thoughts. First, it's not impossible for those bort conditions to happen, its just impossible to test them. Because the only way the impossible conditions are possible is all the rules go out the window such as memory clobbering due to bad inputs or bad coding, or some other bad things. I think these types of tests are worthwhile and do not at all subscribe to the idea that they are useless. The only reason I think is important enough to consider maybe getting rid of some of them would be if they cost significant amounts of time. I was in the process of starting to check that out when dogwood switched to production. I'll pick it up again later today. It may be useful to classify the bort statements into those that check on function and those that check on dysfunction, to see the ratio, and calculate the actual costs of each type.

edwardhartnett commented 8 months ago

If code cannot actually be reached, then it should be removed. Nor can we try to capture errors caused my random memory overwrites or other totally unpredictable conditions.

What often happens in a testing campaign is that you realize some code can never be reached. So remove that code!

Nothing should stay in the code just because "it does no harm." Every line of code costs money and time to write and more importantly maintain. So if the line of code is not doing anything useful, remove it, and cut code size and maintenance costs.

Consider this: right now @jbathegit has figured out that these lines can't be reached, and raised questions about them, and that's caused us all to take some time to look at the problem. That's not free. This code is already costing NOAA extra time and money. So let's resolve the problem - if the lines are not needed, remove them.

Then, in 5 years, when this issue is long forgotten, NOAA won't incur the same costs by some programmer wondering if those lines of code could ever be reached and raising the whole issue all over again. Which will once again cause expense for NOAA for lines of code that we don't need.

jbathegit commented 8 months ago

After giving this some more thought, I'm leaning more towards Ed's way of thinking about this. I hear what you're saying Jack, but I agree with Ed that we shouldn't retain code that we can't test, and that all bets are off anyway if something unpredictable happens.

So, in the near future, I'll push up another commit with those unreachable bort tests removed.

jbathegit commented 8 months ago

I've made the aforementioned nemtbb and nemtbd changes to remove those untestable bort cases, and I've also merged in the changes from #539. But now I'm once again seeing CI errors for MacOS (with a different Python build error) and Intel (with cmake now apparently trying to build the oneAPI environment).

@AlexanderRichert-NOAA not sure if this is related in any way to what you're working on in #536, but obviously none of that has been merged into this branch yet, so I'm a bit stumped.