NOAA-EMC / NCEPLIBS-bufr

The NCEPLIBS-bufr library contains routines and utilites for working with the WMO BUFR format.
Other
42 stars 19 forks source link

Add IntelLLVM support #536

Closed AlexanderRichert-NOAA closed 3 weeks ago

AlexanderRichert-NOAA commented 8 months ago

This PR adds IntelLLVM (OneAPI) support, including a test workflow. No further changes to Intel flags, etc. were needed except that one single unit test (test_c_interface_2) was segfaulting, and was resolved by adding some explicit "int" conversions in the c2f interface.

Fixes #538

edwardhartnett commented 8 months ago

Can you add an issue for this please, assigned to the next release? And add it to the release notes for the next release...

AlexanderRichert-NOAA commented 8 months ago

@jbathegit @jack-woollen I'm going to see if I can also take care of the MacOS/Python issue in this PR, but meanwhile-- In order to fix segfaults in one of the CI tests for Intel OneAPI (all the others were fine out of the gate), I had to modify src/bufr_c2f_interface.F90. Specifically, it for some reason required explicitly converting some of the c_int-type intent=in params to regular fortran int's for a number of the subroutine calls. I checked and there was no size difference between the two types (both gave SIZEOF()=4), that's as far as my investigation went though. I'm not sure if this reflects a compiler bug or an issue with the code, but in any case let me know if this looks sensible.

Edit: I'll do the python/MacOS issue separately, as it appears that will be a bit of a rabbit hole.

jbathegit commented 8 months ago

@AlexanderRichert-NOAA I really appreciate you working on this, since we are going to need to be able to support and test the library under the newer Intel compilers in the very near future! #539 has now been merged as you requested, but now the Intel tests are failing.

On a related note, I don't really understand why those int() wrappers should be needed in bufr_c2f_interface.F90, especially since:

  1. You apparently didn't also need them to wrap similar c_int value arguments for calls to iupb (see iupb_f) or istdesc (see istdesc_f) within the same source file. Those particular functions aren't called by test_c_interface_2, but they are called from other C codes in the library (i.e. from utils/xbfmg.c and src/restd.c, respectively), so if those wrappers are really needed then why didn't those codes similarly segfault in test/test_scripts/test_xbfmg.sh, etc.?
  2. It also doesn't make sense to me why you'd also need an int() wrapper around the argument to ibfms (see ibfms_f), because that argument is supposed to be a c_double, so I would think that downgrading that argument to an int would completely muck up that interface right from the get-go.

Or am I missing something here?

jbathegit commented 8 months ago

Also, and since this PR introduces separate new CI tests for Intel / Intel (CC=icx FC=ifx) and Intel / Intel (CC=icc FC=ifort), then why do we still have the separate old CI test for Intel, and which apparently is still stuck waiting to run?

AlexanderRichert-NOAA commented 8 months ago

I got the Intel tests working, but the MacOS one is sporadically failing so I'm working on that (it appears to be running some steps multiple times which is... very confusing).

I'll have to look into the int() stuff some more, maybe see if it's some kind of issue with the compiler. I may try to come up with a test case and run it by the Intel OneAPI developers. As for question 2, I may have used the wrong function there so I'll look at that in the process.

AlexanderRichert-NOAA commented 8 months ago

The "Intel" test is showing up in the list there because it's a required test therefore it's "expecting" it to run even though it's been removed from the CI. Once this PR is good to go, we will just need to go into the branch protection settings, remove the old test, and require the new Intel tests instead.

jbathegit commented 8 months ago

Hey @AlexanderRichert-NOAA, I can clearly see how much effort you've been putting into trying to get this resolved. Please know that it's much appreciated, and please let me know if I can be of any further help!

AlexanderRichert-NOAA commented 7 months ago

@jbathegit the issue with the C bindings/data sizes is a bug in the OneAPI Fortran compiler. The Intel folks weren't able to give a timeframe on a fix, so I'm leaning toward closing this for now and circling back when there's a working version of the compiler. On the other hand, if you think it would be useful to have this support sooner, or for that matter if we needed to transition to OneAPI before the compiler is fixed, then I can always restructure things in order to have minimal impact on the code as far as the other compilers (i.e., use a preprocessor def to only apply a fix when compiling with Intel OneAPI).

No word yet from the f2py developers about that bug, so for now we just have to avoid Python 3.12.

jbathegit commented 7 months ago

Thanks Alex, and yes I'm fine with closing this PR and then just re-opening a new one to address #538 once we have a working version of the compiler.

AlexanderRichert-NOAA commented 7 months ago

Closing while waiting for OneAPI and f2py fixes

AlexanderRichert-NOAA commented 1 month ago

@jbathegit @edwardhartnett now that OneAPI 2024.2 is out, this PR is good to go on my end. If/when it looks good, please squash-and-merge so we don't put my 10 million debugging commits into develop :)

jbathegit commented 3 weeks ago

This looks fine @AlexanderRichert-NOAA, and thanks again for working on this!

Per the earlier discussion (see Nov 22, 2023 above) I was able to go in and remove the old "Intel" test from the develop branch protection settings, but for whatever reason I don't see how to now add "Intel / Intel (CC=icx FC=ifx)" and "Intel / Intel (CC=icc FC=ifort)" as new required tests. I tried doing a search on "Intel" in the status checks search bar to try and add them in that way, but for whatever reason it keeps popping me back out to the branch protection rules main page when I try to do that(?)

Either way, I can't merge this PR until this is resolved, because otherwise the merging is blocked. If I'm missing something here please let me know - thanks!

jbathegit commented 3 weeks ago

OK once the system refreshed (after I removed "Intel" as a required status check), I could now do the squash+merge for this PR. However, I think we should still have the two new Intel checks as "Required" rather than just b/c we happen to have an Intel.yml in the repository under .github/workflows. But I still can't figure out how to do that - here's what I see on that page:

Capture

So how do I now add the two new "Intel / Intel (CC=icx FC=ifx)" and "Intel / Intel (CC=icc FC=ifort)" as "Required" checks?

AlexanderRichert-NOAA commented 3 weeks ago

You should be able to search for them in the search box there, click on the test(s) you want to require, and hit save at the bottom of the page. If the tests aren't showing up let me know and I can take a crack at it.

jbathegit commented 3 weeks ago

Hmm, OK that's exactly what I was trying to do before (i.e. use the search box for "Intel"), but whenever I did that it kept popping me back out to the main page. But now it's working, so maybe I just had to wait for whatever system refresh also removed the earlier merge block(?)

Anyhoo, all good now - and thanks again!