ESCOMP / CTSM

Community Terrestrial Systems Model (includes the Community Land Model of CESM)
http://www.cesm.ucar.edu/models/cesm2.0/land/

Add tests for derecho_intel-oneapi and transition over from intel testing #2476

Open ekluzek opened 7 months ago

ekluzek commented 7 months ago

In the SEA ISS meeting today, we learned that the standard intel compiler is now deprecated and will be removed in 2024-Q4. Most likely we will still be able to use intel for quite some time after that, but it also means that any compiler bugs we find in intel won't be fixed.

Other reasons to move to intel-oneapi are that it allows using GPUs, it protects us against future compiler changes, and it is based on LLVM, which adds a host of compiler robustness and language understanding to the compiler. It also has OpenMP-5, which allows GPU offloading using OpenMP directives. Since we already have OpenMP in CTSM, this would be a natural way for us to support GPU offloading in CTSM.

In #1995 we mention intel-oneapi and that we had problems. We need to look at that again and see what's happening. Now that Derecho is more stable, it's likely working better.

Definition of done (the first three of these are important for the CTSM5.3 timeline):

ekluzek commented 7 months ago

@jedwards4b told CSEG about this on March 5th. Here's his email about this...

Sometime in the coming months intel will be moving from the ifort compiler to the ifx compiler and dropping the ifort compiler. We already have access to the ifx compiler in the intel-oneapi compiler definition on derecho. I suggest that each component model add some tests using intel-oneapi to their test list.

ekluzek commented 7 months ago

Note this discussion

https://github.com/ESCOMP/CTSM/issues/1733#issuecomment-2059581719

where we note that intel-oneapi was not working on NVIDIA GPUs. It's also likely that nvhpc is just going to have better performance on Derecho's GPUs.

ekluzek commented 6 months ago

In the CSEG discussion of this, there is agreement that we should start moving over, but that it's not the highest priority. As stated here...

Brian’s feeling is: if component SEs have time to play with this now, go for it. But it’s not high priority yet. Note that ifx is evolving quickly.

It would be good though to try our entire testlist with intel-oneapi and just document what breaks.
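As a starting point for that, here is a minimal sketch of turning an existing list of test names into their intel-oneapi equivalents. It assumes the usual CIME-style naming, where the machine and compiler sit together in one dot-separated field (e.g. derecho_intel). The helper name `to_oneapi` and the two test names are only illustrative, not taken from our actual testlist.

```python
def to_oneapi(test_name: str, machine: str = "derecho") -> str:
    """Swap intel for intel-oneapi in one CIME-style test name.

    Hypothetical helper: assumes the machine_compiler pair is a single
    dot-separated field such as "derecho_intel".
    """
    old = f"{machine}_intel"
    fields = test_name.split(".")
    # Only an exact match on the machine_compiler field is rewritten, so
    # names that already use intel-oneapi (or gnu, nvhpc) pass through.
    return ".".join(f"{old}-oneapi" if field == old else field for field in fields)


# Illustrative test names only.
tests = [
    "SMS_D_Ld3.f10_f10_mg37.I1850Clm50BgcCrop.derecho_intel.clm-default",
    "ERS_Ld5.f10_f10_mg37.I2000Clm50Sp.derecho_gnu.clm-default",
]
converted = [to_oneapi(t) for t in tests]
for name in converted:
    print(name)
```

The converted names could then be run as a one-off, with the failures recorded here, before anything is changed in the testlist itself.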

ekluzek commented 5 days ago

In the CSEG meeting this was brought up again, as we will want to make this change CESM-wide within the next month or so.

We don't currently have any intel-oneapi tests, so we really need to start exploring this.