UoB-HPC / BabelStream

STREAM, for lots of devices written in many programming models

Fortran ports #135

Closed: jeffhammond closed this 1 year ago

jeffhammond commented 1 year ago

This is a new implementation of BabelStream using Fortran.

The code uses a Fortran driver that is largely equivalent to the C++ one, with a few exceptions. Most notably, it does not use a class for the stream object as the C++ version does, since that doesn't seem like a useful way to do things in Fortran. Instead, I use a module that contains the same methods, plus alloc and dealloc routines that act like a constructor and destructor.
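As a rough illustration of the module-instead-of-class idea, here is a minimal sketch. It is not the code in this PR: the module name `sketch_stream`, the routine names other than `alloc` and `dealloc`, and the single triad kernel are all assumptions made for the example.

```fortran
module sketch_stream
    use, intrinsic :: iso_fortran_env, only: real64, int64
    implicit none
    private
    public :: alloc, dealloc, init_arrays, triad, a_sum

    ! Module-level state plays the role of the C++ class members.
    real(kind=real64), allocatable :: A(:), B(:), C(:)
    integer(kind=int64) :: N = 0

contains

    ! Acts like a constructor: acquires the three arrays.
    subroutine alloc(array_size)
        integer(kind=int64), intent(in) :: array_size
        integer :: err
        N = array_size
        allocate(A(N), B(N), C(N), stat=err)
        if (err /= 0) error stop 'allocation failed'
    end subroutine alloc

    ! Acts like a destructor: releases whatever alloc acquired.
    subroutine dealloc()
        if (allocated(A)) deallocate(A)
        if (allocated(B)) deallocate(B)
        if (allocated(C)) deallocate(C)
        N = 0
    end subroutine dealloc

    subroutine init_arrays(initA, initB, initC)
        real(kind=real64), intent(in) :: initA, initB, initC
        A = initA
        B = initB
        C = initC
    end subroutine init_arrays

    ! One representative kernel: A = B + scalar * C.
    subroutine triad(scalar)
        real(kind=real64), intent(in) :: scalar
        integer(kind=int64) :: i
        do i = 1, N
            A(i) = B(i) + scalar * C(i)
        end do
    end subroutine triad

    ! Helper so a driver can check results without touching private state.
    function a_sum() result(s)
        real(kind=real64) :: s
        s = sum(A)
    end function a_sum

end module sketch_stream

program sketch_driver
    use, intrinsic :: iso_fortran_env, only: real64, int64
    use sketch_stream
    implicit none

    call alloc(1024_int64)
    call init_arrays(0.0_real64, 0.2_real64, 0.1_real64)
    call triad(0.4_real64)
    print *, 'sum(A) = ', a_sum()
    call dealloc()
end program sketch_driver
```

Because the arrays are module variables rather than components of a derived type, there is only one stream object per program, which is all the benchmark driver needs.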

The current implementations are:

I have tested with GCC, Intel (ifort and ifx), and NVHPC compilers on AArch64, x86_64 and NVIDIA GPU targets, although not exhaustively.

The current build system is GNU Make, and requires the user to manually specify the compiler and implementation. I have not done, and will not do, anything related to CMake.

~The only thing missing now is CSV output.~ CSV printing is supported.

jefflarkin commented 1 year ago

Comment to the reviewers: For the array version, would you consider adding !$acc kernels around the sections of array syntax? In theory, it would also be valid to put an !$omp workshare around them for OpenMP compilers. I'm not sure whether any compilers will auto-parallelize, much less auto-offload, the array syntax, but a variety of compilers will happily support one or both of those parallelization hints.
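To make the suggestion concrete, here is a minimal sketch of array syntax wrapped in an OpenACC `kernels` region. The module and routine names are hypothetical and this is not taken from the PR. When OpenACC is not enabled at compile time the directives are ordinary comments, so the same source still builds and runs serially.

```fortran
module sketch_acc_array
    use, intrinsic :: iso_fortran_env, only: real64
    implicit none
contains

    ! Triad written with array syntax; the kernels region asks an
    ! OpenACC compiler to parallelize (and possibly offload) it.
    subroutine triad_kernels(a, b, c, scalar)
        real(kind=real64), intent(out) :: a(:)
        real(kind=real64), intent(in)  :: b(:), c(:)
        real(kind=real64), intent(in)  :: scalar
        !$acc kernels
        a = b + scalar * c
        !$acc end kernels
    end subroutine triad_kernels

end module sketch_acc_array

program sketch_acc_driver
    use, intrinsic :: iso_fortran_env, only: real64
    use sketch_acc_array
    implicit none
    integer, parameter :: n = 1024
    real(kind=real64), allocatable :: a(:), b(:), c(:)

    allocate(a(n), b(n), c(n))
    b = 0.2_real64
    c = 0.1_real64
    call triad_kernels(a, b, c, 0.4_real64)

    ! Every element should be 0.2 + 0.4 * 0.1, up to rounding.
    if (any(abs(a - 0.24_real64) > 1.0e-12_real64)) error stop 'triad check failed'
    print *, 'a(1) = ', a(1)
    deallocate(a, b, c)
end program sketch_acc_driver
```

This version relies on the compiler's implicit data movement for `a`, `b` and `c`; a real benchmark would more likely keep the arrays resident on the device with explicit data directives so that transfers are not timed.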

tomdeakin commented 1 year ago

@jefflarkin - definitely worth exploring what happens in both of those cases for sure!

jeffhammond commented 1 year ago

@jefflarkin See OpenACCArrayStream.F90

jeffhammond commented 1 year ago

OpenMPWorkshareStream.F90 is there too now, but it should not offload, because OpenMP does not have omp target workshare yet, as far as I know. It should, but someone has to argue with the workshare haters on the committee.
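For readers unfamiliar with the construct, a host-only workshare version of one kernel might look like the sketch below. The names are hypothetical and this is not the contents of OpenMPWorkshareStream.F90. It uses the combined `parallel workshare` construct, which divides the array assignment among the threads of a CPU team.

```fortran
module sketch_omp_workshare
    use, intrinsic :: iso_fortran_env, only: real64
    implicit none
contains

    ! Triad in array syntax, shared among host threads by workshare.
    ! Compiled without OpenMP, the directives are plain comments.
    subroutine triad_workshare(a, b, c, scalar)
        real(kind=real64), intent(out) :: a(:)
        real(kind=real64), intent(in)  :: b(:), c(:)
        real(kind=real64), intent(in)  :: scalar
        !$omp parallel workshare
        a = b + scalar * c
        !$omp end parallel workshare
    end subroutine triad_workshare

end module sketch_omp_workshare

program sketch_omp_driver
    use, intrinsic :: iso_fortran_env, only: real64
    use sketch_omp_workshare
    implicit none
    integer, parameter :: n = 1024
    real(kind=real64), allocatable :: a(:), b(:), c(:)

    allocate(a(n), b(n), c(n))
    b = 0.2_real64
    c = 0.1_real64
    call triad_workshare(a, b, c, 0.4_real64)

    ! Every element should be 0.2 + 0.4 * 0.1, up to rounding.
    if (any(abs(a - 0.24_real64) > 1.0e-12_real64)) error stop 'triad check failed'
    print *, 'a(n) = ', a(n)
    deallocate(a, b, c)
end program sketch_omp_driver
```

How well the work is actually divided is up to the implementation, so measured bandwidth can differ noticeably between compilers.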

jeffhammond commented 1 year ago

I removed the IEEE NaN check from this branch, because I did it wrong. It is now fixed, and there are other improvements in other branches, which I'll merge after the paper is accepted. I don't want to break the branch linked in the paper for obvious reasons.
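For context, one standard way to write such a check uses the intrinsic `ieee_arithmetic` module. The sketch below is an assumption about what a NaN check could look like, not the code that was removed or the fixed version in the other branches; the module and function names are hypothetical.

```fortran
module sketch_nan_check
    use, intrinsic :: iso_fortran_env, only: real64
    use, intrinsic :: ieee_arithmetic, only: ieee_is_nan
    implicit none
contains

    ! True if any element of x is a NaN. ieee_is_nan is elemental,
    ! so it applies to the whole array at once.
    function has_nan(x) result(found)
        real(kind=real64), intent(in) :: x(:)
        logical :: found
        found = any(ieee_is_nan(x))
    end function has_nan

end module sketch_nan_check

program sketch_nan_driver
    use, intrinsic :: iso_fortran_env, only: real64
    use, intrinsic :: ieee_arithmetic, only: ieee_value, ieee_quiet_nan, &
                                             ieee_support_nan
    use sketch_nan_check
    implicit none
    real(kind=real64) :: a(4)

    a = 1.0_real64
    print *, 'clean array has NaN:    ', has_nan(a)

    ! Only plant a NaN if the processor supports NaNs for this kind.
    if (ieee_support_nan(a)) then
        a(3) = ieee_value(a(3), ieee_quiet_nan)
        print *, 'poisoned array has NaN: ', has_nan(a)
    end if
end program sketch_nan_driver
```

Aggressive floating-point optimization flags that assume finite math can cause a compiler to optimize NaN tests away, so a check like this is only reliable when the build preserves IEEE semantics.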

I've now implemented CSV printing, and the REAL32 and INT32 options both seem to be working too.

jeffhammond commented 1 year ago

This is ready for review. I merged in all the good stuff done post-submission.