WIP: support massive input sizes

tomdeakin commented 2 years ago

As main memory sizes increase, we are seeing errors for very large input sizes passed in via the --arraysize command-line argument. The value is read into an int, which can only store values up to 2,147,483,647 (2^31 - 1).

First, the main.cpp driver code is updated to parse and store the input as an intptr_t. This type is wide enough to hold a pointer, so the array size can range over all of addressable memory.
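As a rough illustration of the parsing change, here is a minimal sketch. The helper name parse_array_size and its error handling are assumptions made for this example, not the code in main.cpp; it only shows the idea of reading the text after --arraysize into an intptr_t and rejecting junk, non-positive values and overflow.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <limits>

// Hypothetical helper (not BabelStream's actual parser): converts the text
// following --arraysize into an intptr_t. Returns false and leaves `out`
// untouched if the text is not a positive integer that fits in intptr_t.
bool parse_array_size(const char* text, std::intptr_t& out)
{
  if (text == nullptr) return false;

  errno = 0;
  char* end = nullptr;
  const long long value = std::strtoll(text, &end, 10);

  if (errno == ERANGE) return false;                  // does not fit in long long
  if (end == text || *end != '\0') return false;      // empty or trailing junk
  if (value <= 0) return false;                       // array size must be positive
  if (value > std::numeric_limits<std::intptr_t>::max())
    return false;                                     // does not fit in intptr_t

  assert(value > 0);
  out = static_cast<std::intptr_t>(value);
  return true;
}
```

On a platform with a 64-bit intptr_t, a call such as parse_array_size("4294967296", n) would be expected to succeed, whereas that value cannot be stored in a 32-bit int.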

Secondly, the OpenMP implementation is updated to:

Store the array size as an intptr_t inside the class (note that implementations own their own data and array size).
Update the loop bounds of all kernels to use intptr_t indexing. Note that we want to keep using a signed data type: the loop is then countable, which makes vectorisation easier for compilers (e.g. MSVC, Intel).
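The kernel change can be sketched roughly as follows. These free functions are simplified stand-ins written for this example (the real implementation keeps the arrays and array size inside a class); they only show a signed intptr_t induction variable in OpenMP worksharing loops.

```cpp
#include <cassert>
#include <cstdint>

// Simplified triad kernel: a[i] = b[i] + scalar * c[i].
// The loop counter and bound are both intptr_t, so the index is signed
// and wide enough for arrays with more than 2^31 - 1 elements.
void triad(double* a, const double* b, const double* c,
           double scalar, std::intptr_t array_size)
{
  assert(array_size >= 0);
  #pragma omp parallel for
  for (std::intptr_t i = 0; i < array_size; i++)
  {
    a[i] = b[i] + scalar * c[i];
  }
}

// Simplified dot kernel with an OpenMP reduction over the same index type.
double dot(const double* a, const double* b, std::intptr_t array_size)
{
  assert(array_size >= 0);
  double sum = 0.0;
  #pragma omp parallel for reduction(+:sum)
  for (std::intptr_t i = 0; i < array_size; i++)
  {
    sum += a[i] * b[i];
  }
  return sum;
}
```

Built without OpenMP enabled, the pragmas are ignored and the loops run serially, which is enough to check the indexing logic.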

Finally, the starting literal values need the long double suffix: e.g. 0.2L.

This PR currently updates only the OpenMP code. This change needs propagating to the other models.

Progress

[ ] OpenACC
[ ] CUDA
[ ] HIP
[ ] Java ???
[ ] Julia ???
[ ] Kokkos
[ ] OpenCL
[X] OpenMP
[ ] Raja
[ ] Rust ???
[ ] Scala
[ ] std-data (C++ stdpar)
[ ] std-indices (C++ stdpar)
[ ] std-ranges (C++ stdpar)
[ ] SYCL
[ ] SYCL 2020
[ ] TBB
[ ] Thrust

tom91136 commented 2 years ago

@tomdeakin I wonder whether we should template/macro the induction variable's type and make it a runtime option (e.g. --counter=uint32_t|uint64_t|intptr_t). It's quite possible that changing the loop induction variable's type will alter how the loop is optimised (positively or negatively), so leaving it as an option allows us to compare with results from before this change.
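One possible shape for this idea, purely as a sketch: the kernel is templated on the index type and a small dispatcher picks the instantiation from a string. The function names and the dispatcher are hypothetical, and the --counter option is only the proposal above, not an existing BabelStream flag.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>

// Copy kernel templated on the induction variable's type.
template <typename Index>
void copy_kernel(const double* a, double* c, Index n)
{
  for (Index i = 0; i < n; i++)
  {
    c[i] = a[i];
  }
}

// Hypothetical dispatcher: `counter` would come from a --counter option.
// Returns false if the type name is unknown or n does not fit in that type.
bool run_copy(const std::string& counter, const double* a, double* c, std::uint64_t n)
{
  assert(a != nullptr && c != nullptr);
  if (counter == "uint32_t")
  {
    if (n > std::numeric_limits<std::uint32_t>::max()) return false;
    copy_kernel<std::uint32_t>(a, c, static_cast<std::uint32_t>(n));
  }
  else if (counter == "uint64_t")
  {
    copy_kernel<std::uint64_t>(a, c, n);
  }
  else if (counter == "intptr_t")
  {
    if (n > static_cast<std::uint64_t>(std::numeric_limits<std::intptr_t>::max())) return false;
    copy_kernel<std::intptr_t>(a, c, static_cast<std::intptr_t>(n));
  }
  else
  {
    return false;
  }
  return true;
}
```

Each instantiation is compiled separately, so the compiler is free to optimise each index type differently, which is the effect the comparison would be measuring.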

tomdeakin commented 2 years ago

We definitely don't want to offer unsigned types here. We need signed types to help with vectorisation, and don't want to suggest that using unsigned is best practice. Given that, we have limited choices of the type we should use.

intptr_t
int64_t
long long

intptr_t is guaranteed to be wide enough to hold a pointer to any place in memory, so index and pointer arithmetic should always be valid.
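A small sketch of how the candidate types could be compared: the static_asserts check that all three are signed, and the helper can_index (a name invented for this example) reports whether a given element count fits in an index type.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// All three candidates are signed, which is the property wanted for vectorisation.
static_assert(std::is_signed<std::intptr_t>::value, "intptr_t should be signed");
static_assert(std::is_signed<std::int64_t>::value, "int64_t should be signed");
static_assert(std::is_signed<long long>::value, "long long should be signed");

// Hypothetical helper: true if an array of n elements can be indexed by Index,
// i.e. n does not exceed the largest value Index can represent.
template <typename Index>
constexpr bool can_index(unsigned long long n)
{
  return n <= static_cast<unsigned long long>(std::numeric_limits<Index>::max());
}
```

With a 32-bit int, can_index<int> rejects any count above 2,147,483,647, while int64_t and long long accept counts far beyond that; the result for intptr_t depends on the platform's pointer width.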

tomdeakin commented 9 months ago

Resolve this for v6.0 - I'm not 100% sure that using intptr_t is the right approach to solve this. We need to review other approaches to this (e.g., Kokkos, C++).

tomdeakin commented 1 month ago

Closing as efforts are superseded by the approach in #188

UoB-HPC / BabelStream

WIP: support massive input sizes #127
