Generate C++ bindings using templates

Here's one approach that should work: use the m4 generators to emit specializations for the standard C types, and add a generic templated version that falls back to shmem_putmem for any type those specializations don't cover. Note that for a single-element routine such as shmem_p, falling back to shmem_putmem may hurt performance. However, this fallback is never taken on an LP64 ABI, since there the standard C types cover every type in the SHMEM RMA and AMO APIs.

This would look something like:

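#include <cstddef>
#include <shmem.h>

// Generic fallback: treat the elements as raw bytes.
template <typename T>
inline void shmem_put(T* dest, const T* source,
                      size_t nelems, int pe) {
    shmem_putmem(dest, source, nelems*sizeof(T), pe);
}

// m4-generated specialization for a standard C type.
template <>
inline void shmem_put<int>(int *dest, const int *source,
                           size_t nelems, int pe) {
    shmem_int_put(dest, source, nelems, pe);
}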
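
To see which path a given type takes without linking against a SHMEM library, here is a minimal sketch of the same dispatch. It assumes local stand-ins for the SHMEM calls: `stub_putmem`, `stub_int_put`, `demo_put`, `Path`, `last_path`, and `Pair` are hypothetical names used only for illustration, and the stubs just copy locally and record which routine ran.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins for shmem_putmem / shmem_int_put (NOT the real API):
// they copy locally and record which routine was selected.
enum class Path { none, putmem, int_put };
static Path last_path = Path::none;

inline void stub_putmem(void* dest, const void* source,
                        std::size_t nbytes, int /*pe*/) {
    std::memcpy(dest, source, nbytes);
    last_path = Path::putmem;
}

inline void stub_int_put(int* dest, const int* source,
                         std::size_t nelems, int /*pe*/) {
    std::memcpy(dest, source, nelems * sizeof(int));
    last_path = Path::int_put;
}

// Generic fallback: any type without a specialization goes through the
// byte-oriented routine.
template <typename T>
inline void demo_put(T* dest, const T* source, std::size_t nelems, int pe) {
    stub_putmem(dest, source, nelems * sizeof(T), pe);
}

// Specialization for a standard C type, as the m4 generators would emit.
template <>
inline void demo_put<int>(int* dest, const int* source,
                          std::size_t nelems, int pe) {
    stub_int_put(dest, source, nelems, pe);
}

// A type not covered by any specialization (illustrative only).
struct Pair { short a; short b; };
```

Calling `demo_put` on `int` buffers should select the typed routine, while calling it on `Pair` buffers should take the byte-oriented fallback.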

While this will work for RMA operations, there is no generic "mem" version of the AMOs that we can use to cover gaps between the standard types and AMO types.
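
One possible way to handle that gap, offered here only as a sketch and not something proposed above, is to leave the primary AMO template deleted so that an uncovered type fails at compile time instead of silently getting a non-atomic fallback. The names `demo_atomic_fetch_add` and `stub_int_fetch_add` are hypothetical; the stub stands in for a typed SHMEM fetch-and-add routine and is neither remote nor atomic.

```cpp
// Hypothetical stand-in for a typed SHMEM fetch-and-add (NOT the real API):
// operates on local memory and is not actually atomic.
inline int stub_int_fetch_add(int* dest, int value, int /*pe*/) {
    int old = *dest;
    *dest += value;
    return old;
}

// Primary template is deleted: there is no generic "mem" AMO to fall back
// on, so a type without a specialization is rejected by the compiler.
template <typename T>
T demo_atomic_fetch_add(T* dest, T value, int pe) = delete;

// Specialization for a type that does have a typed AMO.
template <>
inline int demo_atomic_fetch_add<int>(int* dest, int value, int pe) {
    return stub_int_fetch_add(dest, value, pe);
}
```

With this shape, `demo_atomic_fetch_add(&counter, 5, 0)` compiles for an `int` counter, whereas the same call on a type with no specialization is a compile error. The trade-off is that portability gaps surface at build time rather than being papered over.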
