ROCm / hipamd

35 stars 37 forks source link

Vector types operators taking a second arbitrary argument are too greedy #4

Open Oblomov opened 2 years ago

Oblomov commented 2 years ago

This is again an issue that came up trying to introduce a HIP backend support in GPUSPH. We have Point and Vector classes that can be constructed from vector types, equipped with operators that allow adding (etc) them together.

This means that code such as this: double3 v ; Point pt ; Point n = v + pt works because the v gets converted to a Point and Point+Point is a defined operation —at least in CUDA and on CPU.

This is however does not work when using HIP vector types, because of definitions such as:

    template<typename T, unsigned int n, typename U>
    __HOST_DEVICE__
    inline
    constexpr
    HIP_vector_type<T, n> operator+(
        const HIP_vector_type<T, n>& x, U y) noexcept
    {
        return HIP_vector_type<T, n>{x} += HIP_vector_type<T, n>{y};
    }

or the one with the flipped arguments. These take precedence of the conversion-enabled Point+Point operator, and result in an error because HIP_vector_type<T, n> cannot be constructed from a Point.

The solution is to enable these operators only for types for which the conversion is defined, for example using appropriate enable_if fencing:

    template<typename T, unsigned int n, typename U>
    __HOST_DEVICE__
    inline
    constexpr
    typename std::enable_if<std::is_convertible<U, HIP_vector_type<T, n>>::value, HIP_vector_type<T, n>>::type
    operator+(
        const HIP_vector_type<T, n>& x, U y) noexcept
    {
        return HIP_vector_type<T, n>{x} += HIP_vector_type<T, n>{y};
    }

This should be done for all these operators. Additionally, the U argument should be a const& to avoid unnecessary invocations of copy constructors.