Fix return type for atomicAdd with unsigned long long operands

ROCm / HIP-CPU

An implementation of HIP that works on CPUs, across OSes.

MIT License

107 stars 19 forks source link

Fix return type for atomicAdd with unsigned long long operands #52

Closed fwyzard closed 3 months ago

fwyzard commented 6 months ago

Fix the return type for

atomicAdd(unsigned long long* address, unsigned long long val)

from std::uint64 to unsigned long long.

This overload is enabled only if std::uint64 and unsigned long long are different types, so the return type should be unsigned long long.

fwyzard commented 6 months ago

OTOH, is the template-based check really needed ? If unsigned long long and std::uint64 are the same type, the second inline function would simply be an identical redefinition of the first one, which should not be an issue ?

inline
std::uint64_t atomicAdd(std::uint64_t* address, std::uint64_t val) noexcept
{
    return hip::detail::atomic_add(address, val);
}

inline
unsigned long long atomicAdd(unsigned long long* address, unsigned long long val) noexcept
{
    return hip::detail::atomic_add(address, val);
}

fwyzard commented 6 months ago

In fact, the latter seems a more robust solution, because it supports the cases where val is not an unsigned long long but can be converted to it - while the current code and version in the PR does not.

AlexVlx commented 4 months ago

This was discussed in the past, and the conclusion is that I made a mistake using the sized integer types here, because it introduces divergence versus the HIP / CUDA interfaces. Hence, we probably want to just remove uses of the sized integer typedefs and match the original HIP interface.

AlexVlx commented 4 months ago

This is the prior discussion on the topic: #50.

fwyzard commented 4 months ago

I see.

Do you prefer a PR that replaces all std::int.../std::uint... style types with the plain int/unsigned int etc. ones ?

AlexVlx commented 4 months ago

I see.

Do you prefer a PR that replaces all std::int.../std::uint... style types with the plain int/unsigned int etc. ones ?

As far as preferences go, that would be my choice, however it's more work for you. So if you're feeling particularly generous, it'd be welcome, if not we can just merge this as is and revisit/cleanup later:)

fwyzard commented 4 months ago

No problem - just one question:

The ROCm/HIP headers have, for example

int atomicAdd(int* address, int val);
unsigned int atomicAdd(unsigned int* address, unsigned int val);
unsigned long atomicAdd(unsigned long* address, unsigned long val);
unsigned long long atomicAdd(unsigned long long* address, unsigned long long val);
float atomicAdd(float* address, float val);
double atomicAdd(double* address, double val);

Should I put the unsigned long case as well ?

fwyzard commented 4 months ago

I've tried to implement:

all cases that were there before (e.g. atomicCAS with short)
all cases supported by HIP/ROCm, except for some floating point ones (e.g. atomicCAS with float or double arguments is not implemented)
all cases supported by CUDA, even if they are not in HIP/ROCm (e.g. atomicAnd with long long arguments)

Differences with respect to HIP/ROCm are marked with a FIXME.

AlexVlx commented 3 months ago

@fwyzard apologies for taking a bit long on this one. The change looks good in principle but the CI is showing that it's tickling Windows in the wrong way, let me figure out why so that we can fix it.

fwyzard commented 3 months ago

OK, sure. Unfortunately I cannot help with that, as I don't have a Windows machine.