Closed MarcelKoch closed 1 month ago
Inlining happens much later than any checks for function attributes, so this should not be an issue at all. If zero
and one
are constexpr
, they can be called from device code. If the function they themselves are calling is not callable from device code, the compiler should complain.
but even if the function is __device__ __host__
but not constexpr, the zero/one function should behave like normal function in __host__
because of no attribute.
no matter if function is device callable or not, device kernel should not be able to call zero/one in this case
The one
and zero
functions are already marked constexpr
and we can't remove it, since that would be an interface break. So the only question is if __device__ __host__
is necessary, which I argue it is not due to cuda's compiler option. AFAIK, it doesn't matter if the constexpr
function is actually evaluated at compile time.
If these functions can't be constexpr
for gko::half
, or __half
, then this needs to be addressed anyway, but not in this PR.
I know constexpr does not always need to be evaluated in compile time.
when constexpr is failed, I thought __device__ __host__
will be necessary.
So the original one will work no matter if __half
is constexpr construable or not.
it sounds like just stronger inline in some sense.
To make the terminology a bit clearer: A function is marked constexpr
if it has the attribute next to its name - that's what nvcc
cares about. A function is constexpr
if it is marked constexpr
and can be evaluated at compile-time. For us, only the former matters right now.
Issues
16 New issues
0 Accepted issues
Measures
0 Security Hotspots
80.0% Coverage on New Code
0.0% Duplication on New Code
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 89.96%. Comparing base (
532566d
) to head (98f29f6
). Report is 3 commits behind head on develop.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
This PR removes
GKO_ATTRIBUTES
from constexpr functions in math.hpp. Since we build cuda with--expt-relaxed-constexpr
theconstexpr
is already enough to allow those functions on the device.Fixes #1693.