Closed AlexanderSinn closed 3 weeks ago
Should we have extern template class IntVectND<AMREX_SPACEDIM>
, which might help reduce compile time?
Wouldn't that prevent the member functions from getting inlined, which could be bad when IntVect is used in a parallel for?
My understanding is it should not affect run time performance. But I might be wrong.
It is also not clear how GPU compilers treat it. So probably a bad idea and let's not worry about it.
On Tue, Jun 11, 2024, 10:05 AM Alexander Sinn @.***> wrote:
Wouldn't that prevent the member functions from getting inlined, which could be bad when IntVect is used in a parallel for?
— Reply to this email directly, view it on GitHub https://github.com/AMReX-Codes/amrex/pull/3969#issuecomment-2161235829, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB37TYKFM7RBL3YJMPHJYYTZG4U4VAVCNFSM6AAAAABIUM7VYCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNRRGIZTKOBSHE . You are receiving this because your review was requested.Message ID: @.***>
I tried to add extern template class IntVectND<AMREX_SPACEDIM>;
however, without adding template class IntVectND<AMREX_SPACEDIM>;
to the cpp file. I expected to get a linker error however it compiled just fine.
Maybe it's because all those functions have already been inlined.
diff --git a/Src/Base/AMReX_IntVect.H b/Src/Base/AMReX_IntVect.H
index 70a12811a2..7c472742ae 100644
--- a/Src/Base/AMReX_IntVect.H
+++ b/Src/Base/AMReX_IntVect.H
@@ -710,6 +710,9 @@ public:
return IntVectND<dim>(std::numeric_limits<int>::lowest());
}
+ static const IntVectND<dim> Zero;
+ static const IntVectND<dim> Unit;
+
AMREX_GPU_HOST_DEVICE AMREX_FORCE_INLINE
static constexpr std::size_t size () noexcept {
return static_cast<std::size_t>(dim);
@@ -767,6 +770,13 @@ private:
int vect[dim] = {};
};
+
+template <>
+inline constexpr IntVectND<AMREX_SPACEDIM> IntVectND<AMREX_SPACEDIM>::Zero{0};
+
+template <>
+inline constexpr IntVectND<AMREX_SPACEDIM> IntVectND<AMREX_SPACEDIM>::Unit{1};
+
// Template deduction guide for IntVectND
template<std::size_t dim>
AMREX_GPU_HOST_DEVICE // __device__ for HIP
This seems to work. So we could get IntVect::Zero and IntVect::Unit back for backward compatibility.
Unfortunately template <int dim> inline constexpr IntVectND<dim> IntVectND<dim>::Zero{0}
might only work for gcc. https://stackoverflow.com/questions/72409091/define-a-static-constexpr-member-of-same-type-of-a-template-class
Unfortunately, nvcc does not work.
Maybe instead of being IntVectND, Zero and Unit could have a different type that is implicitly convertible to IntVectND. This would allow them to be static constexpr
.
That may not work in cases like
IntVect::Zero + sign(a_side) * 2 * BASISV(idir);
This is copied from a real code.
Maybe we can try to figure out how to make nvcc happy. The error message is
/home/runner/work/amrex/amrex/Src/Base/AMReX_IntVect.H:775:34: error: conflicting declaration ‘amrex::IntVectND<3> amrex::IntVectND<3>::Zero’
775 | inline constexpr IntVectND<AMREX_SPACEDIM> IntVectND<AMREX_SPACEDIM>::Zero{0};
| ^~~~~~~~~~~~~
/home/runner/work/amrex/amrex/Src/Base/AMReX_IntVect.H:713:24: note: previous declaration as ‘const amrex::IntVectND<3> amrex::IntVectND<3>::Zero’
713 | static const IntVectND<dim> Zero;
| ^~~~
Maybe it wants constexpr const
?
It seems to work if both are just const with no constexpr.
Great. Maybe it will even work for all <int dim>
.
constexpr const actually works somehow
@ax3l This PR will break pyamrex. How do you want to handle this? Should we wait till pyamrex has a PR ready?
Thanks for the ping. That is all we need to track.
Please go ahead here to merge and we update pyAMReX shortly after.
pyAMReX update coming via https://github.com/AMReX-Codes/pyamrex/pull/332 :)
Summary
As described in #3955, this PR converts IntVect to the n dimensional IntVectND and adds
using IntVect = IntVectND<AMREX_SPACEDIM>;
. Additionally, deduction guides, support for structured bindings and the helper functionsIntVectCat
,IntVectSplit
andIntVectResize
were added to IntVectND.I didn't find a way to make
IntVect::Zero
andIntVect::Unit
work with n dimensions, so I removed them (converting them tostatic constexpr
orinline static
doesn't work with MSVC). Instead,IntVect::TheZeroVector()
andIntVect::TheUnitVector()
can be used.Additional background
Using structured binding support of
amrex::GpuTuple
, the following code should work (on GPU):Checklist
The proposed changes: