Open psiha opened 2 years ago
The idea is sound as we discussed during the conference, how tobkeep those info as IIRC typzdef and attributes dont mix well.
The idea is sound as we discussed during the conference, how tobkeep those info as IIRC typzdef and attributes dont mix well.
You can use 'decorating pass-through' functions instead of typedefs (a la std::assume_aligned):
template < int alignment >
struct aligned
{
template < typename T >
static constexpr auto get( T * const ptr ) noexcept { return static_cast< T * >( __builtin_assume_aligned( ptr, alignment ) ); }
};
struct restricted
{
template < typename T >
static constexpr auto get( T * __restrict const ptr ) noexcept { return ptr; }
};
struct aliased
{
template < typename T >
static constexpr auto get( T * const ptr ) noexcept
{
using aliased_T = T __attribute__(( __may_alias__ ));
return reinterpret_cast< aliased_T * >( ptr );
}
};
and then in the view class template, for example, never access the pointer directly but always through a getter that first passes the pointer through the 'attribute decorators'...
The decorators could be specified as part of the options variadic pack:
view<float, extent()(), aligned<64>, restricted >
or as wrappers of the value type:
view<restricted<aligned<float, 64>>, extent()()>
I was thinking maybe use the decorators directly when passing the pointer to the view and let the deduction guide do its job. We're moving toward a more DG, less ugly tempalte API at the moment so:
view v{ restricted{ptr} };
view w{ restricted{aligned<64>{ptr}} };
could be a thing. I can then just store the decorated pointer into the internal view_ptr.
I was thinking maybe use the decorators directly when passing the pointer to the view and let the deduction guide do its job. We're moving toward a more DG, less ugly tempalte API at the moment so:
DG?
view v{ restricted{ptr} }; view w{ restricted{aligned<64>{ptr}} };
But the decorators need to be part of the view type?
could be a thing. I can then just store the decorated pointer into the internal view_ptr.
AFAIK that can't be made to work with std alignas because it does not work on typedefs or pointers (i.e. the pointed to type)...maybe there is a trick with compiler specific attribute(( aligned() )) __declspec( align() )...
I was thinking maybe use the decorators directly when passing the pointer to the view and let the deduction guide do its job. We're moving toward a more DG, less ugly tempalte API at the moment so:
DG?
deduction guides
view v{ restricted{ptr} }; view w{ restricted{aligned<64>{ptr}} };
But the decorators need to be part of the type?
They'll be as the deduction guide will leverage the type of the constructor parameters
could be a thing. I can then just store the decorated pointer into the internal view_ptr.
AFAIK that can't be made to work with std alignas because it does not work on typedefs or pointers (i.e. the pointed to type)...maybe there is a trick with compiler specific attribute(( aligned() )) __declspec( align() )...
I was thinking of storing an instance of restricted
or w/e decorator and use their get() to access the pointer.
Cool...something to show off with to the ML python engineers 😁
Just a 'tiny' leftover - restricted (pardon the pun) automatic conversion of view types (only) in the less constrained direction:
ps. personally i have no need for the may-alias-anything views (and probably very few people do), i put it out there just for completeness sake..
I was wondering about the conversion indeed. Good ideas.
As for restrict, it looks like on MSVC I need also the get() to be marked __decltype(restrict) or the restrict-ness don't get passed through. Can you confirm ?
Since it is inlined it should not be needed but hey it's MSVC so ... 🤦 😅 ... best to simply slap [[ gnu::returns_nonnull, gnu::malloc, msvc::restrict ]] on it :)
heads up 'bout the other ugly corporation - apple's stl still does not have std::assume_aligned yet msvc does oddly support __builtin_assume_aligned so go with that for the time being...
The whole restrict enchilada will ends up behind a macro I guess
The whole restrict enchilada will ends up behind a macro I guess
should be no need for that, even if for msvc 'inline is not just like a macro' (like it is for gcc and clang) the attribute will do the trick...i do it that way...
except msvc can't handle [[ msvc::restrict ]] ... you have to ifdef it for msvc and use the declspec syntax
also if by some chance you aren't yet sorry that i stumbled on in here 😅 - the aligned ptr attribute only handles/shaves of the loop prologue - to shave off the loop tail (for the less-than-simd or less-than-unroll part) we need a way to constrain the minor dimension of the extent:
well, for SIMD computation, we use a different process now. See EVE talk, we always load full register and mask the last one : https://www.youtube.com/watch?v=WZGNCPBMInI
oh and if you prefer a more direct way to interact, we have a Discord: https://discord.gg/8A4Q4HkhcW
Are you sure about the benefit of restrict ? Some random test: https://godbolt.org/z/W34s3Mf4o shows that it inhibit autovectorization and simpler example just plain don't work on clang/MSVC.
Do you happen to have a repeatable scenario where restrict is shown to be better ?
😳 lemme investigate 😳
ok this is truly uber weird, you really stepped on it here 😳 let's work with this version: https://godbolt.org/z/5bx7KPGjG
fwiw clang bug report https://github.com/llvm/llvm-project/issues/54193
Reported codegen issue to MSVC bug tracker
UPVOTE FOR THE UPVOTE GOD OR MSVC WONT FIX: https://developercommunity.visualstudio.com/t/Incorrect-behavior-when-using-__restrict/1683917#T-N1688184