jfalcou / kiwaku

C++20 and onward collection of high performance data containers and related tools
https://jfalcou.github.io/kiwaku
Boost Software License 1.0
51 stars 5 forks source link

View pointer attributes #7

Open psiha opened 2 years ago

psiha commented 2 years ago
jfalcou commented 2 years ago

The idea is sound as we discussed during the conference, how tobkeep those info as IIRC typzdef and attributes dont mix well.

psiha commented 2 years ago

The idea is sound as we discussed during the conference, how tobkeep those info as IIRC typzdef and attributes dont mix well.

You can use 'decorating pass-through' functions instead of typedefs (a la std::assume_aligned):

template < int alignment >
struct aligned
{
    template < typename T >
    static constexpr auto get( T * const ptr ) noexcept { return static_cast< T * >( __builtin_assume_aligned( ptr, alignment ) ); }
};

struct restricted
{
    template < typename T >
    static constexpr auto get( T * __restrict const ptr ) noexcept { return ptr; }
};

struct aliased
{
    template < typename T >
    static constexpr auto get( T * const ptr ) noexcept
    {
        using aliased_T = T __attribute__(( __may_alias__ ));
        return reinterpret_cast< aliased_T * >( ptr );
    }
};

and then in the view class template, for example, never access the pointer directly but always through a getter that first passes the pointer through the 'attribute decorators'...

The decorators could be specified as part of the options variadic pack: view<float, extent()(), aligned<64>, restricted > or as wrappers of the value type: view<restricted<aligned<float, 64>>, extent()()>

jfalcou commented 2 years ago

I was thinking maybe use the decorators directly when passing the pointer to the view and let the deduction guide do its job. We're moving toward a more DG, less ugly tempalte API at the moment so:

view v{ restricted{ptr} };
view w{ restricted{aligned<64>{ptr}} };

could be a thing. I can then just store the decorated pointer into the internal view_ptr.

psiha commented 2 years ago

I was thinking maybe use the decorators directly when passing the pointer to the view and let the deduction guide do its job. We're moving toward a more DG, less ugly tempalte API at the moment so:

DG?

view v{ restricted{ptr} };
view w{ restricted{aligned<64>{ptr}} };

But the decorators need to be part of the view type?

could be a thing. I can then just store the decorated pointer into the internal view_ptr.

AFAIK that can't be made to work with std alignas because it does not work on typedefs or pointers (i.e. the pointed to type)...maybe there is a trick with compiler specific attribute(( aligned() )) __declspec( align() )...

jfalcou commented 2 years ago

I was thinking maybe use the decorators directly when passing the pointer to the view and let the deduction guide do its job. We're moving toward a more DG, less ugly tempalte API at the moment so:

DG?

deduction guides

view v{ restricted{ptr} };
view w{ restricted{aligned<64>{ptr}} };

But the decorators need to be part of the type?

They'll be as the deduction guide will leverage the type of the constructor parameters

could be a thing. I can then just store the decorated pointer into the internal view_ptr.

AFAIK that can't be made to work with std alignas because it does not work on typedefs or pointers (i.e. the pointed to type)...maybe there is a trick with compiler specific attribute(( aligned() )) __declspec( align() )...

I was thinking of storing an instance of restricted or w/e decorator and use their get() to access the pointer.

psiha commented 2 years ago

Cool...something to show off with to the ML python engineers 😁

Just a 'tiny' leftover - restricted (pardon the pun) automatic conversion of view types (only) in the less constrained direction:

ps. personally i have no need for the may-alias-anything views (and probably very few people do), i put it out there just for completeness sake..

jfalcou commented 2 years ago

I was wondering about the conversion indeed. Good ideas.

As for restrict, it looks like on MSVC I need also the get() to be marked __decltype(restrict) or the restrict-ness don't get passed through. Can you confirm ?

psiha commented 2 years ago

Since it is inlined it should not be needed but hey it's MSVC so ... 🤦 😅 ... best to simply slap [[ gnu::returns_nonnull, gnu::malloc, msvc::restrict ]] on it :)

psiha commented 2 years ago

heads up 'bout the other ugly corporation - apple's stl still does not have std::assume_aligned yet msvc does oddly support __builtin_assume_aligned so go with that for the time being...

jfalcou commented 2 years ago

The whole restrict enchilada will ends up behind a macro I guess

psiha commented 2 years ago

The whole restrict enchilada will ends up behind a macro I guess

should be no need for that, even if for msvc 'inline is not just like a macro' (like it is for gcc and clang) the attribute will do the trick...i do it that way...

psiha commented 2 years ago

except msvc can't handle [[ msvc::restrict ]] ... you have to ifdef it for msvc and use the declspec syntax

psiha commented 2 years ago

also if by some chance you aren't yet sorry that i stumbled on in here 😅 - the aligned ptr attribute only handles/shaves of the loop prologue - to shave off the loop tail (for the less-than-simd or less-than-unroll part) we need a way to constrain the minor dimension of the extent:

jfalcou commented 2 years ago

well, for SIMD computation, we use a different process now. See EVE talk, we always load full register and mask the last one : https://www.youtube.com/watch?v=WZGNCPBMInI

jfalcou commented 2 years ago

oh and if you prefer a more direct way to interact, we have a Discord: https://discord.gg/8A4Q4HkhcW

jfalcou commented 2 years ago

Are you sure about the benefit of restrict ? Some random test: https://godbolt.org/z/W34s3Mf4o shows that it inhibit autovectorization and simpler example just plain don't work on clang/MSVC.

Do you happen to have a repeatable scenario where restrict is shown to be better ?

psiha commented 2 years ago

😳 lemme investigate 😳

psiha commented 2 years ago

ok this is truly uber weird, you really stepped on it here 😳 let's work with this version: https://godbolt.org/z/5bx7KPGjG

psiha commented 2 years ago

fwiw clang bug report https://github.com/llvm/llvm-project/issues/54193

jfalcou commented 2 years ago

Reported codegen issue to MSVC bug tracker

jfalcou commented 2 years ago

https://developercommunity.visualstudio.com/t/Incorrect-behavior-when-using-__restrict/1683917

jfalcou commented 2 years ago

UPVOTE FOR THE UPVOTE GOD OR MSVC WONT FIX: https://developercommunity.visualstudio.com/t/Incorrect-behavior-when-using-__restrict/1683917#T-N1688184