google / highway

Performance-portable, length-agnostic SIMD with runtime dispatch
Apache License 2.0
4.13k stars 315 forks source link

HWY_EXPORT_AND_DYNAMIC_DISPATCH_T for multiple template parameters function with Clang/GCC #2260

Open Anarchitectus opened 3 months ago

Anarchitectus commented 3 months ago

Hi,

I'm facing difficulties to compile or execute properly with dynamic dispatch multi template parameter functions. The code compile and seems to work fine on MSVC but no luck with Clang and GCC. I tried to expand partially the HWY_EXPORT_AND_DYNAMIC_DISPATCH_T macro and managed to compile with clang, however at run time dynamic dispatch mechanism may call wrong functions (not always). See below a small reproduction for this issue. ``

undef HWY_TARGET_INCLUDE

define HWY_TARGET_INCLUDE "example.cpp"

include <hwy/foreach_target.h>

include <hwy/highway.h>

HWY_BEFORE_NAMESPACE();

namespace Test { namespace HWY_NAMESPACE { namespace hn = hwy::HWY_NAMESPACE; template void func1() {}

    template<int i, int j>
    void  func2(){}
}// namespace HWY_NAMESPACE

} // namespace SLHDR

HWY_AFTER_NAMESPACE();

define SINGLE_ARG(...) __VA_ARGS__

if HWY_ONCE

namespace Test {

template<int i>
void callerFunc1()
{
    HWY_EXPORT_AND_DYNAMIC_DISPATCH_T(func1<i>)(); //this works
}

template<int i, int j>
void callerFunc2()
{
    HWY_EXPORT_AND_DYNAMIC_DISPATCH_T(SINGLE_ARG(func2<i,j>))(); //this doesn't compile with clang nor gcc
    //lines below compiles, but doesn't work properly dynamic dispatch may use wrong function pointers
    //HWY_MAYBE_UNUSED static decltype(&HWY_STATIC_DISPATCH(SINGLE_ARG(func2< i,j>))) const
    //HWY_DISPATCH_TABLE(HWY_DISPATCH_TABLE_T())[1] = {&HWY_STATIC_DISPATCH(SINGLE_ARG(func2< i,j>))};HWY_DYNAMIC_DISPATCH_T(HWY_DISPATCH_TABLE_T())();
}

    template void callerFunc2<0,0>();

}

endif // HWY_ONCE

``

Thanks,

Anarchitectus

jan-wassenberg commented 3 months ago

Hi, we do document that this is intended for a single template argument. At a minimum, highway.h would probably have to wrap all FUNC_NAME in something like SINGLE_ARG - want to try that?

Likely easier: is it an option for you to wrap your i and j into a struct (with static constexpr kJ=j;) so that it can be a single template arg?

Anarchitectus commented 3 months ago

Hi,

Thanks for the answer, I noticed the HWY_EXPORT_AND_DYNAMIC_DISPATCH_T was documented for 1 params, but it didn't fit what I wanted to do unfortunately so I tried and MSVC was ok with it.

-It seems to work to add SINGLE_ARG all the way on FUNC_NAME with Clang, didn't try GCC.
-I tried using struct as template parameter (seems to be a c++20 features), and I think it would work as well for me. need to rewrite partially some of my code though.

Will a HWY_EXPORT_AND_DYNAMIC_DISPATCH_T for multi template parameters be considered in a future release ?

thanks.

jan-wassenberg commented 3 months ago

Makes sense. Thanks for confirming! Supporting a comma in macro args does seem to be a higher-risk (of breaking across compiler versions) thing that I'd prefer to avoid if we can, unless you or others really require multiple args?

Anarchitectus commented 3 months ago

update : this also seems to work for GCC, MSVC to be tested.

My situation is I'm trying to port some x86 SIMD code to highway, and there are a sizable number of template functions. so clearly the easiest for me is HWY_EXPORT_AND_DYNAMIC_DISPATCH_T for multi template parameters function working. I understand this impact some base macro from the API and is this a risk. So is that feature mandatory no, there are workarounds, is it practical and useful I think so yes.

jan-wassenberg commented 3 months ago

Yes, it's mainly MSVC that differs in its macro/comma parsing. I understand this would be useful given you have many functions. If you can verify your patch works in MSVC and send a pull request, we'd be OK to merge and maintain it :)