ebassi / graphene

A thin layer of graphic data types
http://ebassi.github.io/graphene

Attempt to consolidate SSE and ARM NEON SIMD code for GCC/clang and Visual Studio #252

Open fanc999 opened 2 years ago

fanc999 commented 2 years ago

Hi,

This attempts to clean up the code a bit in graphene-simd4f.h and graphene-simd4x4f.h by reducing the code duplication in the SSE and ARM NEON SIMD implementations that is caused by syntactic differences between Visual Studio and GCC/clang with regard to inlining, via:

[1]: Sadly, I was not able to do this cleanup for the SIMD code that is written in a function-like manner. I couldn't get the preprocessor to accept a single definition for both Visual Studio and clang, ugh :|, so I had to leave that part alone: the preprocessor does not allow a working #define inside a macro, and it does not like a macro's lines being split apart by #if/#ifdef blocks. So this is the best I could do for now. For instance:

(unrelated parts omitted for brevity; I am writing this off the top of my head, so there might be some mistakes below)

(graphene-macros.h)
#if defined (__GNUC__) || defined (__clang__)
...
#define GRAPHENE_FUNCCALL_2ARG_MACRO(ftype,fname,v0,v1) \
  (__extension__ ({

#define GRAPHENE_FUNCCALL_2ARG_BEGIN(rtype,ftype,fname,t0,v0,t1,v1)
#define GRAPHENE_FUNCCALL_BODY(expr) expr;
#define GRAPHENE_FUNCCALL_RETURN(rtype,rvalue) (rtype) rvalue;
#define GRAPHENE_FUNCCALL_END \
  }))
#elif defined (_MSC_VER)
...
#define GRAPHENE_FUNCCALL_2ARG_MACRO(ftype,fname,v0,v1) \
  graphene_msvc_##ftype##_##fname (v0, v1)

#define GRAPHENE_FUNCCALL_2ARG_BEGIN(rtype,ftype,fname,t0,v0,t1,v1) \
static inline rtype \
graphene_msvc_##ftype##_##fname (t0 v0, t1 v1) \
{

#define GRAPHENE_FUNCCALL_BODY(expr) expr;
#define GRAPHENE_FUNCCALL_RETURN(rtype,rvalue) return rvalue;

#define GRAPHENE_FUNCCALL_END \
}
#else
...

(graphene-simd4f.h)
...
/* For the GRAPHENE_FUNCCALL_2ARG_MACRO line below, the trailing backslash is
 * needed for GCC/clang but has to be left out for MSVC :(, all the other
 * lines here work for both */
#  define graphene_simd4f_get(s,i) \
  GRAPHENE_FUNCCALL_2ARG_MACRO (simd4f, get, s, i) \
  GRAPHENE_FUNCCALL_2ARG_BEGIN (float, simd4f, get, graphene_simd4f_t, s, int, i) \
  GRAPHENE_FUNCCALL_BODY (graphene_simd4f_union_t __u = { (s) }) \
  GRAPHENE_FUNCCALL_RETURN (float, __u.f[(i)]) \
  GRAPHENE_FUNCCALL_END
...
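
To make the two target shapes concrete, here is a minimal, self-contained sketch of what the macros above are meant to produce on each compiler family. The names demo_vec4_t, DEMO_VEC4_GET_EXPR, demo_vec4_get and demo_msvc_vec4_get are hypothetical stand-ins made up for this illustration; they are not part of Graphene, and the real graphene_simd4f_t is a SIMD type rather than a plain struct. Sharing only the result expression, as done here, is one possible way to limit duplication and is an assumption of this sketch, not what the PR implements.

```c
/* Hypothetical stand-in for graphene_simd4f_t; NOT the real Graphene type. */
typedef struct { float f[4]; } demo_vec4_t;

/* The result expression, written once and reused by both branches
 * (an assumption of this sketch). */
#define DEMO_VEC4_GET_EXPR(u, i) ((u).f[(i)])

#if defined (__GNUC__) || defined (__clang__)
/* GCC/clang: a statement expression evaluates to its last expression,
 * so the whole accessor can stay a function-like macro. */
# define demo_vec4_get(s, i) \
  (__extension__ ({ \
    demo_vec4_t demo_u_ = (s); \
    (float) DEMO_VEC4_GET_EXPR (demo_u_, (i)); \
  }))
#else
/* Compilers without statement expressions (e.g. MSVC): a static inline
 * function does the work and the macro only forwards to it. */
static inline float
demo_msvc_vec4_get (demo_vec4_t s, int i)
{
  return DEMO_VEC4_GET_EXPR (s, i);
}
# define demo_vec4_get(s, i) demo_msvc_vec4_get ((s), (i))
#endif
```

Callers use demo_vec4_get (v, 2) the same way on either branch. Note that the GCC/clang form can only be used inside a function body, since statement expressions are not allowed at file scope.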

I understand that this PR might well conflict with the changes in #251, so if either this or #251 goes through, I will fix up the other as needed as soon as possible.

With blessings, thank you!