Unconditional use of SSE4.1 instructions

danfe commented 8 years ago

SeExpr uses two SSE-optimized functions in src/SeExpr/SeNoise.cpp code, floorSSE and roundSSE, which are implemented with SSE4.1 intrinsics, and thus requires -msse4.1 switch to build. This makes it not very portable and potentially unsafe if compiled code would be run on a pre-Penryn CPU (it would crash due to illegal instruction).

I've attempted to address this issue in recent commit to the FreeBSD port. Essentially, two changes are required, for CMakeLists.txt:

-  SET( CMAKE_CXX_FLAGS "-fPIC -msse4.1")
+  SET( CMAKE_CXX_FLAGS -fPIC )
+  IF(USE_SSE41)
+    SET( CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -msse4.1" )
+  ENDIF()

And src/SeExpr/SeNoise.cpp:

+#ifdef __SSE4_1__
 #include <smmintrin.h>
+#endif

 #include "SeExprBuiltins.h"
 namespace{
@@ -25,14 +27,18 @@ namespace{
 #include "SeNoise.h"
 namespace SeExpr{

+#ifdef __SSE4_1__
 inline double floorSSE(double val) {
-    return _mm_cvtsd_f64(_mm_floor_sd(_mm_set_sd(0.0), _mm_set_sd(val)));
+    return _mm_floor_sd(_mm_set_sd(0.0), _mm_set_sd(val))[0];
 }

 inline double roundSSE(double val) {
-    return _mm_cvtsd_f64(_mm_round_sd(_mm_set_sd(0.0), _mm_set_sd(val), _MM_FROUND_TO_NEAREST_INT));
+    return _mm_round_sd(_mm_set_sd(0.0), _mm_set_sd(val), _MM_FROUND_TO_NEAREST_INT)[0];
 }
-
+#else
+#define floorSSE floor
+#define roundSSE round
+#endif

I've also optimized away _mm_cvtsd_f64() call because access via [0] works (with GCC and Clang, at least, not sure about other compilers) and generates slightly more efficient code.

davvid commented 5 years ago

It's been a while, but if you're interested in upstreaming this change we'd be happy to take it. Adding an ENABLE_SSE4 configuration option knob with a default value of TRUE would be fairly low-impact. I'll close this issue for now (it's been a few years) but feel free to re-open it if you plan on submitting a PR.

danfe commented 5 years ago

It's been a while, but if you're interested in upstreaming this change we'd be happy to take it.

Yes, I'm interested upstreaming it, that's why I provided a patch (wow, it's been almost 4 years ago now).

I'll close this issue for now (it's been a few years) but feel free to re-open it if you plan on submitting a PR.

What's a PR? Isn't reference to a corresponding FreeBSD change enough to deduce a patch you'd like?

davvid commented 5 years ago

You're funny, and I know you're being sarcastic, but I can pretend to be dumb too:

A reference to a FreeBSD change is only half the battle. If I commit the code, should I give you credit? For example, when I submit code to the Git project, i follow their conventions and email patches to a mailing list. What I get in return is that the upstream project will then forever maintain the code. Thus, the common expected courtesy would be for you to push a git commit to a public repository that I can pull from, since that is our local convention here.

The assumption in that situation is that the goal of a contributor is to make things as easy as possible for the maintainer so that they actually take your change.

That said, since I'm not smart enough to deduce a patch, how can we be sure that I won't screw it up if I try to do it? The only real way to do it right is to get a BSD Unix hacker to do it :stuck_out_tongue:

I know you know this, though, I'm just being dumb ;-)

Alrighty, enough silliness. At least you actually provided a patch! I guess I'll reconstruct it against master, so please test out the candidate solution once I've pushed it out. You'll need to start setting the ENABLE_SSE cmake variable to FALSE in your port once it's ready.

danfe commented 5 years ago

Hm, I appreciate a decent reply. :-)

My problem with pull requests is that it seems silly to clone a repo and make some commits (pull request requires I commit my changes to the cloned repo first) when I just want to fix some minor issue, test it, svn diff > /tmp/some-fix.diff and be done with it. With cloning, I'm not sure what to do next. Is it safe to delete my cloned repo? Would doing it break any references in upstream repo? These things bug and annoy me a bit.

As for giving credit, it's such a minor change that I don't really care. Isn't giving a reference to this issue enough? All the names and links are here.

Last but not least, I'll surely test any candidates, or might even do what you ask and create a pull request since you're being nice and witty. :-)

davvid commented 5 years ago

Sweet ;-) the latest master should be good to go now, and double thanks for helping maintain the volunteer FreeBSD port. Seriously, the BSDs are quite noble projects.

davvid commented 5 years ago

@danfe I recently tried this on macOS and Apple LLVM version 10.0.1 (clang-1001.0.46.4) rejects the [0] sse syntax deref syntax, so I'm probably going to tweak that bit back to the original version with explicit conversions. Which clang version did you try?

wdas / SeExpr

Unconditional use of SSE4.1 instructions #35