Open RoyiAvital opened 6 years ago
@carlkl , I'm not an expert. I generate DLL's with SSE, AVX, AVX2 on my Windows machine using GCC and they work.
For instance, have a look at - Fastest Implementation of Exponential Function Using AVX (Which is how I got to Sleef).
See the code displayed in Wim's Answer. I used that code and generated DLL on my system which runs perfectly. I'm not arguing that it works, I'm just adding my personal experience with it.
By the way, I use GCC 7.2.
Thank You.
If there is no function call in the program, it works. Probably the function is inlined in your program. Try putting the functions in different files and see if it still works.
What do you mean? The DLL exposes functions to MATLAB. MATLAB calls this function (Which uses AVX2) and gets the result back. So there is a function call to the DLL from outside.
@RoyiAvital, maybe you can get help from https://sourceforge.net/p/mingw-w64/mailman/mingw-w64-public
@carlkl , What do you mean help? I don't need any help.
I just tried compiling it with GCC and reported.
Next I will try with Intel Compiler.
Generating with Intel Compiler (ICC 18.0):
cmake -G"Visual Studio 15 2017 Win64" -T"Intel C++ Compiler 18.0" -DBUILD_SHARED_LIBS=FALSE ..
-- Selecting Windows SDK version 10.0.16299.0 to target Windows 10.0.14393.
-- The C compiler identification is Intel 18.0.1.20171018
-- Check for working C compiler: C:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018/windows/bin/intel64/icl.exe
-- Check for working C compiler: C:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018/windows/bin/intel64/icl.exe -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Setting build type to 'Release' (required for full support).
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of long double
-- Check size of long double - done
-- Performing Test COMPILER_SUPPORTS_FLOAT128
-- Performing Test COMPILER_SUPPORTS_FLOAT128 - Failed
-- Performing Test COMPILER_SUPPORTS_SSE2
-- Performing Test COMPILER_SUPPORTS_SSE2 - Success
-- Performing Test COMPILER_SUPPORTS_SSE4
-- Performing Test COMPILER_SUPPORTS_SSE4 - Success
-- Performing Test COMPILER_SUPPORTS_AVX
-- Performing Test COMPILER_SUPPORTS_AVX - Success
-- Performing Test COMPILER_SUPPORTS_FMA4
-- Performing Test COMPILER_SUPPORTS_FMA4 - Failed
-- Performing Test COMPILER_SUPPORTS_AVX2
-- Performing Test COMPILER_SUPPORTS_AVX2 - Success
-- Performing Test COMPILER_SUPPORTS_AVX512F
-- Performing Test COMPILER_SUPPORTS_AVX512F - Success
-- Could NOT find OpenMP_C (missing: OpenMP_libiomp5md_LIBRARY) (found version "5.0")
-- Could NOT find OpenMP (missing: OpenMP_C_FOUND)
-- Performing Test COMPILER_SUPPORTS_WEAK_ALIASES
-- Performing Test COMPILER_SUPPORTS_WEAK_ALIASES - Failed
-- Performing Test COMPILER_SUPPORTS_BUILTIN_MATH
-- Performing Test COMPILER_SUPPORTS_BUILTIN_MATH - Failed
-- Unroll target for DP : unroll_0_purecdp.c;unroll_1_purecdp.c;unroll_2_purecdp.c;unroll_3_purecdp.c;unroll_0_sse2dp.c;unroll_1_sse2dp.c;unroll_2_sse2dp.c;unroll_3_sse2dp.c;unroll_0_avxdp.c;unroll_1_avxdp.c;unroll_2_avxdp.c;unroll_3_avxdp.c;unroll_0_avx2dp.c;unroll_1_avx2dp.c;unroll_2_avx2dp.c;unroll_3_avx2dp.c;unroll_0_avx512fdp.c;unroll_1_avx512fdp.c;unroll_2_avx512fdp.c;unroll_3_avx512fdp.c
-- Unroll target for SP : unroll_0_purecsp.c;unroll_1_purecsp.c;unroll_2_purecsp.c;unroll_3_purecsp.c;unroll_0_sse2sp.c;unroll_1_sse2sp.c;unroll_2_sse2sp.c;unroll_3_sse2sp.c;unroll_0_avxsp.c;unroll_1_avxsp.c;unroll_2_avxsp.c;unroll_3_avxsp.c;unroll_0_avx2sp.c;unroll_1_avx2sp.c;unroll_2_avx2sp.c;unroll_3_avx2sp.c;unroll_0_avx512fsp.c;unroll_1_avx512fsp.c;unroll_2_avx512fsp.c;unroll_3_avx512fsp.c
-- The testing program for DFT is currently not available with MSVC build - skip building tests dft-tester
-- Configuring build for SLEEF-v3.1
Target system: Windows-10.0.14393
Target processor: AMD64
Host system: Windows-10.0.14393
Host processor: AMD64
Detected C compiler: Intel @ C:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018/windows/bin/intel64/icl.exe
-- Using option `/D_CRT_SECURE_NO_WARNINGS ` to compile libsleef
-- Building shared libs : FALSE
-- MPFR : LIB_MPFR-NOTFOUND
-- GMP : LIBGMP-NOTFOUND
-- RUNNING_ON_TRAVIS : 0
-- COMPILER_SUPPORTS_OPENMP :
*** Note: Parallel build is not supported on Microsoft Visual Studio
-- Configuring done
-- Generating done
-- Build files have been written to:
Now I will do the build.
@RoyiAvital
I just found this:
use the -fno-asynchronous-unwind-tables flag
https://stackoverflow.com/questions/43152633/invalid-register-for-seh-savexmm-in-cygwin ... If you don't need Windows Structured Exception support you can try using the -fno-asynchronous-unwind-tables option. This may however just mask some other underlying problem. Also some of the AVX512 instruction sets you've enabled are only supported on the Intel Xeon Phi x200, unless you're running Windows on one of those your code may not work. – Ross Ridge Apr 1 '17 at 4:35
BTW: can you show the cmake command for mingw again?
I'd pass AVX512 completely if I could for GCC on Windows compatibility. Anyhow, I'm not even sure how to do it :-).
The Intel build fails as well. Since the report is so large and there are many errors (I'm not sure those are all) I pasted the screen here:
If it helps you get ICC compatibility on Windows it is great.
Thank You.
I cannot do it without ICC for Windows.
@shibatch , Yea, you told me. I thought maybe the error would be a trivial thing. But it seems it is not.
Hopefully some capable student who can get it for free will give it a try. Appreciate your work and assistance. Sleef looks lovely!
Thank You.
I added preliminary support for MinGW.
https://github.com/shibatch/sleef/archive/Better_support_for_mingw.zip
AVX functions seem to be working. I cannot explain why it works.
Great News.
Going to try it. So far, the Generation Process:
cmake -G"MinGW Makefiles" -DBUILD_SHARED_LIBS=FALSE ..
-- The C compiler identification is GNU 7.2.0
-- Check for working C compiler: C:/Applications/MinGW/bin/gcc.exe
-- Check for working C compiler: C:/Applications/MinGW/bin/gcc.exe -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Setting build type to 'Release' (required for full support).
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of long double
-- Check size of long double - done
-- Performing Test COMPILER_SUPPORTS_LONG_DOUBLE
-- Performing Test COMPILER_SUPPORTS_LONG_DOUBLE - Success
-- Performing Test COMPILER_SUPPORTS_FLOAT128
-- Performing Test COMPILER_SUPPORTS_FLOAT128 - Success
-- Performing Test COMPILER_SUPPORTS_SSE2
-- Performing Test COMPILER_SUPPORTS_SSE2 - Success
-- Performing Test COMPILER_SUPPORTS_SSE4
-- Performing Test COMPILER_SUPPORTS_SSE4 - Success
-- Performing Test COMPILER_SUPPORTS_AVX
-- Performing Test COMPILER_SUPPORTS_AVX - Success
-- Performing Test COMPILER_SUPPORTS_FMA4
-- Performing Test COMPILER_SUPPORTS_FMA4 - Success
-- Performing Test COMPILER_SUPPORTS_AVX2
-- Performing Test COMPILER_SUPPORTS_AVX2 - Success
-- Performing Test COMPILER_SUPPORTS_AVX512F
-- Performing Test COMPILER_SUPPORTS_AVX512F - Success
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Performing Test COMPILER_SUPPORTS_OPENMP
-- Performing Test COMPILER_SUPPORTS_OPENMP - Success
-- Performing Test COMPILER_SUPPORTS_WEAK_ALIASES
-- Performing Test COMPILER_SUPPORTS_WEAK_ALIASES - Success
-- Performing Test COMPILER_SUPPORTS_BUILTIN_MATH
-- Performing Test COMPILER_SUPPORTS_BUILTIN_MATH - Success
-- Configuring build for SLEEF-v3.2
Target system: Windows-10.0.14393
Target processor: AMD64
Host system: Windows-10.0.14393
Host processor: AMD64
Detected C compiler: GNU @ C:/Applications/MinGW/bin/gcc.exe
-- Using option `-Wall -Wno-unused -Wno-attributes -Wno-unused-result -Wno-psabi -ffp-contract=off -fno-math-errno -fno-trapping-math -fno-asynchronous-unwind-tables` to compile libsleef
-- Building shared libs : FALSE
-- MPFR : LIB_MPFR-NOTFOUND
-- GMP : LIBGMP-NOTFOUND
-- RUNNING_ON_TRAVIS : 0
-- COMPILER_SUPPORTS_OPENMP : 1
-- Configuring done
-- Generating done
-- Build files have been written to:
It seems the flag -- /maxcpucount:1 isn't supported for the build.
It makes me wonder, in what cases does Sleef use multi-threading?
Does it have functions which work on arrays?
OK, I tried running the build process. This is the error I get:
Scanning dependencies of target headers
[ 13%] Generating ../../include/sleef.h
Generating sleef.h: mkrename "2" "4" "__m128d" "__m128" "__m128i" "__m128i" "__SSE2__"
Generating sleef.h: mkrename "2" "4" "__m128d" "__m128" "__m128i" "__m128i" "__SSE2__" "sse2"
Generating sleef.h: mkrename "2" "4" "__m128d" "__m128" "__m128i" "__m128i" "__SSE2__" "sse4"
Generating sleef.h: mkrename "4" "8" "__m256d" "__m256" "__m128i" "struct { __m128i x, y; }" "__AVX__"
Generating sleef.h: mkrename "4" "8" "__m256d" "__m256" "__m128i" "struct { __m128i x, y; }" "__AVX__" "avx"
Generating sleef.h: mkrename "4" "8" "__m256d" "__m256" "__m128i" "struct { __m128i x, y; }" "__AVX__" "fma4"
Generating sleef.h: mkrename "4" "8" "__m256d" "__m256" "__m128i" "__m256i" "__AVX__" "avx2"
Generating sleef.h: mkrename "2" "4" "__m128d" "__m128" "__m128i" "__m128i" "__SSE2__" "avx2128"
Generating sleef.h: mkrename "8" "16" "__m512d" "__m512" "__m256i" "__m512i" "__AVX512F__"
Generating sleef.h: mkrename "8" "16" "__m512d" "__m512" "__m256i" "__m512i" "__AVX512F__" "avx512f"
'cat' is not recognized as an internal or external command,
operable program or batch file.
mingw32-make.exe[2]: *** [src\libm\CMakeFiles\headers.dir\build.make:90: include/sleef.h] Error 1
mingw32-make.exe[2]: *** Deleting file 'include/sleef.h'
mingw32-make.exe[1]: *** [CMakeFiles\Makefile2:525: src/libm/CMakeFiles/headers.dir/all] Error 2
mingw32-make.exe: *** [Makefile:140: all] Error 2
Build it on cygwin shell. Cygwin has a package for mingw gcc.
@shibatch , Any chance trying it on MinGW64 and not Cygwin?
By the way, any chance of enabling -- /maxcpucount:1 for GCC?
I still don't get where in Sleef multi-threading is used (Or is it only for the DFT related parts?).
I couldn't find a function which works on arrays.
Cygwin is used just for building the library. Cygwin dlls are not required to execute the functions. The -- /maxcpucount:1 option is only for MSVC. You don't need this for gcc.
Libm functions in sleef can be used in multi-threaded code.
Any chance supporting MinGW64 directly and not through Cygwin? I don't know how to use Cygwin.
Regarding Multi Threaded, what is the point of enabling MP on data only 128 / 256 / 512 Bits wide? I thought Libm functions have Multi Threaded versions when applied to arrays.
It's a good opportunity for you to learn how to use unix shells. Sleef is a building block for creating high performance software.
Hi,
I did some tests with MSVC compiled Library.
I compared exp() for SSE4 and AVX2.
I tried 3 versions:
I tested run time on my machine (Core i7 6800K, 32 GB) and compared relative error to MATLAB's exp().
It seems Sleef was the slowest while all of them had error of less than 5e-7 on the range [-80, 80].
I don't know if it is due to MSVC.
I tried using GCC but Cygwin isn't working out for me.
It would be great to support cmake -G"MinGW Makefiles" -DBUILD_SHARED_LIBS=FALSE .. so we could test it more.
Thank You.
It's not very easy to assess the accuracy of math functions. Micro-benchmarking is not easy either. It's very delicate. First of all, how accurate is MATLAB's exp function? I cannot believe that that fast implementation of exp has comparable accuracy to SVML or SLEEF. The accuracy of SVML functions can be chosen via command line options. You seem to be measuring the absolute error, but that's not an appropriate way.
I seriously recommend that you learn how to use a unix shell or linux. How about installing ubuntu OS on your computer in a virtual machine? Try virtualbox. It's so easy and free. A unix shell is also easy to learn, since you already know how to use command prompt. It's like using "ls" instead of "dir".
Hi, But I need the Library on Windows. I have a VirtualBox install of Linux, but Windows' performance is better (Except HD) and I prefer its polished experience.
In this case, since the command to build and compile are the same for Linux & Windows (Also for macOS) I don't see why it would matter.
It is only checking why cmake -G"MinGW Makefiles" -DBUILD_SHARED_LIBS=FALSE .. doesn't work with MinGW64.
I only suggest that so Windows users will be able to enjoy better performance if GCC allows it.
Regarding accuracy, MATLAB's implementation is IEEE-754 compliant. The accuracy was measured by:
abs(expMATLAB(x) - expSleef(x)) / abs(expMATLAB(x))
Where all of those were compiled the same way (Sleef was linked from an MSVC build). The value of x ranged from -80 to 80 (1e6 samples, uniform distribution on the range). All 3 had the same error (~Less than 1e-7, which is perfect for me).
Thank You.
I think the maintainers of cmake expect users to use msys, then.
IEEE-754 compliance is another thing. The properties of that standard are required to make the computation accurate.
If you only need 1e-7 of relative error, then you can use float functions instead of double-precision functions.
All the above are using _ps (Namely, Floating Point, Single Precision).
Again, Sleef is just slower (~30% slower).
I'm not sure what you mean by:
expect users to use msys
cmake supports MinGW. The project is not configured to run with it.
The execution speed depends on a few things, and the compiler also affects the performance. You also need to check the specification of each library regarding accuracy. Accurate functions tend to require more time for computation.
It is possible that MinGW is not well supported by cmake. At least I confirmed that it works with MSYS.
@RoyiAvital ,
today the combination of msys2 and its mingw-w64 based toolchains and libraries is the gold standard for using GCC toolchains targeting the win32 subsystem (32bit and 64bit).
see https://github.com/msys2/msys2/wiki
msys2 and mingw-w64 are supported on appveyor btw.
@shibatch , MinGW works perfectly with Cmake on Windows. Since I learned it a few days ago I use it every day now. I really think that if you go to Windows, download the MinGW distribution I linked and check, you'll see what the error is and be able to fix it easily.
Regarding accuracy, well all of them have the same accuracy more or less (~1-3e-7, in the range above). Sleef is just slower than the others. You can check against the code on StackOverflow by yourself.
@carlkl , I'm not even sure what MSYS2 is. You support GCC, right? So MinGW gives you GCC on Windows. I try to compile with it and it won't work. I guess it has to do with flags for Windows which are only under the MSVC path and should also be used for GCC / MinGW.
I'm a Windows user. I have used Linux and macOS and I find Windows to be better. Please, don't try to move me from Windows. I'm here as a user and I can try to assist with my limited understanding. That's all.
I have been using MinGW since the era of gcc-2.95.
I checked the code. Sleef is slower because it accepts wider range of input.
@shibatch ,
I meant MinGW on Windows (In case there are other options, I'm not sure).
If you do work with MinGW, why doesn't cmake -G"MinGW Makefiles" -DBUILD_SHARED_LIBS=FALSE .. work on Windows?
Why the need for MSYS2 or other wrappers?
Regarding speed.
What do you mean?
If you talk about the CPU Dispatch, then I use __m128 Sleef_expf4_u10sse4(__m128 a); for SSE and __m256 Sleef_expf8_u10avx2(__m256 a); for AVX so no dispatching.
What's strange is that on your Benchmark Page you clearly state it is faster than SVML while in my test it is not.
MSYS can be regarded as a part of mingw.
Computation speed is delicate. It depends on many things. You didn't even check the accuracy specification of each function. The way you measured accuracy is not correct.
Please let me know how. I measure relative error and actually after checking it, Intel SVML is more accurate.
Again, I don't understand. You compare it with SVML (Which means they have similar accuracy) and show it has faster performance yet it doesn't.
But let's get back to the real thing.
Why doesn't cmake -G"MinGW Makefiles" -DBUILD_SHARED_LIBS=FALSE .. work?
It has nothing to do with MSYS or anything else, as it should generate a proper build that works when used with make.
Yet on Windows (Not virtual shell of Linux, but Windows) it doesn't.
Thank You.
cat command is included in msys. Making it work without msys would make the code dirty, and so I don't want to do that.
Measuring accuracy correctly is not what I teach you at a place like this.
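For context, math libraries commonly state accuracy in ULPs (units in the last place) rather than as a plain relative error, and the u10 suffix in the function names quoted earlier appears to refer to such a bound. A minimal sketch of a ULP measurement for a single-precision result against a double-precision reference could look like this. The helper names `float_ulp` and `ulp_error` are hypothetical, and using a double reference is an assumption that is adequate only when the reference itself is much more accurate than the value being tested.

```c
#include <math.h>

/* Spacing between adjacent floats at the magnitude of ref. */
static double float_ulp(double ref) {
    int e;
    frexp(ref, &e);                 /* |ref| = m * 2^e with 0.5 <= m < 1 */
    /* float carries 24 significand bits, so one ULP is 2^(e - 24);
       below the normal range the spacing stays at 2^-149. */
    if (ref == 0.0 || e - 24 < -149)
        return ldexp(1.0, -149);
    return ldexp(1.0, e - 24);
}

/* Error of the float result `got`, in ULPs, relative to `ref`. */
double ulp_error(float got, double ref) {
    return fabs((double)got - ref) / float_ulp(ref);
}
```

Taking the maximum of `ulp_error` over many inputs gives a figure that can be compared with a library's documented bound.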
@shibatch , Could you share installation command on MSYS2?
Leave alone accuracy.
In this benchmark:
You state your Exponent for _mm128 in Single Precision is faster than the SVML equivalent.
How come I get different results?
@RoyiAvital,
the mingw-w64 installation commands are given in the https://github.com/shibatch/sleef/tree/Better_support_for_mingw branch. See appveyor.yml
Installation command is like "mkdir build;cd build;cmake ..;make".
Did you specify the correct options for icc? I see that input range is different. The benchmarking tool is included in the package, under src/libm-benchmarks.
@carlkl , @shibatch , I'm sorry. I launched MSYS2 terminal. What should I do to generate GCC compiled Libraries?
@shibatch ,
The range won't change the run time.
I really think that Intel is faster (Maybe Sleef 3.2 is faster, as the test state 3.2 while I have 3.1, no?).
At least when Sleef master is compiled with MSVC it is slower than Intel's SVML on exp().
Anyhow, Sorry, but it seems you only want to work in Linux style. I'm a happy Windows user (Also happy Linux Mint user, Though for development nothing like Windows + MATLAB + Visual Studio for me) so I guess this is not for me. If you happen to support using GCC on Windows like MSVC (Namely without emulation of Unix's bash, Just using the command line and Cmake like for MSVC) I'd be happy to be your test guy.
Thank You.
The range won't change the run time.
This statement is true for SLEEF, because there are no branches in the SIMD code. I am not sure whether it is true for MATLAB or SVML, they might have a different algorithm. It might be worth checking what is the performance of SVML and MATLAB in the range supported by SLEEF, and also check the error. What are the results of your comparison if you check the performance in the range -100 to 100?
I really think that Intel is faster (Maybe Sleef 3.2 is faster, as the test state 3.2 while I have 3.1, no?).
If you are using master branch in github, you are essentially using 3.2.
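On the remark that branch-free SIMD code takes the same time for any input: the idea is that both candidate results are computed and a bit mask picks one, so no instruction is skipped based on the data. The scalar sketch below mimics that select-by-mask pattern. It is not taken from SLEEF, the names `select_f32` and `clamp_branchless` are illustrative, and a compiler is free to emit different machine code than the source suggests.

```c
#include <stdint.h>
#include <string.h>

/* Return a where mask is all ones, b where mask is all zeros. */
static float select_f32(uint32_t mask, float a, float b) {
    uint32_t ua, ub, ur;
    float r;
    memcpy(&ua, &a, sizeof ua);     /* reinterpret float bits safely */
    memcpy(&ub, &b, sizeof ub);
    ur = (ua & mask) | (ub & ~mask);
    memcpy(&r, &ur, sizeof r);
    return r;
}

/* Clamp x into [lo, hi] without an if statement: each comparison is
   turned into an all-ones or all-zeros mask that drives the select. */
float clamp_branchless(float x, float lo, float hi) {
    uint32_t below = 0u - (uint32_t)(x < lo);
    uint32_t above = 0u - (uint32_t)(x > hi);
    float r = select_f32(below, lo, x);
    return select_f32(above, hi, r);
}
```

In real SIMD code the comparison and blend are single vector instructions, which is why handling a wider input range costs extra work on every call rather than only on out-of-range inputs.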
@fpetrogalli-arm, There is also no branch in the AVX / SSE MathFun function and it yields the same order of Relative Error yet it is faster. I will test it on [-100, 100]. I'd bet the results will be the same.
Either MSVC compilation hurts performance or it is something really tricky. I want to give GCC a try but it seems the project doesn't support MinGW64 on Windows.
Anyhow, I'm happy to see there is Sleef in the world. Yet if I look for a portable (Support for Windows, macOS, Linux) Math Library, SVML is both faster (On my machine) and easier to use (Unless you want to start using Emulation of Unix terminal in Windows).
@fpetrogalli-arm , @shibatch , I ran the test on the range [-100, 100]. I put all data into a MATLAB Struct so you can have a look as well (See ZIP file below). In the struct you'll find the input values (Uniform on the range [-100, 100]). You will also find MATLAB's output as reference.
There are 8 other fields:
MathFun indeed doesn't work well outside the [-80, 80] range. Though within this range it is as accurate as Sleef but faster. MathFun Fast is a little less accurate but 30% faster than MathFun, which is faster than Sleef to begin with. SVML is both more accurate than Sleef and faster.
Again, it might be a compiler thing.
Once it works with MinGW64 (Using cmake -G"MinGW Makefiles" -DBUILD_SHARED_LIBS=FALSE ..) I will re run the test and maybe Sleef will get faster.
Anyhow, amazing work!
@shibatch,
I managed to compile sleef as a static library with mingw-w64 gcc-7.2.0 and with the help of the msys2 shell. However, I didn't manage to link to this library:
$ gcc simple_test.c -o simple_test -I ./include -L ./lib -lsleef
D:\devel\tmp\msys64\tmp\ccjDOnde.o:simple_test.c:(.text+0xa9): undefined reference to `__imp_Sleef_powd2_u10'
collect2.exe: error: ld returned 1 exit status
It seems, that SLEEF_STATIC_LIBS is not defined and the functions are exported as __declspec(dllexport).
How to set SLEEF_STATIC_LIBS correctly?
@carlkl Please specify -DSLEEF_STATIC_LIBS to the compiler. I need to check how other libraries are handling this problem.
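The `__imp_` prefix in the linker error is what a caller references when a function is declared with `__declspec(dllimport)`, which is why a static build needs a define that switches the attribute off. The sketch below shows the general header pattern with made-up names (`MYLIB_STATIC`, `MYLIB_BUILDING_DLL`, `MYLIB_API`, `mylib_square`); it is not a copy of sleef.h, whose actual macro layout may differ.

```c
/* Normally passed on the command line as -DMYLIB_STATIC when linking the
   static library; defined here so the sketch builds as one file. */
#define MYLIB_STATIC

#if defined(_WIN32) && !defined(MYLIB_STATIC)
  #ifdef MYLIB_BUILDING_DLL
    #define MYLIB_API __declspec(dllexport)  /* building the DLL itself */
  #else
    #define MYLIB_API __declspec(dllimport)  /* caller links via __imp_<name> */
  #endif
#else
  #define MYLIB_API                          /* static or non-Windows: plain symbol */
#endif

/* Public declaration as it would appear in the library header. */
MYLIB_API double mylib_square(double x);

/* Definition as it would appear inside the library. */
double mylib_square(double x) {
    return x * x;
}
```

For the case in this thread that would mean adding -DSLEEF_STATIC_LIBS to the gcc command line, as suggested above.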
@RoyiAvital,
I compiled sleef with the help of mingw-w64 (gcc version 7.2.0 x86_64-posix-seh-rev1). Could you test the attached sleef.dll in comparison to the MSVC one? sleef-3.2_mingw-w64.zip
@carlkl,
Any chance you could make it a lib file so I will be able to use it as a static library?
Thank You.
You may try out this one (I didn't test it): sleef_VS2015.zip
@carlkl , I see in its name it is called VS 2015, does it mean it was compiled with VS 2015? As I already have compiled my own version with VS 2017. I was looking for compatibility with GCC (Without requirement for Unix Terminal Emulation).
@RoyiAvital,
the libraries and the DLL are compiled with GCC https://github.com/shibatch/sleef/issues/172#issuecomment-369282789; the dynamic import library for sleef.dll is included in sleef_VS2015.zip and was created with the help of VS2015.
Hence two variants of static libraries are available for testing:
Both static library files are compiled with GCC, the latter one is archived by VS2015.
It would be good to get a feedback.
@carlkl ,
I previously created projects based on the sleef.lib created by Visual Studio 2017.
The file was ~3.2 MB and was replaced by the sleef.lib (800 KB) in your sleef_VS2015.zip.
Then the project fails stating:
Error LNK2019 unresolved external symbol Sleef_expf8_u10avx2 referenced in function main MathLibAnalysis
The same project works perfectly with the VS 2017 library.
Thank You.
I now tested both variants of the static import libraries myself and failed with both (VS2015). I will come up with a new one soon. For now you may try the dynamic import library for VS included in sleef_VS2015.zip to link against the sleef.dll included in sleef-3.2_mingw-w64.zip. This worked for me in the small test program given on the sleef.org website. The sleef.dll has to be placed alongside the simple_test.exe. I made some small changes to the program header.
#include <stdio.h>

/* SSE intrinsics for GCC and MSVC */
#if defined(_MSC_VER)
#include <intrin.h>
#elif defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <x86intrin.h>
#endif

/* Using SSE2 is default for MSVC 64bit. However __SSE2__ is not defined but needed by sleef.h */
#if defined(_MSC_VER)
#define __SSE2__
#endif

#include "sleef.h"

int main(int argc, char **argv) {
    double a[] = {2, 10};
    double b[] = {3, 20};
    __m128d va, vb, vc;

    /* Load two doubles per register, compute pow lane by lane, store back */
    va = _mm_loadu_pd(a);
    vb = _mm_loadu_pd(b);
    vc = Sleef_powd2_u10(va, vb);

    double c[2];
    _mm_storeu_pd(c, vc);

    printf("pow(%g, %g) = %g\n", a[0], b[0], c[0]);
    printf("pow(%g, %g) = %g\n", a[1], b[1], c[1]);
    return 0;
}
/*
usage:
./simple_test
pow(2, 3) = 8
pow(10, 20) = 1e+20
*/
I uploaded compiled libraries for windows.
https://github.com/shibatch/sleef/releases/download/3.2/sleef-3.2-win.zip
@carlkl , Let me know when you have updated version of the Static Lib compiled with GCC. I'd be happy to check it out.
But the real deal is to have GCC support on CMake on Windows.
Hello, I'm trying to install Sleef on Windows. My system is Windows 10 Pro 64 Bit with Visual Studio 2017 (15.5.6).
The problem is Cmake requires a file called CMakeLists.txt. It seems this file is available in the master branch of GitHub yet it is not part of the Releases. If one downloads the ZIP from GitHub Releases or SourceForge, he doesn't get the CMakeLists.txt file.
What should I do? Go with Master (Which I assume isn't validated as stable releases) or is there any other way?
Thank You.