shibatch / sleef

SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
https://sleef.org
Boost Software License 1.0

Installation, Configuration and Usage Under Windows #172

Open RoyiAvital opened 6 years ago

RoyiAvital commented 6 years ago

Hello, I'm trying to install Sleef on Windows. My system is Windows 10 Pro 64 Bit with Visual Studio 2017 (15.5.6).

The problem is Cmake requires a file called CMakeLists.txt.
It seems this file is available in the master branch of GitHub yet it is not part of the Releases.
If one downloads the ZIP from GitHub Releases or SourceForge, one doesn't get the CMakeLists.txt file.

What should I do? Go with Master (which I assume isn't validated the way stable releases are), or is there any other way?

Thank You.

RoyiAvital commented 6 years ago

@carlkl , I'm not an expert. I generate DLL's with SSE, AVX, AVX2 on my Windows machine using GCC and they work.

For instance, have a look at - Fastest Implementation of Exponential Function Using AVX (Which is how I got to Sleef).

See the code displayed in Wim's Answer. I used that code and generated DLL on my system which runs perfectly. I'm not arguing that it works, I'm just adding my personal experience with it.

By the way, I use GCC 7.2.

Thank You.

shibatch commented 6 years ago

If there is no function call in the program, it works. Probably the function is inlined in your program. Try putting the functions in different files and see if it still works.

RoyiAvital commented 6 years ago

What do you mean? The DLL exposes functions to MATLAB. MATLAB calls this function (which uses AVX2) and gets the result. So there is a function call to the DLL from outside.

carlkl commented 6 years ago

@RoyiAvital, maybe you can get help from https://sourceforge.net/p/mingw-w64/mailman/mingw-w64-public

RoyiAvital commented 6 years ago

@carlkl , What do you mean help? I don't need any help.

I just tried compiling it with GCC and reported.

Next I will try with Intel Compiler.

RoyiAvital commented 6 years ago

Generating with Intel Compiler (ICC 18.0):

cmake -G"Visual Studio 15 2017 Win64" -T"Intel C++ Compiler 18.0" -DBUILD_SHARED_LIBS=FALSE ..
-- Selecting Windows SDK version 10.0.16299.0 to target Windows 10.0.14393.
-- The C compiler identification is Intel 18.0.1.20171018
-- Check for working C compiler: C:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018/windows/bin/intel64/icl.exe
-- Check for working C compiler: C:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018/windows/bin/intel64/icl.exe -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Setting build type to 'Release' (required for full support).
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of long double
-- Check size of long double - done
-- Performing Test COMPILER_SUPPORTS_FLOAT128
-- Performing Test COMPILER_SUPPORTS_FLOAT128 - Failed
-- Performing Test COMPILER_SUPPORTS_SSE2
-- Performing Test COMPILER_SUPPORTS_SSE2 - Success
-- Performing Test COMPILER_SUPPORTS_SSE4
-- Performing Test COMPILER_SUPPORTS_SSE4 - Success
-- Performing Test COMPILER_SUPPORTS_AVX
-- Performing Test COMPILER_SUPPORTS_AVX - Success
-- Performing Test COMPILER_SUPPORTS_FMA4
-- Performing Test COMPILER_SUPPORTS_FMA4 - Failed
-- Performing Test COMPILER_SUPPORTS_AVX2
-- Performing Test COMPILER_SUPPORTS_AVX2 - Success
-- Performing Test COMPILER_SUPPORTS_AVX512F
-- Performing Test COMPILER_SUPPORTS_AVX512F - Success
-- Could NOT find OpenMP_C (missing: OpenMP_libiomp5md_LIBRARY) (found version "5.0")
-- Could NOT find OpenMP (missing: OpenMP_C_FOUND)
-- Performing Test COMPILER_SUPPORTS_WEAK_ALIASES
-- Performing Test COMPILER_SUPPORTS_WEAK_ALIASES - Failed
-- Performing Test COMPILER_SUPPORTS_BUILTIN_MATH
-- Performing Test COMPILER_SUPPORTS_BUILTIN_MATH - Failed
-- Unroll target for DP : unroll_0_purecdp.c;unroll_1_purecdp.c;unroll_2_purecdp.c;unroll_3_purecdp.c;unroll_0_sse2dp.c;unroll_1_sse2dp.c;unroll_2_sse2dp.c;unroll_3_sse2dp.c;unroll_0_avxdp.c;unroll_1_avxdp.c;unroll_2_avxdp.c;unroll_3_avxdp.c;unroll_0_avx2dp.c;unroll_1_avx2dp.c;unroll_2_avx2dp.c;unroll_3_avx2dp.c;unroll_0_avx512fdp.c;unroll_1_avx512fdp.c;unroll_2_avx512fdp.c;unroll_3_avx512fdp.c
-- Unroll target for SP : unroll_0_purecsp.c;unroll_1_purecsp.c;unroll_2_purecsp.c;unroll_3_purecsp.c;unroll_0_sse2sp.c;unroll_1_sse2sp.c;unroll_2_sse2sp.c;unroll_3_sse2sp.c;unroll_0_avxsp.c;unroll_1_avxsp.c;unroll_2_avxsp.c;unroll_3_avxsp.c;unroll_0_avx2sp.c;unroll_1_avx2sp.c;unroll_2_avx2sp.c;unroll_3_avx2sp.c;unroll_0_avx512fsp.c;unroll_1_avx512fsp.c;unroll_2_avx512fsp.c;unroll_3_avx512fsp.c
-- The testing program for DFT is currently not available with MSVC build - skip building tests dft-tester
-- Configuring build for SLEEF-v3.1
   Target system: Windows-10.0.14393
   Target processor: AMD64
   Host system: Windows-10.0.14393
   Host processor: AMD64
   Detected C compiler: Intel @ C:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018/windows/bin/intel64/icl.exe
-- Using option `/D_CRT_SECURE_NO_WARNINGS  ` to compile libsleef
-- Building shared libs : FALSE
-- MPFR : LIB_MPFR-NOTFOUND
-- GMP : LIBGMP-NOTFOUND
-- RUNNING_ON_TRAVIS : 0
-- COMPILER_SUPPORTS_OPENMP :

*** Note: Parallel build is not supported on Microsoft Visual Studio
-- Configuring done
-- Generating done
-- Build files have been written to:

Now I will do the build.

carlkl commented 6 years ago

@RoyiAvital

I just found this:

use the -fno-asynchronous-unwind-tables flag

https://stackoverflow.com/questions/43152633/invalid-register-for-seh-savexmm-in-cygwin ... If you don't need Windows Structured Exception support you can try using the -fno-asynchronous-unwind-tables option. This may however just mask some other underlying problem. Also some of the AVX512 instruction sets you've enabled are only supported on the Intel Xeon Phi x200, unless you're running Windows on one of those your code may not work. – Ross Ridge Apr 1 '17 at 4:35

BTW: can you show the cmake command for mingw again?

RoyiAvital commented 6 years ago

I'd pass AVX512 completely if I could for GCC on Windows compatibility. Anyhow, I'm not even sure how to do it :-).

The Intel build fails as well. Since the report is so large and there are many errors (I'm not sure those are all), I pasted the screen here:

https://paste.ee/p/83Fh0

If it helps you get ICC compatibility on Windows it is great.

Thank You.

shibatch commented 6 years ago

I cannot do it without ICC for Windows.

RoyiAvital commented 6 years ago

@shibatch , Yea, you told me. I thought maybe the error would be a trivial thing. But it seems it is not.

Hopefully some capable student who can get it for free will give it a try. I appreciate your work and assistance. Sleef looks lovely!

Thank You.

shibatch commented 6 years ago

I added preliminary support for MinGW.

https://github.com/shibatch/sleef/archive/Better_support_for_mingw.zip

AVX functions seem to be working. I cannot explain why it works.

RoyiAvital commented 6 years ago

Great News.

Going to try it. So far, the Generation Process:

cmake -G"MinGW Makefiles" -DBUILD_SHARED_LIBS=FALSE ..
-- The C compiler identification is GNU 7.2.0
-- Check for working C compiler: C:/Applications/MinGW/bin/gcc.exe
-- Check for working C compiler: C:/Applications/MinGW/bin/gcc.exe -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Setting build type to 'Release' (required for full support).
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of long double
-- Check size of long double - done
-- Performing Test COMPILER_SUPPORTS_LONG_DOUBLE
-- Performing Test COMPILER_SUPPORTS_LONG_DOUBLE - Success
-- Performing Test COMPILER_SUPPORTS_FLOAT128
-- Performing Test COMPILER_SUPPORTS_FLOAT128 - Success
-- Performing Test COMPILER_SUPPORTS_SSE2
-- Performing Test COMPILER_SUPPORTS_SSE2 - Success
-- Performing Test COMPILER_SUPPORTS_SSE4
-- Performing Test COMPILER_SUPPORTS_SSE4 - Success
-- Performing Test COMPILER_SUPPORTS_AVX
-- Performing Test COMPILER_SUPPORTS_AVX - Success
-- Performing Test COMPILER_SUPPORTS_FMA4
-- Performing Test COMPILER_SUPPORTS_FMA4 - Success
-- Performing Test COMPILER_SUPPORTS_AVX2
-- Performing Test COMPILER_SUPPORTS_AVX2 - Success
-- Performing Test COMPILER_SUPPORTS_AVX512F
-- Performing Test COMPILER_SUPPORTS_AVX512F - Success
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Performing Test COMPILER_SUPPORTS_OPENMP
-- Performing Test COMPILER_SUPPORTS_OPENMP - Success
-- Performing Test COMPILER_SUPPORTS_WEAK_ALIASES
-- Performing Test COMPILER_SUPPORTS_WEAK_ALIASES - Success
-- Performing Test COMPILER_SUPPORTS_BUILTIN_MATH
-- Performing Test COMPILER_SUPPORTS_BUILTIN_MATH - Success
-- Configuring build for SLEEF-v3.2
   Target system: Windows-10.0.14393
   Target processor: AMD64
   Host system: Windows-10.0.14393
   Host processor: AMD64
   Detected C compiler: GNU @ C:/Applications/MinGW/bin/gcc.exe
-- Using option `-Wall -Wno-unused -Wno-attributes -Wno-unused-result -Wno-psabi -ffp-contract=off -fno-math-errno -fno-trapping-math -fno-asynchronous-unwind-tables` to compile libsleef
-- Building shared libs : FALSE
-- MPFR : LIB_MPFR-NOTFOUND
-- GMP : LIBGMP-NOTFOUND
-- RUNNING_ON_TRAVIS : 0
-- COMPILER_SUPPORTS_OPENMP : 1
-- Configuring done
-- Generating done
-- Build files have been written to:

It seems the flag -- /maxcpucount:1 isn't supported for this build. It makes me wonder, in what cases does Sleef use multi-threading? Does it have functions which work on arrays?

RoyiAvital commented 6 years ago

OK, I tried running the build process. This is the error I get:

Scanning dependencies of target headers
[ 13%] Generating ../../include/sleef.h
Generating sleef.h: mkrename "2" "4" "__m128d" "__m128" "__m128i" "__m128i" "__SSE2__"
Generating sleef.h: mkrename "2" "4" "__m128d" "__m128" "__m128i" "__m128i" "__SSE2__" "sse2"
Generating sleef.h: mkrename "2" "4" "__m128d" "__m128" "__m128i" "__m128i" "__SSE2__" "sse4"
Generating sleef.h: mkrename "4" "8" "__m256d" "__m256" "__m128i" "struct { __m128i x, y; }" "__AVX__"
Generating sleef.h: mkrename "4" "8" "__m256d" "__m256" "__m128i" "struct { __m128i x, y; }" "__AVX__" "avx"
Generating sleef.h: mkrename "4" "8" "__m256d" "__m256" "__m128i" "struct { __m128i x, y; }" "__AVX__" "fma4"
Generating sleef.h: mkrename "4" "8" "__m256d" "__m256" "__m128i" "__m256i" "__AVX__" "avx2"
Generating sleef.h: mkrename "2" "4" "__m128d" "__m128" "__m128i" "__m128i" "__SSE2__" "avx2128"
Generating sleef.h: mkrename "8" "16" "__m512d" "__m512" "__m256i" "__m512i" "__AVX512F__"
Generating sleef.h: mkrename "8" "16" "__m512d" "__m512" "__m256i" "__m512i" "__AVX512F__" "avx512f"
'cat' is not recognized as an internal or external command,
operable program or batch file.
mingw32-make.exe[2]: *** [src\libm\CMakeFiles\headers.dir\build.make:90: include/sleef.h] Error 1
mingw32-make.exe[2]: *** Deleting file 'include/sleef.h'
mingw32-make.exe[1]: *** [CMakeFiles\Makefile2:525: src/libm/CMakeFiles/headers.dir/all] Error 2
mingw32-make.exe: *** [Makefile:140: all] Error 2

shibatch commented 6 years ago

Build it on cygwin shell. Cygwin has a package for mingw gcc.

RoyiAvital commented 6 years ago

@shibatch , Any chance trying it on MinGW64 and not Cygwin?

By the way, any chance of enabling -- /maxcpucount:1 for GCC? I still don't get where in Sleef multi-threading is used (or is it only DFT related?), as I couldn't find a function which works on arrays.

shibatch commented 6 years ago

Cygwin is used just for building library. Cygwin dlls are not required to execute the functions. -- /maxcpucount:1 option is only for MSVC. You don't need this for gcc.

Libm functions in sleef can be used in multi-threaded code.

RoyiAvital commented 6 years ago

Any chance of supporting MinGW64 directly and not through Cygwin? I don't know how to use Cygwin.

Regarding multi-threading, what is the point of enabling MP on data only as big as 128 / 256 / 512 bits? I thought Libm functions have multi-threaded versions when applied on arrays.

shibatch commented 6 years ago

It's a good opportunity for you to learn how to use unix shells. Sleef is a building block for creating high performance software.

RoyiAvital commented 6 years ago

Hi,

I did some tests with MSVC compiled Library. I compared exp() for SSE4 and AVX2.

I tried 3 versions:

  1. Intel SVML.
  2. Code based on Fastest Implementation of Exponential Function Using AVX (Easily adaptable to SSE).
  3. Sleef.

I tested run time on my machine (Core i7 6800K, 32 GB) and compared relative error to MATLAB's exp(). It seems Sleef was the slowest while all of them had error of less than 5e-7 on the range [-80, 80].

I don't know if it is due to MSVC. I tried using GCC but Cygwin isn't working out for me. It would be great to support cmake -G"MinGW Makefiles" -DBUILD_SHARED_LIBS=FALSE .. so we could test it more.

Thank You.

shibatch commented 6 years ago

It's not very easy to assess the accuracy of math functions. Micro-benchmarking is not easy either. It's very delicate. First of all, how accurate is MATLAB's exp function? I cannot believe that that fast implementation of exp has comparable accuracy to SVML or SLEEF. The accuracy of SVML functions can be chosen via command line options. You seem to be measuring the absolute error, but that's not an appropriate way.

shibatch commented 6 years ago

I seriously recommend you to learn how to use a unix shell or linux. How about installing ubuntu OS on your computer in a virtual machine? Try virtualbox. It's so easy and free. A unix shell is also easy to learn, since you already know how to use command prompt. It's like using "ls" instead of "dir".

RoyiAvital commented 6 years ago

Hi, but I need the Library on Windows. I have a VirtualBox with Linux, but Windows' performance is better (except HD) and I prefer its polished experience.

In this case, since the commands to build and compile are the same for Linux & Windows (also for macOS), I don't see why it would matter. It is only a matter of checking why cmake -G"MinGW Makefiles" -DBUILD_SHARED_LIBS=FALSE .. doesn't work with MinGW64. I only suggest that so Windows users will be able to enjoy better performance if GCC allows it.

Regarding accuracy, MATLAB's is IEEE-754 compliant. The accuracy was measured by:

abs(expMATLAB(x) - expSleef(x)) / abs(expMATLAB(x))

Where all of those were compiled the same way (Sleef was linked from an MSVC build). The value of x ranged from -80 to 80 (1e6 samples, uniform distribution on the range). All 3 had the same error (~less than 1e-7, which is perfect for me).

Thank You.

shibatch commented 6 years ago

I think the maintainers of cmake expect users to use msys, then.

IEEE-754 compliance is another thing. The properties of that standard are required to make the computation accurate.

If you only need 1e-7 of relative error, then you can use float functions instead of double-precision functions.

RoyiAvital commented 6 years ago

All the above are using _ps (namely, floating point, single precision). Again, Sleef is just slower (~30% slower).

I'm not sure what you mean by:

expect users to use msys

cmake supports MinGW. The project is not configured to run with it.

shibatch commented 6 years ago

The execution speed depends on a few things, and the compiler also affects the performance. You also need to check the specification of each library regarding accuracy. Accurate functions tend to require more time for computation.

It is possible that MinGW is not well supported by cmake. At least I confirmed that it works with MSYS.

carlkl commented 6 years ago

@RoyiAvital ,

today the combination of msys2 and its mingw-w64 based toolchains and libraries is the gold standard for using GCC toolchains targeting the win32 subsystem (32bit and 64bit).

see https://github.com/msys2/msys2/wiki

msys2 and mingw-w64 are supported on appveyor btw.

RoyiAvital commented 6 years ago

@shibatch , MinGW works perfectly with Cmake on Windows. Since I learned it a few days ago I use it every day now. I really think that if you go to Windows, download the MinGW distribution I linked and check, you'll see what the error is and be able to fix it easily.

Regarding accuracy, well, all of them have the same accuracy more or less (~1-3e-7, in the range above). Sleef is just slower than the others. You can check against the code on StackOverflow by yourself.

@carlkl , I'm not even sure what MSYS2 is. You support GCC, right? So MinGW gives you GCC on Windows. I try to compile with it and it won't work. I guess it has to do with flags for Windows which are only under the MSVC path and should also be used for GCC / MinGW.

I'm a Windows user. I have experience with Linux and macOS and I find Windows to be better. Please, don't try to move me from Windows. I'm here as a user and I can try to assist with my limited understanding. That's all.

shibatch commented 6 years ago

I have been using MinGW since the era of gcc-2.95.

I checked the code. Sleef is slower because it accepts wider range of input.

RoyiAvital commented 6 years ago

@shibatch , I meant MinGW on Windows (in case there are other options, I'm not sure). If you do work with MinGW, why doesn't cmake -G"MinGW Makefiles" -DBUILD_SHARED_LIBS=FALSE .. work on Windows? Why the need for MSYS2 or other wrappers?

Regarding speed. What do you mean? If you talk about the CPU Dispatch, then I use __m128 Sleef_expf4_u10sse4(__m128 a); for SSE and __m256 Sleef_expf8_u10avx2(__m256 a); for AVX so no dispatching.

What's strange is that on your Benchmark Page you clearly state it is faster than SVML while in my test it is not.

shibatch commented 6 years ago

MSYS can be regarded as a part of mingw.

Computation speed is delicate. It depends on many things. You didn't even check the accuracy specification of each function. The way you did for measuring accuracy is not correct.

RoyiAvital commented 6 years ago

Please let me know how. I measure relative error and actually after checking it, Intel SVML is more accurate.

Again, I don't understand. You compare it with SVML (Which means they have similar accuracy) and show it has faster performance yet it doesn't.

But let's get back to the real thing. Why doesn't cmake -G"MinGW Makefiles" -DBUILD_SHARED_LIBS=FALSE .. work? It has nothing to do with MSYS or anything else, as it should generate a proper build that works when used with make. Yet on Windows (not a virtual Linux shell, but Windows) it doesn't.

Thank You.

shibatch commented 6 years ago

cat command is included in msys. Making it work without msys would make the code dirty, and so I don't want to do that.

Measuring accuracy correctly is not what I teach you at a place like this.

RoyiAvital commented 6 years ago

@shibatch , Could you share installation command on MSYS2?

Leave accuracy aside. In this benchmark (image), you state that your exponent for __m128 in single precision is faster than the SVML equivalent. How come I get different results?

carlkl commented 6 years ago

@RoyiAvital,

the mingw-w64 installation commands are given in the https://github.com/shibatch/sleef/tree/Better_support_for_mingw branch. See appveyor.yml

shibatch commented 6 years ago

Installation command is like "mkdir build;cd build;cmake ..;make".

Did you specify the correct options for icc? I see that input range is different. The benchmarking tool is included in the package, under src/libm-benchmarks.

RoyiAvital commented 6 years ago

@carlkl , @shibatch , I'm sorry. I launched MSYS2 terminal. What should I do to generate GCC compiled Libraries?

@shibatch , The range won't change the run time. I really think that Intel is faster (Maybe Sleef 3.2 is faster, as the test states 3.2 while I have 3.1, no?). At least when Sleef master is compiled with MSVC it is slower than Intel's SVML on exp().

Anyhow, Sorry, but it seems you only want to work in Linux style. I'm a happy Windows user (Also happy Linux Mint user, Though for development nothing like Windows + MATLAB + Visual Studio for me) so I guess this is not for me. If you happen to support using GCC on Windows like MSVC (Namely without emulation of Unix's bash, Just using the command line and Cmake like for MSVC) I'd be happy to be your test guy.

Thank You.

fpetrogalli commented 6 years ago

The range won't change the run time.

This statement is true for SLEEF, because there are no branches in the SIMD code. I am not sure whether it is true for MATLAB or SVML, they might have a different algorithm. It might be worth checking what is the performance of SVML and MATLAB in the range supported by SLEEF, and also check the error. What are the results of your comparison if you check the performance in the range -100 to 100?

I really think that Intel is faster (Maybe Sleef 3.2 is faster, as the test states 3.2 while I have 3.1, no?).

If you are using master branch in github, you are essentially using 3.2.

RoyiAvital commented 6 years ago

@fpetrogalli-arm, There is also no branch in the AVX / SSE MathFun functions, and they yield the same order of relative error, yet they are faster. I will test it on [-100, 100]. I'd bet the results will be the same.

Either MSVC compilation hurts performance or it is something really tricky. I want to give GCC a try but it seems the project doesn't support MinGW64 on Windows.

Anyhow, I'm happy to see there is Sleef in the world. Yet if I look for a portable (support for Windows, macOS, Linux) Math Library, SVML is both faster (on my machine) and easier to use (unless you want to start using an emulation of a Unix terminal in Windows).

RoyiAvital commented 6 years ago

@fpetrogalli-arm , @shibatch , I ran the test on the range [-100, 100]. I put all data into a MATLAB Struct so you can have a look as well (See ZIP file below). In the struct you'll find the input values (Uniform on the range [-100, 100]). You will also find MATLAB's output as reference.

There are 8 other fields:

  1. Sleef: SSE + AVX.
  2. MathFun: SSE + AVX.
  3. MathFun Fast: SSE + AVX.
  4. SVML: SSE + AVX.

MathFun indeed doesn't work well outside the [-80, 80] range, though within this range it is as accurate as Sleef but faster. MathFun Fast is a little less accurate but 30% faster than MathFun, which is faster than Sleef to begin with. SVML is both more accurate than Sleef and faster.

Again, it might be a compiler thing. Once it works with MinGW64 (using cmake -G"MinGW Makefiles" -DBUILD_SHARED_LIBS=FALSE ..) I will re-run the test and maybe Sleef will get faster.

Anyhow, amazing work!

TestData.zip

carlkl commented 6 years ago

@shibatch,

I managed to compile sleef as a static library with mingw-w64 gcc-7.2.0 and with the help of the msys2 shell. However, I didn't manage to link to this library:

$ gcc simple_test.c -o simple_test -I ./include -L ./lib -lsleef
D:\devel\tmp\msys64\tmp\ccjDOnde.o:simple_test.c:(.text+0xa9): undefined reference to `__imp_Sleef_powd2_u10'
collect2.exe: error: ld returned 1 exit status

It seems that SLEEF_STATIC_LIBS is not defined, so the functions are declared as __declspec(dllimport) (hence the __imp_ prefix in the undefined reference).

How to set SLEEF_STATIC_LIBS correctly?

shibatch commented 6 years ago

@carlkl Please specify -DSLEEF_STATIC_LIBS to the compiler. I need to check how other libraries are handling this problem.

carlkl commented 6 years ago

@RoyiAvital,

I compiled sleef with the help of mingw-w64 (gcc version 7.2.0 x86_64-posix-seh-rev1). Could you test the attached sleef.dll in comparison to the MSVC one? sleef-3.2_mingw-w64.zip

RoyiAvital commented 6 years ago

@carlkl, Any chance you make it a lib file so I will be able to use it as static library?

Thank You.

carlkl commented 6 years ago

You may try out this one (I haven't tested it): sleef_VS2015.zip

RoyiAvital commented 6 years ago

@carlkl , I see in its name it is called VS 2015, does it mean it was compiled with VS 2015? As I already have compiled my own version with VS 2017. I was looking for compatibility with GCC (Without requirement for Unix Terminal Emulation).

carlkl commented 6 years ago

@RoyiAvital,

the libraries and the DLL are compiled with GCC https://github.com/shibatch/sleef/issues/172#issuecomment-369282789; the dynamic import library for sleef.dll is included in sleef_VS2015.zip and was created with the help of VS2015.

Hence two variants of static libraries are available for testing:

Both static library files are compiled with GCC, the latter one is archived by VS2015.

It would be good to get a feedback.

RoyiAvital commented 6 years ago

@carlkl ,

I previously created projects based on sleef.lib created by Visual Studio 2017.
The file was ~3.2 MB and was replaced by the sleef.lib (800 KB) in your sleef_VS2015.zip.

Then the project fails stating:

Error   LNK2019 unresolved external symbol Sleef_expf8_u10avx2 referenced in function main  MathLibAnalysis 

The same project works perfectly with the VS 2017 library.

Thank You.

carlkl commented 6 years ago

I have now tested both variants of the static import libraries myself and failed with both (VS2015). I will come up with a new one soon. For now you may try the dynamic import library for VS included in sleef_VS2015.zip to link against sleef.dll included in sleef-3.2_mingw-w64.zip. This worked for me in the small test program given on the sleef.org website. The sleef.dll has to be placed alongside the simple_test.exe. I made some small changes to the program header.

#include <stdio.h>

/* SSE intrinsics for GCC and MSVC */
#if defined(_MSC_VER)
#include <intrin.h>
#elif defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <x86intrin.h>
#endif

/* Using SSE2 is default for MSVC 64bit. However __SSE2__ is not defined but needed by sleef.h */
#if defined(_MSC_VER)
#define __SSE2__
#endif
#include "sleef.h"

int main(int argc, char **argv) {
  double a[] = {2, 10};
  double b[] = {3, 20};

  __m128d va, vb, vc;

  va = _mm_loadu_pd(a);
  vb = _mm_loadu_pd(b);

  vc = Sleef_powd2_u10(va, vb);

  double c[2];

  _mm_storeu_pd(c, vc);

  printf("pow(%g, %g) = %g\n", a[0], b[0], c[0]);
  printf("pow(%g, %g) = %g\n", a[1], b[1], c[1]);
}

/*
usage:

./simple_test
pow(2, 3) = 8
pow(10, 20) = 1e+20
*/

shibatch commented 6 years ago

I uploaded compiled libraries for windows.

https://github.com/shibatch/sleef/releases/download/3.2/sleef-3.2-win.zip

RoyiAvital commented 6 years ago

@carlkl , Let me know when you have updated version of the Static Lib compiled with GCC. I'd be happy to check it out.

But the real deal is to have GCC support on CMake on Windows.