robbmcleod / pyfastnoisesimd

Python module wrapping C++ FastNoiseSIMD
BSD 3-Clause "New" or "Revised" License
39 stars 6 forks source link

Compile with per-file flags #9

Closed QuLogic closed 6 years ago

QuLogic commented 6 years ago

Instead of doing feature detection, compile all files and let the underlying library do feature detection at runtime. As I mentioned in the other ticket, I could not find a way builtin to distutils/setuptools to do this automatically, so I have to patch the compile method to compile files individually with extra flags as necessary.

This is WIP only insomuch as there's no way to turn off any of the default flags. This is only necessary if your compiler does not support the flags, so I'm not sure how necessary that is. (Turning on flags can be done by setting CFLAGS as usual.)

Note, this is based on my two other PRs. I can rebase those out when they're merged.

Fixes #7

QuLogic commented 6 years ago

Hrm, well, since Travis failed, I guess I still need to be able to turn off the AVX512 flags, at the very least.

robbmcleod commented 6 years ago

What are you trying to do, build conda-forge repos or something?

It's a general problem with wheels that they support limiting on address size, OS, and Python version, but not SIMD levels. There's no Python stdlib support for SIMD detection either. So I wrote a quick replacement for cpuinfo.py at https://github.com/robbmcleod/cpufeature but it will need a lot of testing.

The notion of cross-compiling SSE4.1 and AVX2 is fine for x64 wheels. For x86 you can just do SSE4.1. I would be fine for making that the default behavior for bdist_wheel.

Neon ARM flags are on or not, that's a totally different architecture.

TravisCI's Linux workers don't seem to get along with SIMD instructions. I've seen this problem also with python-blosc for example, where they randomly don't work. You can restart them and sometimes they do complete. They're ok for MacOSX but for Linux I suspect another service is required. CircleCI is maybe an option but I haven't had the time to learn their system.

QuLogic commented 6 years ago

Not conda-forge packages, but Fedora.

Theoretically, there should be no problem with SIMD level, because FastNoiseSIMD can dispatch to the correct implementation at runtime. The main problem is only if the compiler doesn't support it at all.

I don't think you can assume that 32-bit is only SSE4.1; I can compile in 32-bit mode, run it on a 64-bit system and still get SIMD level 3 (AVX2).

robbmcleod commented 6 years ago

I raised an issue with Travis, we'll see where it leads:

https://github.com/travis-ci/travis-ci/issues/9006

QuLogic commented 6 years ago

I believe this does what I want it to, compiling each instruction-set-specific file with its own flag. It's also auto-detected based on compiler flag support which ones should be build, or you can be explicit by passing, e.g., --with-avx512=yes, to build (though only if your compiler supports it.)

Still appear to be hitting that Travis bug though.

robbmcleod commented 6 years ago

Hi Elliott,

Can we accept this as is for the moment? I would like to make a new release and I suspect it will take a bit of time to resolve the Travis issues or swap over to another service.

QuLogic commented 6 years ago

I think it works for me at least. On Travis, it's doing what I expect, checking for AVX512 compiler support, and disabling it because it's not supported. The build failure is later, as far as I can tell, and shouldn't be because of this change.

If it works for you too, I think this is good.

robbmcleod commented 6 years ago

Ok thanks for your help. I'll see about this CI issue.