explosion / cython-blis

💥 Fast matrix-multiplication as a self-contained Python library – no system dependencies!
Other
219 stars 37 forks source link

add auto-generated blis.h for windows/generic target #67

Closed niyas-sait closed 2 years ago

niyas-sait commented 2 years ago

This change adds missing blis.h from flame\include\generic\blis.h for windows/generic target.

The issue has been identified while compiling binary wheels for windows/arm64 and the builds were failing with following error message:

clang.exe -c C:\Users\niysai01\Workspace\cython-blis\blis\_src\config\generic\bli_cntx_init_generic.c -o C:\Users\niysai01\AppData\Local\Temp\tmpolhvgz5e\bli_cntx_init_generic.o -O3 -std=c99 -D_POSIX_C_SOURCE=200112L -DBLIS_VERSION_STRING="0.7.0" -DBLIS_IS_BUILDING_LIBRARY -Iinclude\windows-generic -I.\frame\3\ -I.\frame\ind\ukernels\ -I.\frame\3\ -I.\frame\1m\ -I.\frame\1f\ -I.\frame\1\ -I.\frame\include -IC:\Users\niysai01\Workspace\cython-blis\blis\_src\include\windows-generic
In file included from C:\Users\niysai01\Workspace\cython-blis\blis\_src\config\generic\bli_cntx_init_generic.c:35:
.\frame\include\blis.h:59:10: fatal error: 'bli_config.h' file not found

This change fixes the above build error and we can successfully build a generic package for windows.

niyas-sait commented 2 years ago

@adrianeboyd Could you please review this patch ?

adrianeboyd commented 2 years ago

We'd appreciate it if you did not @ individual maintainers (we already get notifications for everything and it just makes our mentions hard to use) and please have more than two working days' patience before bumping threads.

In any case, thanks for the PR! We can add generic here, but this is going to be very slow for any future users due to the lack of auto-vectorization. I think that it would be better to add the cortexa57 or arm64 target in addition (or perhaps instead, although generic may still be useful for testing purposes). Until the next blis release, at least, I think one of these is going to be the best option? We can't really test it on our end, though, so I don't know for sure. Daniël de Kok has some benchmarking code here:

https://github.com/danieldk/gemm-benchmark

And if you can help a bit with testing, we can potentially update the automatic platform detection in setup.py, too, so users don't have to specify BLIS_ARCH=blah pip install ... for windows non-x86_64.

https://github.com/explosion/cython-blis/blob/80965fae1764a17d7f84836646c631c65fad5817/setup.py#L129-L141

niyas-sait commented 2 years ago

Thanks, @adrianeboyd for the review. And apologies for tagging you.

I've looked at adding cortexa57 or arm64 specific target for windows, I haven't managed to compile with that yet, and looks like it requires a bit more work. I guess we could stick with generic for now until cortexa57 support is added.

I've added the automatic platform detection for windows as you suggest. I've kept generic for any non-x86_64 platforms for now.

adrianeboyd commented 2 years ago

Hmm, it looks like the detection isn't working as intended and all windows builds are using generic now. Are you sure that platform.machine() returns x86_64 in windows? (I actually don't have any local windows machines, so I can't test this easily. One of the other developers on our team gets AMD64 but I don't know if this is consistent.)

You can search for BLIS_ARCH in the CI logs under "Build wheel" to see what it's chosen.

Maybe it would be better to choose generic for arm64 (or whatever hopefully predictable value you get here) and continue using x86_64 by default otherwise for windows? (There aren't any other options, are there?)

niyas-sait commented 2 years ago

Hmm, it looks like the detection isn't working as intended and all windows builds are using generic now. Are you sure that platform.machine() returns x86_64 in windows? (I actually don't have any local windows machines, so I can't test this easily. One of the other developers on our team gets AMD64 but I don't know if this is consistent.)

You can search for BLIS_ARCH in the CI logs under "Build wheel" to see what it's chosen.

Maybe it would be better to choose generic for arm64 (or whatever hopefully predictable value you get here) and continue using x86_64 by default otherwise for windows? (There aren't any other options, are there?)

Sorry I missed that. Looks like possible values for platform.machine() on windows are AMD64, IA64, x86, ARM64. I guess it is probably better to go with generic for arm64 and default to x86_64

niyas-sait commented 2 years ago

Looks like BUILD ARCH is correctly set to x86_64 now ( CI log )

I've tested Arm64 locally and it is correctly using generic target

adrianeboyd commented 2 years ago

Thanks! Let me see if I can additionally modify one CI test so that we're testing generic explicitly...

adrianeboyd commented 2 years ago

Nevermind, we can do it separately.

niyas-sait commented 2 years ago

Thanks! Let me see if I can additionally modify one CI test so that we're testing generic explicitly...

Yes, sounds like a good idea.

niyas-sait commented 2 years ago

Nevermind, we can do it separately.

@adrianeboyd Shall I add the CI changes to this patch or Is it better done in a separate patch?

adrianeboyd commented 2 years ago

I think it's okay to do it separately. We're not actually explicitly testing generic for the other platforms, either.