Tencent / PhoenixGo

Go AI program which implements the AlphaGo Zero paper

AVX + AVX2 + FMA + AVX512 : Windows release does not support these instructions, please release windows precompiled versions for modern computers #74

Open wonderingabout opened 5 years ago

wonderingabout commented 5 years ago

Can you please publish 2 Windows releases for modern computers?

Computer from Google Cloud: Windows Server 2016, Tesla V100, Intel Xeon Phi (supports AVX512F)

[image: cpu instructionsv2]

After that, the games play without problems, but more slowly, because the Windows build does not use AVX / AVX2 / FMA / AVX512F.

On Ubuntu there is no problem using these modern CPU instructions.
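To see which of these instruction sets a machine actually advertises, something like the sketch below could help. It is only an illustration: the helper names (`simd_support`, `read_linux_flags`) are hypothetical and not part of PhoenixGo, and the file-reading part assumes a Linux `/proc/cpuinfo`, so on Windows you would have to paste the flag list from another tool into `simd_support` yourself.

```python
# Illustrative helper, not part of PhoenixGo: report which SIMD instruction
# sets appear in a Linux-style CPU "flags" line.
WANTED = ("avx", "avx2", "fma", "avx512f")


def simd_support(flags_line):
    """Return {name: bool} for each instruction set in WANTED.

    Accepts either a raw /proc/cpuinfo line ("flags : fpu avx ...")
    or just the space-separated flag names.
    """
    _, _, rest = flags_line.partition(":")
    # Without a colon, partition() leaves `rest` empty, so use the whole line.
    tokens = set((rest or flags_line).lower().split())
    # Exact token match, so "avx2" alone does not count as "avx".
    return {name: name in tokens for name in WANTED}


def read_linux_flags(path="/proc/cpuinfo"):
    """Return the first 'flags' line from /proc/cpuinfo, or '' if unavailable."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return line
    except OSError:
        pass  # e.g. not running on Linux
    return ""
```

For example, `simd_support(read_linux_flags())` on a Linux box returns a dict with one entry per instruction set. Note that this only tells you what the CPU supports, not what a given binary was compiled to use.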

Since TensorRT does not support a batch size higher than 4, there is little benefit to using Ubuntu instead of Windows (and it is harder to configure too). So I think these 2 releases for Windows and Mac would be greatly appreciated :+1:

Can you make these 2 releases for Windows (and Mac)?

Big thanks! @wodesuck

wodesuck commented 5 years ago

I will try building an avx/avx2 version. But since I don't have any PC with avx512, you may need to build that one yourself if you really want it.

wonderingabout commented 5 years ago

avx/avx2 should cover most use cases.

As for avx512, I already use it on Ubuntu 16.04 after compiling with bazel, so no, I don't need it. I just thought it was a good idea to add it.

The most important thing is an avx/avx2 release for Windows (and Mac).

wonderingabout commented 5 years ago

Any update on the avx/avx2 builds for Windows and Mac, @wodesuck?

wodesuck commented 5 years ago

Not yet. I ran into some problems while building, and haven't had time to fix them yet.

wonderingabout commented 5 years ago

OK. When you try this again, can you also add fma?

(My R7 1700 supports avx/avx2/fma.)

wonderingabout commented 5 years ago

@fiskerhuang @funionguo

wonderingabout commented 5 years ago

Some benchmarks on my GTX 1060:

It is much slower than on Ubuntu. Here it is 3000 sims per move with default settings (all time management settings disabled).

47s per move, versus 21s per move on Ubuntu with TensorRT (and around 25s per move without TensorRT).

See: [image: avxv1] [image: avxv2]

Compare it to my Ubuntu benchmarks (4000 sims per move with the same settings) here: https://github.com/wonderingabout/PhoenixGo/blob/faqv2-bazel-master/docs/benchmark-gtx1060.md
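Seconds per move only compare directly when both runs use the same number of sims per move, so a rough normalization to sims per second may be more telling. The sketch below uses only the figures quoted above (3000 sims at 47s on Windows, 21s on Ubuntu with TensorRT). Whether the 21s figure was measured at 3000 or at 4000 sims per move is not stated unambiguously, so both cases are computed as assumptions; the function name is illustrative.

```python
def sims_per_second(sims_per_move, seconds_per_move):
    """Throughput in simulations per second for one move."""
    return sims_per_move / seconds_per_move


# Figures quoted in this thread.
windows = sims_per_second(3000, 47)

# Assumption A: the Ubuntu + TensorRT run also used 3000 sims per move.
ubuntu_a = sims_per_second(3000, 21)
# Assumption B: it used 4000 sims per move, as in the linked benchmark page.
ubuntu_b = sims_per_second(4000, 21)

print(f"Windows:             {windows:.1f} sims/s")
print(f"Ubuntu (3000 sims):  {ubuntu_a:.1f} sims/s, {ubuntu_a / windows:.2f}x Windows")
print(f"Ubuntu (4000 sims):  {ubuntu_b:.1f} sims/s, {ubuntu_b / windows:.2f}x Windows")
```

Either way the gap is more than 2x, which is larger than the missing CPU instructions alone would obviously explain, so other factors (TensorRT, drivers) probably contribute too.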

I understand that it will take time, and I am not asking the developers to hurry,

but the Windows release is indeed much slower than the Linux release, due to the lack of avx / avx2 / fma (and possibly also because NVIDIA drivers are said to perform around 10% better on Ubuntu).