SSE4.1/SSE4.2 instruction support?

LGTrader commented 7 years ago

I wouldn't rush on this but when I use tensorflow-9999 under Keras I see the following messages:

2017-06-05 12:38:57.041389: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-05 12:38:57.041412: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.

They appear to be listed in my /etc/portage/make.conf file as

CPU_FLAGS_X86="aes mmx mmxext popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3"

but apparently they aren't being used in the tensorflow build. Should configure have set these up?

I sincerely doubt they would have any major speed advantage in my work today but who knows about the future or other users? I looked at the tensorflow install page but didn't see anything specific to turn these on. Maybe it's purely a gcc issue?
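For illustration, the entries in CPU_FLAGS_X86 correspond to GCC -m options, and TensorFlow's Bazel build accepts compiler options through --copt. Below is a minimal Python sketch of that mapping. The helper name bazel_copts and the lookup table are hypothetical and not part of the ebuild or of TensorFlow; flags without a direct GCC switch (such as mmxext) are skipped, which is an assumption of this sketch.

```python
from typing import List

# Hypothetical lookup table: Gentoo CPU_FLAGS_X86 name -> GCC option.
# Flags with no direct GCC switch (e.g. mmxext) are left out on purpose.
GCC_OPTION = {
    "aes": "-maes",
    "mmx": "-mmmx",
    "popcnt": "-mpopcnt",
    "sse": "-msse",
    "sse2": "-msse2",
    "sse3": "-msse3",
    "ssse3": "-mssse3",
    "sse4_1": "-msse4.1",
    "sse4_2": "-msse4.2",
}


def bazel_copts(cpu_flags_x86: str) -> List[str]:
    """Return Bazel --copt arguments for the recognised CPU flags.

    Hypothetical helper for illustration only; unknown flags are ignored.
    """
    return [
        "--copt=" + GCC_OPTION[flag]
        for flag in cpu_flags_x86.split()
        if flag in GCC_OPTION
    ]


if __name__ == "__main__":
    # The value from the make.conf line quoted above.
    flags = "aes mmx mmxext popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3"
    print(" ".join(bazel_copts(flags)))
```

The resulting arguments could be appended to a bazel build invocation; whether the ebuild ends up doing it this way is not settled in this thread.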

archenroot commented 7 years ago

Good catch - I will integrate the CPU flags into the ebuild. I think this is the right eclass: https://devmanual.gentoo.org/eclass-reference/flag-o-matic.eclass/index.html

Thanks. One of my goals is to enable all available hardware acceleration for neural network frameworks.

LGTrader commented 7 years ago

Kewl. No rush but I will test when it's available.

I'm coming at tensorflow in an attempt to replace, under Linux, an old neural net program I've used for 15 years under Windows that the company no longer supports. I'm not sure whether tensorflow will be the best for my needs but I will be looking seriously at it, both in the CPU and GPU environments.

archenroot commented 6 years ago

I was on a short vacation (3 weeks) and quite busy, but I will try to look at this over the weekend.

strelec commented 6 years ago

How was the weekend?

archenroot commented 6 years ago

Weekend was fine, but short :-))) Next week I am heading back home from Belgium to the Czech Republic for a 3-week vacation, so I will focus on this. Also, Amy (a Gentoo dev) has taken an interest in TensorFlow, so we will work on it together to get it just right.

archenroot / gentoo-overlay

SSE4.1/SSE4.2 instruction support? #23