TensorFlow v1.11.0 CPU Only (march=core2, noAVX) Python 3.6.6, linux_x86_64

Anacletus commented 6 years ago

Compiled with -march=core2 (without AVX): tensorflow-1.11.0-cp36-cp36m-linux_x86_64.whl

matus-pikuliak commented 5 years ago

Thanks, it works pretty well, but I have a problem with tf.reduce_sum, which does not work with tf.int32 inputs for some reason. The program fails with "Illegal instruction (core dumped)". Any ideas what might be causing it?

Anacletus commented 5 years ago

I can't reproduce the error on my computer, at least using eager execution:

import tensorflow as tf

tf.enable_eager_execution()
b1=tf.get_variable("b1",dtype='int32',shape=[22],initializer=tf.random_uniform_initializer(-1,1))
res=tf.reduce_sum(b1) 
res

This runs OK and prints a result.

I'm using CPU: Intel(R) Core(TM)2 Duo CPU P8400 @ 2.26GHz, Python 3.6.6, and Linux x86_64. Which CPU do you have? Could you provide a short snippet to reproduce the error?

matus-pikuliak commented 5 years ago

import tensorflow as tf
a = tf.placeholder(dtype=tf.int32, shape=[None])
b = tf.reduce_sum(a)
sess = tf.Session()
print(sess.run(b, feed_dict={a: [1, 2, 3]}))

This fails for me. It works with float32, and also with int16 and int64. Other int32 operations, such as addition, also work. My CPU is an AMD Phenom II N930, with Python 3.6.6 on Ubuntu 18.04 x86_64.
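A minimal sketch for narrowing this down, assuming a POSIX system and the TF 1.x API used above: run the same snippet once per dtype, each in its own interpreter, so that a crash in one case does not take the others down. The helper names `run_isolated` and `check_dtypes` are made up for illustration.

```python
import signal
import subprocess
import sys

# The TF 1.x snippet from the report above, parameterised by dtype name.
SNIPPET = """
import tensorflow as tf
a = tf.placeholder(dtype=tf.{dtype}, shape=[None])
b = tf.reduce_sum(a)
with tf.Session() as sess:
    print(sess.run(b, feed_dict={{a: [1, 2, 3]}}))
"""


def run_isolated(code):
    """Run `code` in a fresh interpreter and describe how it ended."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code means "killed by that signal".
        return "killed by " + signal.Signals(-proc.returncode).name
    if proc.returncode != 0:
        return "exit code {}".format(proc.returncode)
    return "ok: " + proc.stdout.decode().strip()


def check_dtypes(dtypes=("float32", "int16", "int32", "int64")):
    """Try the reduce_sum snippet once per dtype, each in its own process."""
    return {dt: run_isolated(SNIPPET.format(dtype=dt)) for dt in dtypes}

# Example use (needs TensorFlow 1.x installed):
#     for dt, outcome in check_dtypes().items():
#         print(dt, outcome)
```

A result of "killed by SIGILL" for a dtype would correspond to the "Illegal instruction (core dumped)" message from the shell.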

Anacletus commented 5 years ago

Your code works for me and outputs 6. I'm also running Ubuntu 18.04, so I suppose it's CPU-related. Maybe compiling with -march=core2 generates instructions that are not available on your AMD CPU? Sorry.

evdcush commented 5 years ago

@matus-pikuliak I'm not sure what your issue is, but it is likely due to @Anacletus building this wheel for an Intel Core 2 chip, whereas you have an AMD one. Even if the ISA flags are similar, if I'm not mistaken, -mtune and many other options are derived from -march.
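One way to check this on Linux is to compare the flags the CPU reports in /proc/cpuinfo with what a given -march target assumes. The sketch below is only an illustration: the feature sets are my reading of the GCC x86 option descriptions (core2 includes SSSE3, which the amdfam10 description does not list), and the function names are hypothetical.

```python
import os

# Assumed feature sets, written with /proc/cpuinfo flag names ("pni" is SSE3).
# Based on the GCC x86 option descriptions; verify against your GCC version.
MARCH_FEATURES = {
    "core2": {"mmx", "sse", "sse2", "pni", "ssse3"},
    "amdfam10": {"mmx", "sse", "sse2", "pni", "sse4a", "abm", "3dnow", "3dnowext"},
}


def parse_cpu_flags(cpuinfo_text):
    """Return the flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(cpu_flags, march):
    """Features the -march target assumes but the CPU does not report."""
    return MARCH_FEATURES[march] - cpu_flags


if __name__ == "__main__":
    path = "/proc/cpuinfo"  # Linux only
    if os.path.exists(path):
        with open(path) as f:
            flags = parse_cpu_flags(f.read())
        for march in sorted(MARCH_FEATURES):
            print(march, "missing:", sorted(missing_features(flags, march)) or "none")
```

If the core2 line reports a missing feature, any wheel built with -march=core2 may contain instructions that the CPU cannot execute.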

I believe you would need to build TensorFlow yourself, either on the AMD machine or on another machine with -march=amdfam10 or -march=barcelona (and you should look into specifying --copt=-mveclibabi=acml at build time), to have TensorFlow work without issues. I've never built for AMD, but that's definitely what I'd try if I were in your situation.

1. Figure out which version of GCC you are working with.
2. Look up the "GNU Compiler Collection (GCC): x86 Options" documentation for your GCC version (GCC 7.3 docs).
3. Find the -march flags and all the other AMD-related options in there (amdfam10 in your case, I believe; read more at https://en.wikipedia.org/wiki/AMD_10h#%22Champlain%22_(45nm_SOI,_Quad-core) and https://en.wikipedia.org/wiki/List_of_AMD_Phenom_microprocessors#%22Champlain%22_(45_nm,_Quad-core)).
4. Do a custom build of TF, leaving the default config=opt as -march=native if you are building on the machine in question; otherwise, specify your -march.
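The steps above can be sketched roughly as follows. This is an illustration under assumptions, not a tested build recipe: it assumes `gcc` and `bazel` are on the PATH, that the pip-package target is `//tensorflow/tools/pip_package:build_pip_package` as in the TF 1.x source tree, and that passing `-c opt --copt=-march=...` is an acceptable stand-in for `--config=opt` when cross-building. The helper names are hypothetical.

```python
import re
import shutil
import subprocess


def gcc_version(gcc="gcc"):
    """Return the output of `gcc -dumpversion`, or None if gcc is missing."""
    if shutil.which(gcc) is None:
        return None
    out = subprocess.run([gcc, "-dumpversion"], stdout=subprocess.PIPE)
    return out.stdout.decode().strip()


def parse_march(help_target_output):
    """Extract the value GCC prints for -march= in `-Q --help=target` output."""
    match = re.search(r"^\s*-march=\s+(\S+)", help_target_output, re.MULTILINE)
    return match.group(1) if match else None


def detect_native_march(gcc="gcc"):
    """Ask GCC what -march=native resolves to on this machine, or None."""
    if shutil.which(gcc) is None:
        return None
    out = subprocess.run(
        [gcc, "-march=native", "-Q", "--help=target"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    ).stdout.decode()
    return parse_march(out)


def bazel_command(march=None):
    """Compose a build command; march=None keeps the config=opt default."""
    target = "//tensorflow/tools/pip_package:build_pip_package"
    if march is None:
        # Building on the target machine: config=opt defaults to -march=native.
        return ["bazel", "build", "--config=opt", target]
    # Building elsewhere: name the target architecture explicitly (assumption).
    return ["bazel", "build", "-c", "opt", "--copt=-march={}".format(march), target]


if __name__ == "__main__":
    print("GCC version:", gcc_version())
    print("-march=native resolves to:", detect_native_march())
    print(" ".join(bazel_command()))
    print(" ".join(bazel_command("amdfam10")))
```

The script only prints the commands. Running the actual build still requires going through TensorFlow's configure step first.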

These older builds are more important than "latest and greatest" GPU-optimized builds for current architecture, since vanilla TF does not support older chips at all anymore, so I'm sure other peeps would appreciate older AMD builds as well :+1:

JHorcasitas commented 5 years ago

Thank you! It worked perfectly!

yaroslavvb / tensorflow-community-wheels

TensorFlow v1.11.0 CPU Only (march=core2, noAVX) Python 3.6.6, linux_x86_64 #86