Open Anacletus opened 6 years ago
Thanks, it works pretty well, but I have problems with `tf.reduce_sum` operations, which do not work with `tf.int32` inputs for some reason. The program fails with `Illegal instruction (core dumped)`. Any ideas what might be causing it?
I can't reproduce the error on my computer, at least using eager execution:
import tensorflow as tf
tf.enable_eager_execution()
b1=tf.get_variable("b1",dtype='int32',shape=[22],initializer=tf.random_uniform_initializer(-1,1))
res=tf.reduce_sum(b1)
res
runs OK and outputs:
import tensorflow as tf
a = tf.placeholder(dtype=tf.int32, shape=[None])
b = tf.reduce_sum(a)
sess = tf.Session()
print(sess.run(b, feed_dict={a: [1, 2, 3]}))
This fails for me. It works with `float32`, but also with `int16` and `int64`. Other `int32` operations such as addition also work. My CPU is AMD Phenom II N930, Python 3.6.6, Ubuntu 18.04 x86_64.
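Since an illegal instruction kills the whole interpreter, one way to narrow down which dtypes are affected is to run each case in its own subprocess and inspect how it exited. The following is only a sketch: `run_isolated` and `TEMPLATE` are hypothetical names made up for illustration, and the TensorFlow snippet simply mirrors the TF 1.x reproduction above.

```python
import signal
import subprocess
import sys


def run_isolated(snippet, timeout=120):
    """Run `snippet` in a fresh interpreter and describe how it ended."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", snippet],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timed out"
    if proc.returncode == 0:
        return "ok"
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child died from a signal.
        return "killed by " + signal.Signals(-proc.returncode).name
    return "failed with exit code %d" % proc.returncode


# Hypothetical template based on the TF 1.x reproduction above.
TEMPLATE = """
import tensorflow as tf
a = tf.placeholder(dtype=tf.{dtype}, shape=[None])
with tf.Session() as sess:
    print(sess.run(tf.reduce_sum(a), feed_dict={{a: [1, 2, 3]}}))
"""

if __name__ == "__main__":
    for dtype in ("int16", "int32", "int64", "float32"):
        print(dtype, "->", run_isolated(TEMPLATE.format(dtype=dtype)))
```

A dtype that hits an unsupported instruction should show up as `killed by SIGILL`, while an ordinary Python error (for example a missing TensorFlow install) is reported as a non-zero exit code instead.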
Your code works for me, and outputs:
6
I'm also running Ubuntu 18.04, so I suppose it's CPU-related. Maybe compiling with `-march=core2` generates instructions that are not available on your AMD CPU? Sorry.
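To check the CPU hypothesis on Linux, one could compare the flags listed in `/proc/cpuinfo` against the extensions a `-march=core2` build may assume. This is a rough sketch, not a definitive test: the `CORE2_FLAGS` set is an assumption taken from GCC's documented description of `core2` (MMX, SSE, SSE2, SSE3, SSSE3), and the function names are made up for illustration.

```python
# Assumed set of extensions implied by -march=core2, spelled the way
# /proc/cpuinfo reports them (SSE3 appears there as "pni").
CORE2_FLAGS = {"mmx", "sse", "sse2", "pni", "ssse3"}


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, required=CORE2_FLAGS):
    """Return the required flags that the CPU does not advertise, sorted."""
    return sorted(set(required) - parse_cpu_flags(cpuinfo_text))


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("missing:", missing_flags(f.read()) or "none")
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```

If the script reports a missing flag such as `ssse3`, a wheel built for `core2` may contain instructions the CPU cannot execute, which would be consistent with the `Illegal instruction` crash.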
@matus-pikuliak I'm not sure what your issues are, but it is likely due to @Anacletus building this wheel for the Intel Core 2 chip, whereas you have an AMD. Even if the ISA flags are similar, if I'm not mistaken, `mtune` and many other options are related to `march`.
I believe you would need to build tensorflow yourself on the AMD machine, or on another machine for `-march=amdfam10` or `-march=barcelona` (and should look into specifying `-opt=-mveclibabi="acml"` at build) to have tensorflow work without issues. I've never built for AMD, but that's definitely what I'd try if I were in your situation.
`march` flags and all other AMD related stuff in there (`amdfam10` in your case, I believe; read more at https://en.wikipedia.org/wiki/AMD_10h#%22Champlain%22_(45nm_SOI,_Quad-core) and https://en.wikipedia.org/wiki/List_of_AMD_Phenom_microprocessors#%22Champlain%22_(45_nm,_Quad-core)). These older builds are more important than "latest and greatest" GPU-optimized builds for current architecture, since vanilla TF does not support older chips at all anymore, so I'm sure other peeps would appreciate older AMD builds as well :+1:
Thank you! It worked perfectly!
Compiled with -march=core2 (without AVX) tensorflow-1.11.0-cp36-cp36m-linux_x86_64.whl