yaroslavvb / tensorflow-community-wheels

Place to upload links to TensorFlow wheels

TensorFlow v1.11.0 CPU Only (march=core2, noAVX) Python 3.6.6, linux_x86_64 #86

Open Anacletus opened 6 years ago

Anacletus commented 6 years ago

Compiled with -march=core2 (without AVX): tensorflow-1.11.0-cp36-cp36m-linux_x86_64.whl

matus-pikuliak commented 5 years ago

Thanks, it works pretty well, but I have a problem with tf.reduce_sum, which does not work with tf.int32 inputs for some reason. The program fails with "Illegal instruction (core dumped)". Any ideas what might be causing it?

Anacletus commented 5 years ago

I can't reproduce the error on my computer, at least using eager execution:

import tensorflow as tf

tf.enable_eager_execution()
b1=tf.get_variable("b1",dtype='int32',shape=[22],initializer=tf.random_uniform_initializer(-1,1))
res=tf.reduce_sum(b1) 
res

This runs OK and produces a result without errors.

I'm using CPU: Intel(R) Core(TM)2 Duo CPU P8400 @ 2.26GHz, Python 3.6.6, Linux x86_64. Which CPU do you have? Maybe you can provide a short snippet to try to reproduce the error?

matus-pikuliak commented 5 years ago

import tensorflow as tf
a = tf.placeholder(dtype=tf.int32, shape=[None])
b = tf.reduce_sum(a)
sess = tf.Session()
print(sess.run(b, feed_dict={a: [1, 2, 3]}))

This fails for me. It works with float32, and also with int16 and int64. Other int32 operations such as addition also work. My CPU is AMD Phenom II N930, Python 3.6.6, Ubuntu 18.04 x86_64.
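Since an illegal instruction kills the whole interpreter, one way to narrow down which dtype triggers it is to run each case in its own child process and look at the signal that ended it. The following is only a sketch under assumptions: `run_isolated`, `TEMPLATE` and `check_reduce_sum_dtypes` are hypothetical names made up for illustration, the template simply reuses the TF 1.x snippet above, and the signal detection relies on POSIX return-code behaviour.

```python
import signal
import subprocess
import sys


def run_isolated(snippet):
    """Run a Python snippet in a child process and describe how it exited."""
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by
        # that signal, e.g. SIGILL for "Illegal instruction".
        return signal.Signals(-proc.returncode).name
    return "ok" if proc.returncode == 0 else "exit code %d" % proc.returncode


# Hypothetical template: the reproduction snippet with the dtype left open.
TEMPLATE = (
    "import tensorflow as tf\n"
    "a = tf.placeholder(dtype=tf.{dtype}, shape=[None])\n"
    "b = tf.reduce_sum(a)\n"
    "print(tf.Session().run(b, feed_dict={{a: [1, 2, 3]}}))\n"
)


def check_reduce_sum_dtypes(dtypes=("int16", "int32", "int64", "float32")):
    """Return {dtype: outcome} for tf.reduce_sum, one child process per dtype."""
    return {d: run_isolated(TEMPLATE.format(dtype=d)) for d in dtypes}
```

Calling `check_reduce_sum_dtypes()` on the affected machine would be expected to report `SIGILL` only for the dtypes that hit an unsupported instruction, while the parent process keeps running.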

Anacletus commented 5 years ago

Your code works for me and outputs: 6. I'm also running Ubuntu 18.04, so I suppose it's CPU-related. Maybe compiling with -march=core2 generates instructions that aren't available on your AMD CPU? Sorry.
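A quick way to test this guess is to compare the CPU's advertised feature flags with the instruction-set extensions that GCC documents for -march=core2 (which include SSSE3, an extension that K10-family chips such as the Phenom II are generally listed as lacking). The sketch below assumes a Linux /proc/cpuinfo layout; the function names and the `CORE2_FLAGS` set are illustrative choices, not part of any library.

```python
# Extensions GCC documents for -march=core2, written with the names Linux
# uses in /proc/cpuinfo ("pni" is how Linux reports SSE3). This set is an
# assumption for illustration and may not be exhaustive.
CORE2_FLAGS = {"mmx", "sse", "sse2", "pni", "ssse3", "cx16"}


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, required=CORE2_FLAGS):
    """Return the required flags that the CPU does not advertise, sorted."""
    return sorted(set(required) - parse_cpu_flags(cpuinfo_text))


def check_this_machine(path="/proc/cpuinfo"):
    """Read the local cpuinfo file (Linux only) and list missing flags."""
    with open(path) as f:
        return missing_flags(f.read())
```

If `check_this_machine()` returns a non-empty list on the machine that crashes, the wheel was probably built with instructions that CPU cannot execute.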

evdcush commented 5 years ago

@matus-pikuliak I'm not sure what your issue is, but it is likely due to @Anacletus building this wheel for an Intel Core 2 chip, whereas you have an AMD one. Even if the ISA flags are similar, if I'm not mistaken, -mtune and many other options are tied to -march.

I believe you would need to build tensorflow yourself on the AMD machine, or on another machine for -march=amdfam10 or -march=barcelona (and should look into specifying -opt=-mveclibabi="acml" at build) to have tensorflow work without issues. I've never built for AMD, but that's definitely what I'd try if I were in your situation.

These older builds are more important than "latest and greatest" GPU-optimized builds for current architectures, since vanilla TF does not support older chips at all anymore, so I'm sure other peeps would appreciate older AMD builds as well :+1:

JHorcasitas commented 5 years ago

Thank you! It worked perfectly!