hughperkins / tf-coriander

OpenCL 1.2 implementation for Tensorflow
Apache License 2.0

`tf.truncated_normal` fails to run #44

Open ghost opened 7 years ago

ghost commented 7 years ago

I'm playing with MIDI-generative LSTMs, in code that's supposedly built for TF 0.10.x. I had to make some small modifications to get it to run, but I don't think they have any bearing on the error below:

(Again, like #42, this isn't a high-priority bug for me)

(tf-cl) cathal@thinkum:~/Downloads/MusicGenerator$ python3 main.py --dataset_tag satie --model_tag satie
Welcome to DeepMusic v0.1 !

TensorFlow detected: v0.11.0rc0

Current parameters:
glob_step: 0
keep_all: False
dataset_tag: satie
sample_length: 40
hidden_size: 512
num_layers: 2
target_weights: linear
scheduled_sampling: none
batch_size: 64
save_every: 1000
ratio_dataset: 0.9
testing_curve: 10
batch_builder: relative
learning_rate: cst
enco_cell: identity
deco_cell: lstm
loop_processing: sample_softmax

Restoring dataset from /home/cathal/Downloads/MusicGenerator/data/samples/satie-relative.pkl...
Loaded: 18 songs (16 train/2 test)
Model creation...
OpenCL platform: AMD Accelerated Parallel Processing
OpenCL device: gfx803
I tensorflow/core/common_runtime/gpu/gpu_device.cc:989] Found device 0 with properties: 
name: gfx803
major: -1 minor: -1 memoryClockRate (GHz) 1266
pciBusID 0000.0000
Total memory: 8.00GiB
Free memory: 6.00GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:877] cannot enable peer access from device ordinal 0 to device ordinal 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1011] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1021] 0:   N 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1083] Creating TensorFlow device (/gpu:0) -> (device: 0, name: gfx803, pci bus id: 0000.0000)
cl_driver DeviceAllocate 6120328192
Initialize variables...
fabs is called, but not defined
This is probalby a bug in Coriander. Please file an issue at https://github.com/hughperkins/coriander/issues/new
basicblockdumper.runGeneration got exception whilst processing:
  %373 = call double @fabs(double %372) #8

generateOpenCL failed to generate opencl sourcecode
kernel name orig=_ZN10tensorflow7functor28FillPhiloxRandomKernelLaunchINS_6random27TruncatedNormalDistributionINS2_19SingleSampleAdapterINS2_12PhiloxRandomEEEfEEEEvS5_PNT_17ResultElementTypeExS8_
kernel name short=_ZN10tensorflow7func
kernel name unique=_ZN10tensorflow7functor28FillPhiloxRandomKernelLaunchINS_6random27TruncatedNormalDistributionINS2_19SingleSampleAdapterINS2_12PhiloxRandomEEEfEEEEvS5_PNT_17ResultElementTypeExS8__0_1_2
writing ll to /tmp/failed-kernel.ll
caught runtime error fabs is called, but not defined => cannot continue.  Sorry :-(
terminate called after throwing an instance of 'std::runtime_error'
  what():  fabs is called, but not defined => cannot continue.  Sorry :-(
Aborted (core dumped)
hughperkins commented 7 years ago

Attempted to fix this in https://github.com/hughperkins/coriander/commit/5798bc437190f1b95e22b8393f512f64aa8d22c4

Can you clone the latest coriander (note: not tf-coriander, just the underlying compiler; it's much smaller and easy to build), then build and install it please?

# clone coriander
cd ~
mkdir git
cd git
git clone https://github.com/hughperkins/coriander
cd coriander
mkdir build
cd build
cmake ..
# press 'c', then 'g' (maybe 'e' too)
make -j 8
python -c 'import tensorflow; import sys; print("\nPlease copy libcocl.so into:\n\n " + sys.modules["tensorflow"].__path__[0] + "/third_party/coriander\n")'
# ^^^ check the instructions from this command seem reasonable,
# then do this last bit, to install the latest libcocl.so, into your virtualenv

hughperkins commented 7 years ago

oh. seems like maybe this just delays the pain? still get an error later on, right?

<program source>:403:12: error: assigning 'struct class_tensorflow__random__PhiloxRandom *' to '__global struct class_tensorflow__random__PhiloxRandom *' changes address space of pointer
    v64[0] = v39;
           ^ ~~~
hughperkins commented 7 years ago

hmmm, oh man, doubly-indirected pointer to struct :-O

__global struct class_tensorflow__random__PhiloxRandom** v64;

I think that can wait till after split is solved. Doubly-indirected pointer to struct sounds quite hard...

ghost commented 7 years ago

Sure! Thanks for the swift diagnosis.

If I pulled coriander to help debug, would I need CUDA to test things? I do not have CUDA... I have HIP though :P

hughperkins commented 7 years ago

No need for anything so drastic ;-). I have a test case I can use to reproduce this issue:

from __future__ import print_function
import tensorflow as tf
import numpy as np
import pytest
import sys
from tensorflow.python.ops import array_ops

seed = 123

def _test_random_func(func_name, shape):
    print('func_name', func_name)
    func = eval(func_name)
    with tf.Graph().as_default():
        with tf.device('/cpu:0'):
            W_t = tf.Variable(func(shape, seed=seed))

            with tf.Session(config=tf.ConfigProto(log_device_placement=False)) as sess:
                sess.run(tf.initialize_all_variables())
                W_cpu = sess.run(W_t)
        with tf.device('/gpu:0'):
            W_t = tf.Variable(func(shape, seed=seed))

            with tf.Session(config=tf.ConfigProto(log_device_placement=False)) as sess:
                sess.run(tf.initialize_all_variables())
                W_gpu = sess.run(W_t)
            if np.prod(np.array(shape)) < 20:
                print('W_cpu', W_cpu)
                print('W_gpu', W_gpu)
            else:
                print('W_cpu.reshape(-1)[:20]', W_cpu.reshape(-1)[:20])
                print('W_gpu.reshape(-1)[:20]', W_gpu.reshape(-1)[:20])
            assert np.all(np.abs(W_cpu - W_gpu) < 1e-4)

_test_random_func('tf.truncated_normal', [3, 4])
hughperkins commented 7 years ago

(test added here: https://github.com/hughperkins/tf-coriander/commit/c157c9bfdfa928ad2de0cd5154c2f54c90394d33 currently marked skip, since it causes an abort)

ghost commented 7 years ago

So, for the coriander-uninitiated, how would I test this for you? :) Download Coriander from git, then...?

hughperkins commented 7 years ago

You don't need to do anything. It's a missing feature in Coriander. You can't do anything about this currently, except not use tf.truncated_normal.

ghost commented 7 years ago

Ha, OK, no problem. In the meantime I've moved to another project, which uses RBMs, and it's working so far. :)

hughperkins commented 7 years ago

cool :-)

The other thing you could do is use e.g. tf.random_normal instead of tf.truncated_normal. You could truncate it yourself using tf.maximum and tf.minimum, I guess?
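
A minimal sketch of that, assuming the TF 0.x API used elsewhere in this thread (the names here are illustrative only); note that clipping piles probability mass at the boundaries rather than resampling, so it only approximates tf.truncated_normal:

from __future__ import print_function
import tensorflow as tf

# Approximate tf.truncated_normal by clamping a normal sample
# to +/- 2 standard deviations (clips rather than resamples).
seed = 123
stddev = 1.0
shape = [3, 4]

w = tf.random_normal(shape, mean=0.0, stddev=stddev, seed=seed)
w_clamped = tf.minimum(tf.maximum(w, -2.0 * stddev), 2.0 * stddev)
W_t = tf.Variable(w_clamped)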

ghost commented 7 years ago

V. true! I have no problems monkey-patching stuff to get results. Feel free to close this one for now if you'd like to shelve it; it's not a priority for me as I've said, just a bug report for the sake of keeping track of bugs. :)
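
A hypothetical monkey-patch along those lines (the helper name _clipped_normal is illustrative, and the clipped distribution only approximates a true truncated normal):

import tensorflow as tf

def _clipped_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None):
    # Mirrors the tf.truncated_normal signature from the 0.x API,
    # but clamps a normal sample instead of resampling out-of-range values.
    w = tf.random_normal(shape, mean=mean, stddev=stddev, dtype=dtype, seed=seed, name=name)
    return tf.minimum(tf.maximum(w, mean - 2.0 * stddev), mean + 2.0 * stddev)

# Patch before the model is built so downstream code picks up the replacement.
tf.truncated_normal = _clipped_normal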

hughperkins commented 7 years ago

Let's leave it open, please. I intend to get to it sooner or later, and it means other people can see this bug exists too.