igul222 / improved_wgan_training

Code for reproducing experiments in "Improved Training of Wasserstein GANs"
MIT License

CPU BiasOp only supports NHWC. #11

Open rafaelvalle opened 7 years ago

rafaelvalle commented 7 years ago

Is there a solution for the error below?

E tensorflow/core/common_runtime/executor.cc:594] Executor failed to create kernel. Invalid argument: CPU BiasOp only supports NHWC.

rafaelvalle commented 7 years ago

This can be fixed by changing the occurrences below from NCHW to the CPU-compatible NHWC.

./tflib/ops/batchnorm.py:            return tf.nn.fused_batch_norm(inputs, scale, offset, epsilon=1e-5, data_format='NCHW')
./tflib/ops/batchnorm.py:            #     data_format='NCHW'
Binary file ./tflib/ops/batchnorm.pyc matches
./tflib/ops/conv1d.py:            data_format='NCHW'
./tflib/ops/conv1d.py:            result = tf.nn.bias_add(result, _biases, data_format='NCHW')
./tflib/ops/conv2d.py:            data_format='NCHW'
./tflib/ops/conv2d.py:            result = tf.nn.bias_add(result, _biases, data_format='NCHW')
Binary file ./tflib/ops/conv2d.pyc matches
./tflib/ops/deconv2d.py:        inputs = tf.transpose(inputs, [0,2,3,1], name='NCHW_to_NHWC')
./tflib/ops/deconv2d.py:        result = tf.transpose(result, [0,3,1,2], name='NHWC_to_NCHW')
Binary file ./tflib/ops/deconv2d.pyc matches
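
For readers unsure what the two layouts mean: NCHW orders a 4-D tensor as (batch, channels, height, width), while NHWC orders it as (batch, height, width, channels). The sketch below is plain Python, not code from this repo. `permute_shape` is a hypothetical helper that applies a `tf.transpose`-style permutation to a shape, and the example shape is made up. It illustrates that the two permutations used in deconv2d.py above undo each other.

```python
def permute_shape(shape, perm):
    """Return the shape a tensor would have after a transpose with `perm`.

    Output axis i takes its size from input axis perm[i], which is the
    convention tf.transpose and numpy.transpose use.
    """
    if sorted(perm) != list(range(len(shape))):
        raise ValueError("perm must be a permutation of the axis indices")
    return [shape[axis] for axis in perm]


NCHW_TO_NHWC = [0, 2, 3, 1]
NHWC_TO_NCHW = [0, 3, 1, 2]

# Made-up example: batch of 64, 3 channels, 28 x 32 images.
nchw = [64, 3, 28, 32]
nhwc = permute_shape(nchw, NCHW_TO_NHWC)
print(nhwc)  # [64, 28, 32, 3]
print(permute_shape(nhwc, NHWC_TO_NCHW))  # [64, 3, 28, 32]
```

Changing only the data_format string tells an op how to interpret the axes; it does not move any data, so tensors built in one layout may also need a transpose like the ones above.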

igul222 commented 7 years ago

It seems like you're trying to run the model without a GPU. Is there any reason you want to do this? I think it would be too slow to be practical on a CPU.

rafaelvalle commented 7 years ago

I need to run the MNIST experiments on this repo and can't do it on my GPU because it's busy running some other experiments. It might also be that people without GPUs want to run the toy example or MNIST...

zackchase commented 7 years ago

Thanks for creating this repo. I agree with rafaelvalle - it seems unreasonable that the code should break entirely if not using GPU...

yanxiang007 commented 7 years ago

Totally agree. This would be extremely helpful for people like me who can't afford a GPU and want to run this code as a toy example...

yanxiang007 commented 7 years ago

However, to make this code run in CPU mode, I have to make extra changes beyond those mentioned by rafaelvalle. For example, in conv1d.py, I have to change result = tf.expand_dims(result, 3) to result = tf.expand_dims(result, 1), since the data format is changed from 'NCHW' to 'NHWC'. As I am pretty new to DL and TF, I am struggling to understand the data structure, and the changes I made to get this code running are pretty ugly. Anyway, I am asking my boss to buy a GPU so that I can get rid of these headaches...
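
The axis change described above can be followed by looking at shapes alone. This is a sketch in plain Python, not the repo's code: `expand_dims_shape` is a hypothetical stand-in that mimics what `tf.expand_dims` does to a shape, and the sizes (batch 64, 512 channels, length 32) are assumed for illustration. With NCHW the channel axis is axis 1; with NHWC it is the last axis, so the size-1 axis has to be inserted where it leaves the channels in the expected position.

```python
def expand_dims_shape(shape, axis):
    """Shape after inserting a size-1 axis at `axis` (non-negative only)."""
    if not 0 <= axis <= len(shape):
        raise ValueError("axis out of range")
    return shape[:axis] + [1] + shape[axis:]


# Channels-first 1-D conv output (assumed sizes): [batch, channels, length].
ncw = [64, 512, 32]
nchw = expand_dims_shape(ncw, 3)
print(nchw)  # [64, 512, 32, 1]; channels sit at axis 1, as NCHW expects

# Channels-last 1-D conv output (assumed sizes): [batch, length, channels].
nwc = [64, 32, 512]
nhwc = expand_dims_shape(nwc, 1)
print(nhwc)  # [64, 1, 32, 512]; channels sit at the last axis, as NHWC expects
```

Keeping axis 3 with the channels-last input would give [64, 32, 512, 1], where the last axis is no longer the channel axis.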

briland commented 7 years ago

To run the MNIST experiments on CPU in wgan-gp mode, besides making the changes suggested by @rafaelvalle, it is also necessary to make the following changes:

georgiazhang commented 7 years ago

Hi, I'm running gan_language.py. After changing all the NCHW to the CPU-compatible NHWC, I still get the following problem:

ValueError: Dimensions must be equal, but are 32 and 512 for 'Generator.1.1/conv1d/Conv2D' (op: 'Conv2D') with input shapes: [64,1,512,32], [1,5,512,512].

Can anyone tell me how to fix the problem? Thank you very much!
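
One way to read the error above: a TensorFlow Conv2D filter has shape [filter_height, filter_width, in_channels, out_channels], and under NHWC the input's channel count is its last dimension. The sketch below is plain Python with a hypothetical helper, `conv2d_channel_mismatch`, that only compares those two numbers and does not call TensorFlow. It suggests, as an assumption rather than a confirmed diagnosis, that the input is still laid out channels-first, so the 32 is really the sequence length.

```python
def conv2d_channel_mismatch(input_shape, filter_shape, data_format="NHWC"):
    """Return (input_channels, filter_in_channels) if they differ, else None.

    filter_shape follows TensorFlow's
    [filter_height, filter_width, in_channels, out_channels] order.
    """
    if data_format == "NHWC":
        channel_axis = 3
    elif data_format == "NCHW":
        channel_axis = 1
    else:
        raise ValueError("unknown data_format: " + data_format)
    input_channels = input_shape[channel_axis]
    filter_in_channels = filter_shape[2]
    if input_channels != filter_in_channels:
        return (input_channels, filter_in_channels)
    return None


filter_shape = [1, 5, 512, 512]

# Shapes from the error message: NHWC reads 32 as the channel count.
print(conv2d_channel_mismatch([64, 1, 512, 32], filter_shape))  # (32, 512)

# Swapping the last two axes puts the 512 channels last.
print(conv2d_channel_mismatch([64, 1, 32, 512], filter_shape))  # None
```

If that reading is right, the input would need its last two axes swapped (or the upstream transposes removed) before the convolution.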

MLEnthusiast commented 7 years ago

@georgiazhang I am also facing a similar problem.

@igul222 , how can we mitigate this while running on CPU?

MLEnthusiast commented 7 years ago

@rafaelvalle, how long does each iteration take for MNIST training on CPU? For me it takes about 12 s. Is that too long for one mini-batch?

kaiyu-tang commented 7 years ago

@georgiazhang I am facing the same problem. Have you fixed this issue?

Samt7 commented 6 years ago

@kaiyu-tang @MLEnthusiast @georgiazhang I am facing the same problem. Have you fixed this issue?

nickyoungforu commented 6 years ago

@kaiyu-tang @MLEnthusiast @georgiazhang @Samt7 I am facing the same problem. Have you fixed this issue?

1213999170 commented 6 years ago

@kaiyu-tang @MLEnthusiast @georgiazhang @Samt7 @nickyoungforu I am facing the same problem. Have you fixed this issue?

nickyoungforu commented 6 years ago

@1213999170 I modified the tensor shapes after changing to data_format=NHWC:

./tflib/ops/conv1d.py: 104 add result = tf.transpose(result, [0, 1, 3, 2])
./gan_language.py: 71 # output = tf.transpose(output, [0, 2, 1])
./gan_language.py: 76 # output = tf.transpose(inputs, [0,2,1])

tracy20180426 commented 5 years ago

Conv2DCustomBackpropInputOp only supports NHWC.

tracy20180426 commented 5 years ago

[image] I want to know how to solve this problem. Can you tell me?