keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

Some errors when running variational_autoencoder_deconv example. #6195

Closed. Imorton-zd closed this issue 7 years ago.

Imorton-zd commented 7 years ago

The variational_autoencoder_deconv example is from Keras version 1.2.2. Running it produces the following error:

RuntimeError: GpuDnnConvGradI: error getting worksize: CUDNN_STATUS_BAD_PARAM
Apply node that caused the error: GpuDnnConvGradI{algo='none', inplace=True}(GpuContiguous.0, GpuContiguous.0, GpuAllocEmpty.0, GpuDnnConvDesc{border_mode='half', subsample=(1, 1), conv_mode='conv', precision='float32'}.0, Constant{1.0}, Constant{0.0})
Toposort index: 356
Inputs types: [CudaNdarrayType(float32, 4D), CudaNdarrayType(float32, 4D), CudaNdarrayType(float32, 4D), <theano.gof.type.CDataType object at 0x000000018E212B00>, Scalar(float32), Scalar(float32)]
Inputs shapes: [(64, 64, 3, 3), (100, 64, 14, 14), (100, 64, 14, 64), 'No shapes', (), ()]
Inputs strides: [(576, 9, 3, 1), (12544, 196, 14, 1), (57344, 896, 64, 1), 'No strides', (), ()]
Inputs values: ['not shown', 'not shown', 'not shown', <PyCObject object at 0x00000001A9FB39B8>, 1.0, 0.0]
Inputs name: ('kernel', 'grad', 'output', 'descriptor', 'alpha', 'beta')

Outputs clients: [[GpuDimShuffle{0,2,3,1}(GpuDnnConvGradI{algo='none', inplace=True}.0)]]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

Please give me some suggestions. Also, if possible, are there any code snippets for semi-supervised learning with a VAE? Thanks.
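
A note on the "Inputs shapes" line in the trace above: the 'grad' tensor is (100, 64, 14, 14) while the 'output' buffer is (100, 64, 14, 64), so the two disagree in the last dimension. One possible reading, which is an assumption and is not confirmed anywhere in this thread, is that the deconvolution's output shape was declared in channels-last order, (100, 14, 14, 64), while the backend read positions 2 and 3 of it as rows and columns. The sketch below only reproduces that arithmetic; `buffer_shape` and both declared shapes are hypothetical and are not Keras or Theano API.

```python
def buffer_shape(batch, kernel_shape, declared_output_shape):
    """Hypothetical helper: mimic a channels-first backend that takes the
    channel count from the kernel and the spatial size from positions 2 and 3
    of the declared output shape."""
    channels = kernel_shape[1]
    rows, cols = declared_output_shape[2], declared_output_shape[3]
    return (batch, channels, rows, cols)


kernel = (64, 64, 3, 3)    # 'kernel' shape reported in the trace
grad = (100, 64, 14, 14)   # 'grad' shape reported in the trace

# Assumed declarations of the same output in the two layouts.
declared_channels_last = (100, 14, 14, 64)
declared_channels_first = (100, 64, 14, 14)

mismatched = buffer_shape(100, kernel, declared_channels_last)
consistent = buffer_shape(100, kernel, declared_channels_first)

print(mismatched)   # (100, 64, 14, 64), the 'output' shape in the trace
print(consistent)   # (100, 64, 14, 14), which agrees with 'grad'
```

If this reading applies, checking that the example's image data format matches the backend in use would be the first thing to try.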

fchollet commented 7 years ago

You should upgrade to version 2.0.2 (latest). The example works fine.
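
Before re-running the example it may help to confirm which Keras version is actually installed in the environment the script runs in. The following is a minimal sketch that assumes Python 3.8 or later (for `importlib.metadata`), which is newer than the Python 2.7 setup shown in this thread; `version_tuple` and `installed_version` are hypothetical helper names.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text):
    """Return the first three numeric components of a version string."""
    return tuple(int(n) for n in re.findall(r"\d+", text)[:3])


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


REQUIRED = (2, 0, 2)  # the version recommended in the comment above

found = installed_version("keras")
if found is None:
    print("keras is not installed in this environment")
elif version_tuple(found) < REQUIRED:
    print("keras", found, "is older than 2.0.2")
else:
    print("keras", found, "is at least 2.0.2")
```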

Imorton-zd commented 7 years ago

I have upgraded to version 2.0.2 (latest), but now I get a different error:

WARNING:theano.gof.compilelock:Overriding existing lock by dead process '6156' (I am process '13044')
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input_1 (InputLayer)             (100, 28, 28, 1)      0                                            
____________________________________________________________________________________________________
conv2d_1 (Conv2D)                (100, 28, 28, 1)      5                                            
____________________________________________________________________________________________________
conv2d_2 (Conv2D)                (100, 14, 14, 64)     320                                          
____________________________________________________________________________________________________
conv2d_3 (Conv2D)                (100, 14, 14, 64)     36928                                        
____________________________________________________________________________________________________
conv2d_4 (Conv2D)                (100, 14, 14, 64)     36928                                        
____________________________________________________________________________________________________
flatten_1 (Flatten)              (100, 12544)          0                                            
____________________________________________________________________________________________________
dense_1 (Dense)                  (100, 128)            1605760                                      
____________________________________________________________________________________________________
dense_2 (Dense)                  (100, 2)              258                                          
____________________________________________________________________________________________________
dense_3 (Dense)                  (100, 2)              258                                          
____________________________________________________________________________________________________
lambda_1 (Lambda)                (100, 2)              0                                            
____________________________________________________________________________________________________
dense_4 (Dense)                  (100, 128)            384                                          
____________________________________________________________________________________________________
dense_5 (Dense)                  (100, 12544)          1618176                                      
____________________________________________________________________________________________________
reshape_1 (Reshape)              (100, 14, 14, 64)     0                                            
____________________________________________________________________________________________________
conv2d_transpose_1 (Conv2DTransp (100, 14, 14, 64)     36928                                        
____________________________________________________________________________________________________
conv2d_transpose_2 (Conv2DTransp (100, 14, 14, 64)     36928                                        
____________________________________________________________________________________________________
conv2d_transpose_3 (Conv2DTransp (100, 29, 29, 64)     36928                                        
____________________________________________________________________________________________________
conv2d_5 (Conv2D)                (100, 28, 28, 1)      257                                          
====================================================================================================
Total params: 3,410,058
Trainable params: 3,410,058
Non-trainable params: 0
____________________________________________________________________________________________________
('x_train.shape:', (60000L, 28L, 28L, 1L))
Using gpu device 0: GeForce GTX 750 (CNMeM is enabled with initial size: 80.0% of memory, cuDNN 5005)
Traceback (most recent call last):

  File "<ipython-input-1-80080090c6dc>", line 1, in <module>
    runfile('E:/DL EX/K/keras-master-2017-4-7/examples/variational_autoencoder_deconv.py', wdir='E:/DL EX/K/keras-master-2017-4-7/examples')

  File "C:\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 699, in runfile
    execfile(filename, namespace)

  File "C:\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 74, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)

  File "E:/DL EX/K/keras-master-2017-4-7/examples/variational_autoencoder_deconv.py", line 136, in <module>
    validation_data=(x_test, x_test))

  File "C:\Anaconda2\lib\site-packages\keras\engine\training.py", line 1427, in fit
    self._make_test_function()

  File "C:\Anaconda2\lib\site-packages\keras\engine\training.py", line 1022, in _make_test_function
    **self._function_kwargs)

  File "C:\Anaconda2\lib\site-packages\keras\backend\theano_backend.py", line 1132, in function
    return Function(inputs, outputs, updates=updates, **kwargs)

  File "C:\Anaconda2\lib\site-packages\keras\backend\theano_backend.py", line 1118, in __init__
    **kwargs)

  File "C:\Anaconda2\lib\site-packages\theano\compile\function.py", line 326, in function
    output_keys=output_keys)

  File "C:\Anaconda2\lib\site-packages\theano\compile\pfunc.py", line 486, in pfunc
    output_keys=output_keys)

  File "C:\Anaconda2\lib\site-packages\theano\compile\function_module.py", line 1808, in orig_function
    defaults)

  File "C:\Anaconda2\lib\site-packages\theano\compile\function_module.py", line 1674, in create
    input_storage=input_storage_lists, storage_map=storage_map)

  File "C:\Anaconda2\lib\site-packages\theano\gof\link.py", line 699, in make_thunk
    storage_map=storage_map)[:3]

  File "C:\Anaconda2\lib\site-packages\theano\gof\vm.py", line 1047, in make_all
    impl=impl))

  File "C:\Anaconda2\lib\site-packages\theano\gof\op.py", line 935, in make_thunk
    no_recycling)

  File "C:\Anaconda2\lib\site-packages\theano\gof\op.py", line 839, in make_c_thunk
    output_storage=node_output_storage)

  File "C:\Anaconda2\lib\site-packages\theano\gof\cc.py", line 1190, in make_thunk
    keep_lock=keep_lock)

  File "C:\Anaconda2\lib\site-packages\theano\gof\cc.py", line 1131, in __compile__
    keep_lock=keep_lock)

  File "C:\Anaconda2\lib\site-packages\theano\gof\cc.py", line 1586, in cthunk_factory
    key=key, lnk=self, keep_lock=keep_lock)

  File "C:\Anaconda2\lib\site-packages\theano\gof\cmodule.py", line 1159, in module_from_key
    module = lnk.compile_cmodule(location)

  File "C:\Anaconda2\lib\site-packages\theano\gof\cc.py", line 1489, in compile_cmodule
    preargs=preargs)

  File "C:\Anaconda2\lib\site-packages\theano\sandbox\cuda\nvcc_compiler.py", line 417, in compile_str
    return dlimport(lib_filename)

  File "C:\Anaconda2\lib\site-packages\theano\gof\cmodule.py", line 302, in dlimport
    rval = __import__(module_name, {}, {}, [module_name])

RuntimeError: ('The following error happened while compiling the node', GpuDnnConv{algo='small', inplace=True}(GpuContiguous.0, GpuContiguous.0, GpuAllocEmpty.0, GpuDnnConvDesc{border_mode='half', subsample=(1, 1), conv_mode='conv', precision='float32'}.0, Constant{1.0}, Constant{0.0}), '\n', 'could not create cuDNN handle: CUDNN_STATUS_NOT_INITIALIZED', "[GpuDnnConv{algo='small', inplace=True}(<CudaNdarrayType(float32, 4D)>, <CudaNdarrayType(float32, 4D)>, <CudaNdarrayType(float32, 4D)>, <CDataType{cudnnConvolutionDescriptor_t}>, Constant{1.0}, Constant{0.0})]")
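
The HINT lines in the first trace mention the Theano flags 'optimizer=fast_compile' and 'exception_verbosity=high'. A minimal sketch of setting them from Python follows. The `lib.cnmem` entry is an extra assumption: the log above reports CNMeM reserving 80.0% of GPU memory, and lowering that fraction is one thing that could be worth ruling out for the cuDNN handle error, though nothing in this thread confirms it as the cause. The value 0.5 is arbitrary.

```python
import os

# Flags as (name, value) pairs; the first two come from the HINT lines,
# the third is an assumption to experiment with, not a confirmed fix.
flags = [
    ("optimizer", "fast_compile"),
    ("exception_verbosity", "high"),
    ("lib.cnmem", "0.5"),
]

# Theano reads THEANO_FLAGS as a comma-separated list of key=value pairs.
theano_flags = ",".join("{}={}".format(name, value) for name, value in flags)

# This must happen before `import theano` (or `import keras` with the
# Theano backend), because the flags are read at import time.
os.environ["THEANO_FLAGS"] = theano_flags

print(theano_flags)  # optimizer=fast_compile,exception_verbosity=high,lib.cnmem=0.5
```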

dbl001 commented 7 years ago

I get a different error on the TensorFlow backend with variational_autoencoder_deconv.py:

/Users/davidlaxer/anaconda/lib/python2.7/site-packages/keras/engine/topology.py:1519: UserWarning: Model inputs must come from a Keras Input layer, they cannot be the output of a previous non-Input layer. Here, a tensor specified as input to "model_2" was not an Input tensor, it was generated by layer custom_variational_layer_2.
Note that input tensors are instantiated via `tensor = Input(shape)`.
The tensor that caused the issue was: input_2:0
  str(x.name))
/Users/davidlaxer/anaconda/lib/python2.7/site-packages/ipykernel/__main__.py:134: UserWarning: Output "custom_variational_layer_2" missing from loss dictionary. We assume this was done on purpose, and we will not be expecting any data to be passed to "custom_variational_layer_2" during training.
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input_2 (InputLayer)             (None, 28, 28, 1)     0                                            
____________________________________________________________________________________________________
conv2d_1 (Conv2D)                (None, 28, 28, 1)     5                                            
____________________________________________________________________________________________________
conv2d_2 (Conv2D)                (None, 14, 14, 64)    320                                          
____________________________________________________________________________________________________
conv2d_3 (Conv2D)                (None, 14, 14, 64)    36928                                        
____________________________________________________________________________________________________
conv2d_4 (Conv2D)                (None, 14, 14, 64)    36928                                        
____________________________________________________________________________________________________
flatten_1 (Flatten)              (None, 12544)         0                                            
____________________________________________________________________________________________________
dense_6 (Dense)                  (None, 128)           1605760                                      
____________________________________________________________________________________________________
dense_7 (Dense)                  (None, 2)             258                                          
____________________________________________________________________________________________________
dense_8 (Dense)                  (None, 2)             258                                          
____________________________________________________________________________________________________
lambda_2 (Lambda)                (None, 2)             0                                            
____________________________________________________________________________________________________
dense_9 (Dense)                  (None, 128)           384                                          
____________________________________________________________________________________________________
dense_10 (Dense)                 (None, 12544)         1618176                                      
____________________________________________________________________________________________________
reshape_1 (Reshape)              (None, 14, 14, 64)    0                                            
____________________________________________________________________________________________________
conv2d_transpose_1 (Conv2DTransp (None, 14, 14, 64)    36928                                        
____________________________________________________________________________________________________
conv2d_transpose_2 (Conv2DTransp (None, 14, 14, 64)    36928                                        
____________________________________________________________________________________________________
conv2d_transpose_3 (Conv2DTransp (None, 29, 29, 64)    36928                                        
____________________________________________________________________________________________________
conv2d_5 (Conv2D)                (None, 28, 28, 1)     257                                          
____________________________________________________________________________________________________
custom_variational_layer_2 (Cust [(None, 28, 28, 1), ( 0                                            
====================================================================================================
Total params: 3,410,058
Trainable params: 3,410,058
Non-trainable params: 0
____________________________________________________________________________________________________
('x_train.shape:', (60000, 28, 28, 1))
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-f1e7cd023c75> in <module>()
    149         epochs=epochs,
    150         batch_size=batch_size,
--> 151         validation_data=(x_test, None))
    152 
    153 # build a model to project inputs on the latent space

/Users/davidlaxer/anaconda/lib/python2.7/site-packages/keras/engine/training.pyc in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, **kwargs)
   1484                               val_f=val_f, val_ins=val_ins, shuffle=shuffle,
   1485                               callback_metrics=callback_metrics,
-> 1486                               initial_epoch=initial_epoch)
   1487 
   1488     def evaluate(self, x, y, batch_size=32, verbose=1, sample_weight=None):

/Users/davidlaxer/anaconda/lib/python2.7/site-packages/keras/engine/training.pyc in _fit_loop(self, f, ins, out_labels, batch_size, epochs, verbose, callbacks, val_f, val_ins, shuffle, callback_metrics, initial_epoch)
   1139                 batch_logs['size'] = len(batch_ids)
   1140                 callbacks.on_batch_begin(batch_index, batch_logs)
-> 1141                 outs = f(ins_batch)
   1142                 if not isinstance(outs, list):
   1143                     outs = [outs]

/Users/davidlaxer/anaconda/lib/python2.7/site-packages/keras/backend/tensorflow_backend.pyc in __call__(self, inputs)
   2098                                           np.expand_dims(sparse_coo.col, 1)), 1)
   2099                 value = (indices, sparse_coo.data, sparse_coo.shape)
-> 2100             feed_dict[tensor] = value
   2101         session = get_session()
   2102         updated = session.run(self.outputs + [self.updates_op],

TypeError: unhashable type: 'list'
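
The failing line in the traceback is `feed_dict[tensor] = value`, so the message means that the dictionary key, `tensor`, was a list at that point. The sketch below reproduces only that Python-level mechanism with plain strings standing in for tensors. Why a list ends up among the model inputs is not established in this thread, and `flatten` is a hypothetical helper, not part of Keras.

```python
def flatten(items):
    """Hypothetical helper: expand nested lists into one flat list."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


# Stand-ins for model inputs: the second entry is a list where a single
# tensor was expected.
inputs = ["input_2:0", ["extra_a:0", "extra_b:0"]]

feed_dict = {}
message = None
try:
    for tensor in inputs:
        feed_dict[tensor] = 0   # a list cannot be used as a dict key
except TypeError as exc:
    message = str(exc)

print(message)  # the text includes: unhashable type: 'list'

# With the nesting removed, every key is hashable.
flat_feed = {tensor: 0 for tensor in flatten(inputs)}
print(sorted(flat_feed))
```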

krzakala commented 6 years ago

Same error on the TensorFlow backend with Keras 2.0.2.