microsoft / CNTK

Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit
https://docs.microsoft.com/cognitive-toolkit/

CNTK has issues supporting `GRU(unroll=True)` #3800

Open shiningrain opened 4 years ago

shiningrain commented 4 years ago

System information

Describe the current behavior

When I try to build a model containing a GRU layer with `unroll=True` on CNTK, it raises the following error in `cntk\ops\__init__.py`, line 2743. For the full set of GRU parameters, refer to the code snippet below.

`RuntimeError: Function 'Slice: Output('Plus237_Output_0', [#], [2 x 6]) -> Unknown': Slice operation index range [2,4), interpreted as [2,4), is invalid for input 'Output('Plus237_Output_0', [#], [2 x 6])' shape '[2 x 6]'.`

It seems that CNTK does not handle GRU well when `unroll=True`, but I could not find any warning or description in the documentation of which parameters lead to this kind of problem. I wonder what happens inside CNTK when it processes GRU with `unroll=True`. This unexpected failure may also confuse other CNTK users.
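The numbers in the error message can be checked by hand. The sketch below is plain Python, not CNTK or Keras code, and `slice_fits` is a hypothetical helper written only for this illustration. It assumes the `[2 x 6]` shape comes from `units=2` (a Keras GRU kernel has `3 * units = 6` columns) and tests the reported range `[2,4)` against each axis. Which axis CNTK actually sliced is not stated in the thread, so the reading that the slice landed on the axis of size 2 is only a guess.

```python
def slice_fits(shape, axis, begin, end):
    """Return True if the half-open range [begin, end) fits inside shape[axis]."""
    return 0 <= begin < end <= shape[axis]


shape = (2, 6)      # shape reported in the error message
begin, end = 2, 4   # index range reported in the error message

# The range is out of bounds on the axis of size 2 ...
fits_axis_0 = slice_fits(shape, 0, begin, end)
# ... but would be valid on the axis of size 6.
fits_axis_1 = slice_fits(shape, 1, begin, end)

print(fits_axis_0, fits_axis_1)
```

If that guess is right, the range itself is reasonable for a 6-column gate matrix and the failure is about which axis it is applied to.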

Code to reproduce the issue

```python
import numpy as np
import keras.layers as L
from keras.engine import Model, Input

# Using CNTK as the Keras backend.
# The default input dtype is float32.

kwargs = {
    'units': 2,
    'dropout': 0.20430343923336958,
    'recurrent_dropout': 0.7597739154146002,
    'implementation': 2,
    'reset_after': True,
    'use_bias': True,
    'return_sequences': False,
    'return_state': False,
    'go_backwards': False,
    'stateful': True,
    'unroll': True
}

# Random input with shape (batch, timesteps, features).
input_data = 10 * np.random.random((2, 10, 8))
input_data = input_data.astype('float32')
layer = L.recurrent.GRU(**kwargs)
x = Input(batch_shape=input_data.shape)
y = layer(x)
bk_model = Model(x, y)
print('finish')
```

delzac commented 4 years ago

The unroll feature comes from Keras. CNTK doesn't have any unroll feature.

shiningrain commented 4 years ago

> The unroll feature comes from Keras. CNTK doesn't have any unroll feature.

Thank you very much for your reply! If `unroll` is indeed unrelated to CNTK, then this issue should be attributed to a flaw in Keras's implementation of this option. I have submitted a similar issue to Keras. Thank you again!
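For readers who hit the same error, one thing to try is building the layer with `unroll=False`. The thread does not confirm that this avoids the failure, so treat it as an assumption. The sketch below only prepares the arguments and does not import Keras; `without_unroll` is a hypothetical helper, and the dictionary is a shortened copy of the one in the report.

```python
def without_unroll(layer_kwargs):
    """Return a copy of the layer arguments with unrolling switched off."""
    return dict(layer_kwargs, unroll=False)


# Shortened copy of the arguments from the report.
kwargs = {
    'units': 2,
    'implementation': 2,
    'reset_after': True,
    'stateful': True,
    'unroll': True,
}

safe_kwargs = without_unroll(kwargs)
print(safe_kwargs['unroll'], kwargs['unroll'])
```

The original dictionary is left untouched, so both variants can be passed to `GRU(**...)` to compare their behavior.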