apple / tensorflow_macos

TensorFlow for macOS 11.0+ accelerated using Apple's ML Compute framework.

Keyword arguments "activation", "recurrent_activation" and "recurrent_dropout" for layers.GRU cause InvalidArgumentError during training #170

Closed atw1020 closed 3 years ago

atw1020 commented 3 years ago

Description

Initializing a GRU layer with a non-default value of "activation", "recurrent_activation" or "recurrent_dropout" causes a tf.errors.InvalidArgumentError during training.

Error

 Inputs to operation gradient_tape/model_1/gru_1/while/model_1/gru_1/while_grad/body/_174/gradient_tape/model_1/gru_1/while/gradients/AddN_1 of type AddN must have the same size and shape.  Input 0: [1,256] != input 1: []
     [[{{node gradient_tape/model_1/gru_1/while/model_1/gru_1/while_grad/body/_174/gradient_tape/model_1/gru_1/while/gradients/AddN_1}}]] [Op:__inference_train_function_4694]
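
To read the message: AddN sums several tensors element-wise and requires all of them to have the same shape. Here one gradient has shape [1,256] while the other is a scalar (shape []). The following is a minimal pure-Python sketch of that kind of check. The helper name check_addn_shapes is hypothetical, and this is not TensorFlow's actual implementation:

```python
def check_addn_shapes(shapes):
    """Return the common shape, or raise ValueError if any shape differs.

    Hypothetical illustration of a "same size and shape" check;
    not TensorFlow code.
    """
    first = tuple(shapes[0])
    for index, shape in enumerate(shapes[1:], start=1):
        if tuple(shape) != first:
            raise ValueError(
                f"Input 0: {list(first)} != input {index}: {list(shape)}"
            )
    return first


# Two gradients of matching shape pass the check.
check_addn_shapes([(1, 256), (1, 256)])

# A [1, 256] gradient paired with a scalar (empty shape) is rejected,
# which mirrors the mismatch reported in the error above.
try:
    check_addn_shapes([(1, 256), ()])
except ValueError as error:
    print(error)
```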

Workaround

If you run into this issue yourself, the only workaround I have found is to stick to the default activation, recurrent activation and recurrent dropout. Being limited to the default activation functions is a minor concern, but the inability to use recurrent dropout in GRUs is a big limitation.
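
The restriction can be checked before training. The following is a minimal sketch, assuming the defaults documented for tf.keras.layers.GRU (activation "tanh", recurrent_activation "sigmoid", recurrent_dropout 0.0). The helper name risky_gru_kwargs is hypothetical and not part of TensorFlow:

```python
# Keyword arguments this report found to trigger the error, mapped to the
# default values documented for tf.keras.layers.GRU.
GRU_SAFE_DEFAULTS = {
    "activation": "tanh",
    "recurrent_activation": "sigmoid",
    "recurrent_dropout": 0.0,
}


def risky_gru_kwargs(**kwargs):
    """Return the names of GRU keyword arguments that deviate from the
    defaults and, per this report, may trigger the InvalidArgumentError.

    Hypothetical helper for illustration; not part of TensorFlow.
    """
    return sorted(
        name
        for name, value in kwargs.items()
        if name in GRU_SAFE_DEFAULTS and value != GRU_SAFE_DEFAULTS[name]
    )


print(risky_gru_kwargs(activation="relu", dropout=0.2))  # ['activation']
print(risky_gru_kwargs(dropout=0.2))                     # []
```

An empty list means the layer only uses settings that trained successfully in the reproduction script in this report. Input dropout (dropout=...) is deliberately not flagged, because it worked there.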

Code to Reproduce

"""

Author: Arthur Wesley

"""

import tensorflow as tf

from tensorflow.keras import layers
from tensorflow.keras.models import Model

x = tf.random.uniform(shape=(1, 5, 512))
y = tf.random.uniform(shape=(1, 256))

# test GRUs

input_layer = layers.Input(shape=(None, 512))

RNNs = [layers.GRU(256), # works
        layers.GRU(256,
                   activation="relu"), # fails
        layers.GRU(256,
                   dropout=0.2), # works
        layers.GRU(256,
                   recurrent_activation="relu"), # fails
        layers.GRU(256,
                   recurrent_dropout=0.2)] # fails

        # layers.LSTM(256,
        #             activation="relu"), # gives a different issue (link)]

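for rnn_layer in RNNs:

    # Build a one-layer model around the GRU variant under test.
    output = rnn_layer(input_layer)

    model = Model(inputs=input_layer,
                  outputs=output)
    model.compile(optimizer="adam",
                  loss="mse")

    # A single training step is enough to trigger the gradient error.
    try:
        model.fit(x, y,
                  verbose=0)
        print(output.name, "succeeded")
    except tf.errors.InvalidArgumentError:
        print(output.name, "failed")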

Expected Erroneous Output

gru/PartitionedCall:0 succeeded
gru_1/strided_slice_3:0 failed
gru_2/PartitionedCall:0 succeeded
gru_3/strided_slice_3:0 failed
gru_4/strided_slice_3:0 failed

anna-tikhonova commented 3 years ago

Thank you very much for reporting this issue. We will investigate and report back.

atw1020 commented 3 years ago

After updating to the 0.1alpha-2 release I get a different error:

NotImplementedError: Cannot convert a symbolic Tensor (gru/strided_slice:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported

I can still replicate the issue on my 0.1alpha-1 release, though.

takgoya commented 3 years ago

> After updating to the 0.1alpha-2 release I get a different error:
>
> NotImplementedError: Cannot convert a symbolic Tensor (gru/strided_slice:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported
>
> I can still replicate the issue on my 0.1alpha-1 release, though.

I'm getting the same error when trying to use an LSTM layer (on the 0.1alpha-2 release).

NotImplementedError: Cannot convert a symbolic Tensor (bidirectional/forward_lstm/strided_slice:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported

Code to reproduce (from deeplearning.ai)

import tensorflow as tf

from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding, Bidirectional, LSTM

import numpy as np

tokenizer = Tokenizer()
data = "In the town of Athy one Jeremy Lanigan \n Battered away til he hadnt a pound. \nHis father died and made him a man again \n Left him a farm and ten acres of ground. \nHe gave a grand party for friends and relations \nWho didnt forget him when come to the wall, \nAnd if youll but listen Ill make your eyes glisten \nOf the rows and the ructions of Lanigans Ball."
corpus = data.lower().split("\n")
tokenizer.fit_on_texts(corpus)
total_words = len(tokenizer.word_index) + 1

print(tokenizer.word_index)
print(total_words)

input_sequences = []
for line in corpus:
    token_list = tokenizer.texts_to_sequences([line])[0]
    for i in range(1, len(token_list)):
        n_gram_sequence = token_list[:i+1]
        input_sequences.append(n_gram_sequence)

max_sequence_len = max([len(x) for x in input_sequences])
input_sequences = np.array(pad_sequences(input_sequences, maxlen=max_sequence_len, padding="pre"))

xs, labels = input_sequences[:, :-1], input_sequences[:, -1]
ys = tf.keras.utils.to_categorical(labels, num_classes=total_words)

model = Sequential()
model.add(Embedding(total_words, 64, input_length=max_sequence_len-1))
model.add(Bidirectional(LSTM(20)))
model.add(Dense(total_words, activation="softmax"))

model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
history = model.fit(xs, ys, epochs=500, verbose=1)

matt-audio commented 3 years ago

I also get the same error with just a simple GRU layer; with only Dense layers it works fine.

NotImplementedError: Cannot convert a symbolic Tensor (gru/strided_slice:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported

Here is the simple code:

from tensorflow import keras

feature_input = keras.Input(shape=(None, 257), name='feature_input')
gru1 = keras.layers.GRU(257, return_sequences=True)(feature_input)
out = keras.layers.Dense(257, activation='sigmoid')(gru1)
model = keras.Model(feature_input, out)

atw1020 commented 3 years ago

My issue has been fixed in the 0.1alpha-3 release.

gru/PartitionedCall:0 succeeded
gru_1/strided_slice_3:0 succeeded
gru_2/PartitionedCall:0 succeeded
gru_3/strided_slice_3:0 succeeded
gru_4/strided_slice_3:0 succeeded