Open Utanapishtim31 opened 11 months ago
Hello, you can use RnnOptionalArgs.Mask as a workaround. The following code provides an example:
using static Tensorflow.KerasApi;
using Tensorflow;
using Tensorflow.NumPy;
using Tensorflow.Keras.ArgsDefinition;
var layers = keras.layers;
var rnnLayer = layers.GRU(units:100);
Tensors inputs = np.zeros(new Shape(1, 10, 5), dtype:TF_DataType.TF_FLOAT);
var res = rnnLayer.Apply(inputs, optional_args:new RnnOptionalArgs { Mask=inputs});
Console.WriteLine(res.ToString());
And thank you for the issue, I'll fix it. ^_^
With the following code, I get a KeyNotFoundException in predict():
var inputs = keras.Input((4, 2));
var inputs_mask = keras.Input((4, 1), dtype: TF_DataType.TF_BOOL);
RnnOptionalArgs rnnOptionalArgs = new RnnOptionalArgs();
rnnOptionalArgs.Mask = inputs_mask;
var rnn = keras.layers.GRU(10).Apply(inputs, optional_args: rnnOptionalArgs);
var output = keras.layers.Dense(2, activation: "softmax").Apply(rnn);
var model = keras.Model((inputs, inputs_mask), output);
model.summary();
NDArray x1 = np.random.random((1, 4, 2)).astype(TF_DataType.TF_FLOAT);
NDArray x2 = np.ones((1, 4, 1), TF_DataType.TF_BOOL);
var pred = model.predict((x1, x2));
Console.WriteLine(pred);
It looks like the graph structure does not detect that inputs_mask is connected to the RNN: Functional.tensor_usage_count does not include the tensor inputs_mask.
It works fine when mask = inputs, probably because inputs is already connected to the GRU.
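To make the suspected mechanism concrete, here is a toy sketch in plain Python. It is not TensorFlow.NET's actual implementation: the classes Tensor and Node, the function build_usage_count, and the follow_side_inputs flag are all hypothetical names invented for illustration. It only shows how a usage count built by walking back from the outputs through positional inputs would never register a tensor that reaches a layer through optional arguments.

```python
# Toy model of a functional-graph builder; NOT TensorFlow.NET's real code.
# Every name here (Tensor, Node, build_usage_count, ...) is hypothetical.

class Tensor:
    def __init__(self, name, producer=None):
        self.name = name
        self.producer = producer  # Node that produced it, or None for a model input


class Node:
    def __init__(self, layer_name, inputs, side_inputs=()):
        self.layer_name = layer_name
        self.inputs = list(inputs)            # tensors passed positionally
        self.side_inputs = list(side_inputs)  # tensors passed via optional args (e.g. a mask)
        self.output = Tensor(layer_name + "_out", producer=self)


def build_usage_count(outputs, follow_side_inputs):
    """Walk back from the outputs and count how often each tensor is consumed."""
    usage = {}
    seen = set()
    stack = list(outputs)
    while stack:
        node = stack.pop().producer
        if node is None or id(node) in seen:
            continue
        seen.add(id(node))
        consumed = node.inputs + (node.side_inputs if follow_side_inputs else [])
        for src in consumed:
            usage[src.name] = usage.get(src.name, 0) + 1
            stack.append(src)
    return usage


inputs = Tensor("inputs")
inputs_mask = Tensor("inputs_mask")
gru = Node("gru", [inputs], side_inputs=[inputs_mask])
dense = Node("dense", [gru.output])

buggy = build_usage_count([dense.output], follow_side_inputs=False)
fixed = build_usage_count([dense.output], follow_side_inputs=True)

print("inputs_mask" in buggy)  # the mask is never registered
print("inputs_mask" in fixed)  # the mask is registered once side inputs are followed
```

In this toy, any later lookup of inputs_mask in the buggy dictionary fails with a missing-key error, which is at least consistent with the exception reported above.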
Hello, the code you provided seems to have a problem. I translated it into TensorFlow Python below, but it does not run successfully.
import tensorflow as tf
import tensorflow.keras as keras
inputs = keras.Input(shape=(4, 2))
inputs_mask = keras.Input(shape=(4, 1), dtype=tf.bool)
rnn = keras.layers.GRU(10)(inputs, mask=inputs_mask)
output = keras.layers.Dense(2, activation="softmax")(rnn)
model = keras.Model(inputs=[inputs, inputs_mask], outputs=output)
model.summary()
Hi,
My mistake. The mask input should have one dimension less:
inputs_mask = keras.Input(shape=(4,), dtype=tf.bool)
In C#, you can keep the same code with the following change:
var inputs_mask = keras.Input(new Shape(4), dtype: TF_DataType.TF_BOOL);
Then you get an exception "NotSupportedException (The collection has a fixed size)".
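As a reminder of why the mask has one dimension less than the inputs, here is a minimal plain-Python sketch for a single sample (no batch dimension). The function masked_rnn is hypothetical and the "cell" is just a running sum standing in for a real GRU or LSTM cell. The point is that the mask carries one boolean per timestep, with no feature dimension, and that a masked timestep leaves the state unchanged.

```python
# Minimal sketch of how a per-timestep mask gates a recurrent layer.
# masked_rnn is a hypothetical helper; the "cell" is a running sum, not a GRU.

def masked_rnn(sequence, mask, initial_state=0.0):
    """sequence: list of timesteps, each a list of features.
    mask: list of bools, one per timestep (no feature dimension)."""
    if len(sequence) != len(mask):
        raise ValueError("mask needs exactly one entry per timestep")
    state = initial_state
    for features, keep in zip(sequence, mask):
        if keep:
            state = state + sum(features)  # stand-in for the real cell update
        # masked step: the previous state is carried over unchanged
    return state


sequence = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]  # 4 timesteps, 2 features
mask = [True, True, False, False]                               # 4 timesteps

print(masked_rnn(sequence, mask))  # only the first two timesteps contribute
```

With a batch dimension added, this corresponds to inputs of shape (batch, 4, 2) and a mask of shape (batch, 4), as in the corrected declarations above.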
You are right, this is a bug.
As I explained above, the true origin of the problem is that the graph structure does not record the fact that inputs_mask is actually an input to the RNN. As a result, it is pruned, and the failure occurs later when it has to be used.
To confirm this, I artificially connected inputs_mask to a second "dummy" dense layer whose output I add to the model output with an Add layer (because I cannot fit a multi-output model, but that is another point). Then the model works fine with an LSTM recurrent layer.
With a GRU recurrent layer (as in the code here), there is yet another problem afterwards: fitting the model throws an exception saying that a Tensorflow primitive "SplitV" is missing. I will let you analyze this.
Sample code for the exception System.Collections.Generic.KeyNotFoundException during the training with a masked GRU:
internal class ZeroLayer : Layer
{
    private Shape output_shape;

    public ZeroLayer(Shape output_shape, string name = null)
        : base(new LayerArgs { Name = name })
    {
        this.output_shape = output_shape;
    }

    protected override Tensors Call(Tensors inputs, Tensors state = null, bool? training = null, IOptionalArgs optional_args = null)
    {
        return tf.zeros(this.output_shape, dtype: TF_DataType.TF_FLOAT);
    }

    public override Shape ComputeOutputShape(Shape input_shape)
    {
        return this.output_shape;
    }
}
Then fit the model:
var inputs = keras.Input((4, 2));
var inputs_mask = keras.Input(new Shape(4), dtype: TF_DataType.TF_BOOL);
RnnOptionalArgs rnnOptionalArgs = new RnnOptionalArgs();
rnnOptionalArgs.Mask = inputs_mask;
var rnn = keras.layers.LSTM(10).Apply(inputs, optional_args: rnnOptionalArgs);
var x = keras.layers.Dense(2, activation: "softmax").Apply(rnn);
var y = new ZeroLayer(new Shape(2)).Apply(inputs_mask);
var output = keras.layers.Add().Apply(new Tensors(x, y));
var model = keras.Model((inputs, inputs_mask), output);
model.summary();
NDArray x1 = np.random.random((1, 4, 2)).astype(TF_DataType.TF_FLOAT);
NDArray x2 = np.ones((1, 4), TF_DataType.TF_BOOL);
var pred = model.predict((x1, x2));
Console.WriteLine(pred);
NDArray[] train_inputs = new NDArray[2] { x1, x2 };
NDArray train_y = np.zeros(new Shape(1, 2), dtype: TF_DataType.TF_FLOAT);
train_y[0, 0] = 1.0f;
model.compile("adam", "categorical_crossentropy", new string[] { "accuracy" });
model.fit(train_inputs, train_y, batch_size: 1, epochs: 1);
With an LSTM rnn, model.fit() works fine. Replace it with a GRU rnn and you get a System.Collections.Generic.KeyNotFoundException. Fitting the model requires a Tensorflow primitive SplitV which is not in a dictionary.
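The thread does not say which dictionary is missing SplitV. One plausible reading, which is an assumption and not confirmed here, is a registry keyed by op name that training consults for each op in the graph (for example to find a gradient function). The plain-Python sketch below uses a hypothetical registry with placeholder entries to show how one unregistered op name is enough to produce a missing-key failure during fit() while predict() is unaffected.

```python
# Hypothetical op-name registry; the entries are placeholders and do not
# reflect which ops TensorFlow.NET actually registers.
GRADIENT_REGISTRY = {
    "MatMul": lambda grad: grad,
    "Add": lambda grad: grad,
}


def lookup_gradient(op_name):
    # A plain dictionary lookup: raises KeyError for any unregistered op name.
    return GRADIENT_REGISTRY[op_name]


def backward_pass(op_names):
    """Only training walks the ops to fetch gradients; inference never calls this."""
    return [lookup_gradient(name) for name in op_names]


try:
    backward_pass(["MatMul", "SplitV", "Add"])
    missing = None
except KeyError as err:
    missing = err.args[0]

print("no entry registered for", missing)
```

Under that assumption, the fix would be to register the missing entry rather than to change the model.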
Thank you very much for your example. I will let you know once this bug is fixed.
Description
In order to set a mask for a GRU layer, I must declare it with GRUOptionalArgs.Mask. But the class GRUOptionalArgs does not implement the interface IOptionalArgs (probably an omission), so it cannot be passed to GRU.Call().
Please note that GRU.Call() checks for a GRUOptionalArgs and does not accept an RnnOptionalArgs.
Reproduction Steps
Try to compile the following code:
GRUArgs gruArgs = new GRUArgs();
gruArgs.Units = 100;
GRU rnnLayer = new GRU(gruArgs);
GRUOptionalArgs rnnOptionalArgs = new GRUOptionalArgs();
Tensors inputs = np.zeros(new Shape(1, 10, 5));
rnnLayer.Apply(inputs, optional_args: rnnOptionalArgs);
Known Workarounds
Use an LSTM ???
Configuration and Other Information
Tensorflow.NET 0.110.4, Tensorflow.Keras 0.11.4, .NET Framework 4.7.2, Windows 11