SciSharp / TensorFlow.NET

.NET Standard bindings for Google's TensorFlow for developing, training and deploying Machine Learning models in C# and F#.
https://scisharp.github.io/tensorflow-net-docs
Apache License 2.0

[BUG Report]: Cannot use BatchNormalization layer #1194

[Closed] Utanapishtim31 closed this issue 8 months ago

Utanapishtim31 commented 8 months ago

Description

When I try to use a BatchNormalization layer inside a Keras model, an exception is raised while the layer is being built. During the execution of BatchNormalization.Call(), the training parameter has the value false, and the evaluation

var training_value = tf_utils.constant_value(training_tensor)

is performed. It ends in smart_module.smart_constant_value(Tensor pred) with a NotImplementedException, because training_tensor holds the value tf.logical_and(training.Value, base.Trainable), which is not a tensor with a statically known constant value.
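
For illustration, here is a hypothetical snippet (not from the report; graph mode and tf.placeholder are assumptions, used as a stand-in for a training flag that is unresolved at build time) showing why such a tensor cannot be folded to a constant:

using static Tensorflow.Binding;
using Tensorflow;

// Graph mode, so a placeholder can model a value unknown until run time.
tf.compat.v1.disable_eager_execution();

// A boolean resolved only at run time, like the Keras training flag.
var training = tf.placeholder(TF_DataType.TF_BOOL);
// The combined predicate the report describes: training AND trainable.
var pred = tf.logical_and(training, tf.constant(true));
// Asking for the constant value of `pred` cannot succeed, since its value
// depends on `training`; the reported code path then falls through to the
// NotImplementedException in smart_module.smart_constant_value.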

Reproduction Steps

A simple neural network with a dense layer followed by a BatchNormalization layer should reproduce the problem.
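
Below is a sketch of what such a repro might look like (hypothetical, not the reporter's actual code). Following the reporter's later comment, the BatchNormalization layer is constructed directly through its constructor rather than through keras.layers; the default-constructed BatchNormalizationArgs object is an assumption made for illustration.

using static Tensorflow.Binding;
using static Tensorflow.KerasApi;
using Tensorflow;
using Tensorflow.Keras.ArgsDefinition;
using Tensorflow.Keras.Layers;

var inputs = keras.Input(shape: new Shape(16));
var x = keras.layers.Dense(8, activation: "relu").Apply(inputs);
// Construct the layer directly instead of via the keras.layers facade;
// the empty args object here is an assumption for illustration.
var bn = new BatchNormalization(new BatchNormalizationArgs());
x = bn.Apply(x);
var outputs = keras.layers.Dense(1).Apply(x);
var model = keras.Model(inputs, outputs, name: "repro");
model.summary(); // the exception reportedly surfaces while the layer is built/called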

Known Workarounds

No workaround found...

Configuration and Other Information

Tensorflow.NET v0.110.4
Tensorflow.Keras v0.11.4
.NET Framework 4.7.2
Windows 11 (Version 10.0.22621)

Wanglongzhi2001 commented 8 months ago

Hello, would you mind providing a minimal example?

Wanglongzhi2001 commented 8 months ago

Hello, the following code, which uses a BatchNormalization layer, runs successfully. Maybe there's something wrong with your code? Could you please provide minimal code for reproduction?

using static Tensorflow.Binding;
using static Tensorflow.KerasApi;
using Tensorflow;
using Tensorflow.NumPy;

var layers = keras.layers;
// input layer
var inputs = keras.Input(shape: (32, 32, 3), name: "img");
// convolutional layer
var x = layers.Conv2D(32, 3, activation: "relu").Apply(inputs);
x = layers.Conv2D(64, 3, activation: "relu").Apply(x);
var block_1_output = layers.MaxPooling2D(3).Apply(x);
x = layers.Conv2D(64, 3, activation: "relu", padding: "same").Apply(block_1_output);
x = layers.Conv2D(64, 3, activation: "relu", padding: "same").Apply(x);
x = layers.BatchNormalization().Apply(x);
var block_2_output = layers.Add().Apply(new Tensors(x, block_1_output));
x = layers.Conv2D(64, 3, activation: "relu", padding: "same").Apply(block_2_output);
x = layers.Conv2D(64, 3, activation: "relu", padding: "same").Apply(x);
x = layers.BatchNormalization().Apply(x);
var block_3_output = layers.Add().Apply(new Tensors(x, block_2_output));
x = layers.Conv2D(64, 3, activation: "relu").Apply(block_3_output);
x = layers.BatchNormalization().Apply(x);
x = layers.GlobalAveragePooling2D().Apply(x);
x = layers.Dense(256, activation: "relu").Apply(x);
x = layers.Dropout(0.5f).Apply(x);
// output layer
var outputs = layers.Dense(10).Apply(x);
// build keras model
var model = keras.Model(inputs, outputs, name: "toy_resnet");
model.summary();
// compile keras model in tensorflow static graph
model.compile(optimizer: keras.optimizers.RMSprop(1e-3f),
    loss: keras.losses.SparseCategoricalCrossentropy(from_logits: true),
    metrics: new[] { "acc" });
// prepare dataset
var ((x_train, y_train), (x_test, y_test)) = keras.datasets.cifar10.load_data();
// normalize the input
x_train = x_train / 255.0f;
// training
model.fit(x_train[new Slice(0, 200)], y_train[new Slice(0, 200)],
            batch_size: 64,
            epochs: 1,
            validation_split: 0.2f);
Utanapishtim31 commented 8 months ago

Using keras.layers to build the BatchNormalization layer, instead of creating it through its constructor, fixes the problem (maybe there was an error in my initialization arguments).
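
For reference, a minimal sketch of this fix (the surrounding model code is assumed, not taken from the report): create the layer through the keras.layers facade, which supplies the layer's default argument object, instead of invoking the constructor directly.

using static Tensorflow.Binding;
using static Tensorflow.KerasApi;
using Tensorflow;

var inputs = keras.Input(shape: new Shape(16));
var x = keras.layers.Dense(8, activation: "relu").Apply(inputs);
// Created via the facade rather than new BatchNormalization(args).
x = layers_workaround(x);
var outputs = keras.layers.Dense(1).Apply(x);
var model = keras.Model(inputs, outputs, name: "workaround");
model.summary();

static Tensorflow.Tensors layers_workaround(Tensorflow.Tensors x)
    => keras.layers.BatchNormalization().Apply(x); // no exception this way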