SciSharp / TensorFlow.NET

.NET Standard bindings for Google's TensorFlow for developing, training and deploying Machine Learning models in C# and F#.
https://scisharp.github.io/tensorflow-net-docs
Apache License 2.0

[BUG Report]: Error loading weights from a file #1108

Open RachamimYaakobov opened 1 year ago

RachamimYaakobov commented 1 year ago

Description

If, within a single run of the program, you create a model, save its weights, and then load them into a newly created model, you get an error. If you instead create the model and save its weights in one run, then run the program again and only load the saved weights, there is no error.

The error:

Tensorflow.ValueError: 'You are trying to load a weight file containing System.String[] layers into a model with 2 layers.'

Reproduction Steps

using System;
using static Tensorflow.Binding;
using static Tensorflow.KerasApi;
using Tensorflow;
using Tensorflow.NumPy;
using System.Xml.Linq;
using Tensorflow.Keras;
using System.Collections.Generic;
using System.Linq;

namespace ConsoleApp1
{
    internal class Program
    {
        static void Main(string[] args)
        {
            var x_train = np.array(new float[,] { { 0.1f }, { 0.2f }, { 0.3f }, { 0.4f } });
            var y_train = np.array(0, 1, 2, 3);

            var x_test = x_train;
            var y_test = y_train;

            create_fit_save(x_train, y_train, x_test, y_test);

            print("---------------------\n");

            var newModel = create_model();
            newModel.load_weights("model.weights");

            print("evaluate: ", newModel.evaluate(x_test, y_test, verbose: 2).Select(x => $"{x.Key}: {x.Value}, ").ToArray());

            var new_probability_model = keras.Sequential(new List<ILayer>() {
                newModel,
                tf.keras.layers.Softmax()
            });

            print("predict: ", new_probability_model.predict(x_test));

            Console.ReadLine();
        }

        private static void create_fit_save(NDArray x_train, NDArray y_train, NDArray x_test, NDArray y_test)
        {
            var model = create_model();

            model.fit(x_train, y_train, epochs: 500, verbose: 0);

            print("evaluate: ", model.evaluate(x_test, y_test, verbose: 2).Select(x=> $"{x.Key}: {x.Value}, ").ToArray());

            var probability_model = keras.Sequential(new List<ILayer>() {
                model,
                tf.keras.layers.Softmax()
            });

            print("predict: ", probability_model.predict(x_test));

            model.save_weights("model.weights");
        }

        private static Tensorflow.Keras.Engine.Sequential create_model()
        {
            var model = keras.Sequential(new List<ILayer>() {
                tf.keras.layers.Dense(128, activation: "relu", input_shape: new Shape(1)),
                tf.keras.layers.Dense(4)
            });

            var loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits: true);

            model.compile(optimizer: new Tensorflow.Keras.Optimizers.Adam(),
                loss: loss_fn,
                metrics: new[] { "accuracy" });

            model.summary();

            return model;
        }       
    }
}

Known Workarounds

Running this code as-is raises the error, but the weights are still saved to disk.

If you then comment out the line create_fit_save(x_train, y_train, x_test, y_test); and run the program again, there is no error.

Equivalent code in Python does not throw any error:

import tensorflow as tf

def create_model():
    model = tf.keras.models.Sequential([
        tf.keras.layers.Dense(128, activation='relu', input_shape=[1]),
        tf.keras.layers.Dense(4)
    ])

    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    model.compile(optimizer='adam',
                  loss=loss_fn,
                  metrics=['accuracy'])
    return model

def create_fit_save(x_train, y_train, x_test, y_test):
    model = create_model()
    model.fit(x_train, y_train, epochs=500, verbose=0)
    print('evaluate: ', model.evaluate(x_test, y_test, verbose=2))
    probability_model = tf.keras.Sequential([
        model,
        tf.keras.layers.Softmax()
    ])
    print('predict: ', probability_model.predict(x_test))
    model.save_weights("model.weights")

x_train = [[0.1], [0.2], [0.3], [0.4]]
y_train = [0, 1, 2, 3]

x_test = x_train # [[0.11], [0.22], [0.33], [0.44]]
y_test = y_train # [0, 0, 2, 3]

create_fit_save(x_train, y_train, x_test, y_test)

new_model = create_model()
new_model.load_weights("model.weights")
print('evaluate: ', new_model.evaluate(x_test, y_test, verbose=2))

new_probability_model = tf.keras.Sequential([
    new_model,
    tf.keras.layers.Softmax()
])

print('predict: ', new_probability_model.predict(x_test))

Configuration and Other Information

Tensorflow.NET: 0.100.5
.NET: 7.0
OS: win10

The problem occurs because each layer gets a name when it is created. A layer that is called "dense_1" the first time is called "dense_2" when it is created again in the same run, so its name no longer matches the name stored in the file on disk.
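The naming mismatch can be illustrated with a toy model of the mechanism. This is a minimal sketch in plain Python, not TensorFlow.NET code: the process-wide counter, the `ToyDense` class, and the name-keyed `save_weights` / `load_weights` helpers are all hypothetical stand-ins, and the exact names the real library generates may differ.

```python
import itertools

# Hypothetical process-wide counter, standing in for the auto-naming
# that gives each new layer a unique name within one run.
_layer_ids = itertools.count(1)


class ToyDense:
    """Toy layer: gets an auto-generated name and holds one weight value."""

    def __init__(self, weight=0.0):
        self.name = f"dense_{next(_layer_ids)}"
        self.weight = weight


def create_model():
    # Two layers, mirroring the two Dense layers in the report.
    return [ToyDense(), ToyDense()]


def save_weights(model):
    # Weights are stored keyed by layer name.
    return {layer.name: layer.weight for layer in model}


def load_weights(model, saved):
    # Name-based lookup fails when the names in the file differ.
    for layer in model:
        if layer.name not in saved:
            raise ValueError(f"no saved weights for layer {layer.name!r}")
        layer.weight = saved[layer.name]


def reset_names():
    # Simulates starting a fresh process: the counter starts over.
    global _layer_ids
    _layer_ids = itertools.count(1)


first = create_model()
saved = save_weights(first)   # keys: dense_1, dense_2

second = create_model()       # same run: dense_3, dense_4
try:
    load_weights(second, saved)
    same_run_ok = True
except ValueError:
    same_run_ok = False

reset_names()                 # "run the software again"
third = create_model()        # fresh run: dense_1, dense_2
load_weights(third, saved)
fresh_run_ok = True
```

In this toy version the second model built in the same run gets different names and the load fails, while a model built after the counter resets loads cleanly, which matches the behaviour described in the report.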

Wanglongzhi2001 commented 1 year ago

Sorry for the late reply. The h5 format is no longer recommended, and its implementation on the TensorFlow.NET side is not very complete. We recommend that you use the model.save and keras.models.load_model APIs instead.

RachamimYaakobov commented 1 year ago

Thanks for your response. In the code below I use model.save and keras.models.load_model, and I get an error:

System.NullReferenceException: 'Object reference not set to an instance of an object.'

   at Tensorflow.Keras.Engine.Model.test_step(DataHandler data_handler, Tensor x, Tensor y)
   at Tensorflow.Keras.Engine.Model.test_function(DataHandler data_handler, OwnedIterator iterator)
   at Tensorflow.Keras.Engine.Model.evaluate(NDArray x, NDArray y, Int32 batch_size, Int32 verbose, Int32 steps, Int32 max_queue_size, Int32 workers, Boolean use_multiprocessing, Boolean return_dict, Boolean is_val)
   at ConsoleApp1.Program.Main() 

My testing shows that Model.compiled_loss is null as long as the model has not been compiled, hence the error. Since this is a model that I load from disk, I should not have to compile it; I should be able to use it immediately.

using System;
using static Tensorflow.Binding;
using static Tensorflow.KerasApi;
using Tensorflow.NumPy;
using Tensorflow.Keras;
using System.Collections.Generic;
using System.Linq;

namespace ConsoleApp1
{
    internal class Program
    {
        private const string ModelPath = "model";

        static void Main()
        {
            var x_train = np.array(new float[,] { { 0.1f }, { 0.2f }, { 0.3f }, { 0.4f } });
            var y_train = np.array(0, 1, 2, 3);

            var x_test = x_train;
            var y_test = y_train;

            create_fit_save(x_train, y_train, x_test, y_test);

            print("---------------------\n");

            var newModel = keras.models.load_model(ModelPath);

            print("evaluate: ", newModel.evaluate(x_test, y_test, verbose: 2).Select(x => $"{x.Key}: {x.Value}, ").ToArray()); // ERROR 'Object reference not set to an instance of an object.'

            var new_probability_model = keras.Sequential(new List<ILayer>() {
                newModel,
                tf.keras.layers.Softmax()
            });

            print("predict: ", new_probability_model.predict(x_test));

            Console.ReadLine();
        }

        private static void create_fit_save(NDArray x_train, NDArray y_train, NDArray x_test, NDArray y_test)
        {
            var model = create_model();
            model.fit(x_train, y_train, epochs: 500, verbose: 0);

            print("evaluate: ", model.evaluate(x_test, y_test, verbose: 2).Select(x => $"{x.Key}: {x.Value}, ").ToArray());

            var probability_model = keras.Sequential(new List<ILayer>() {
                model,
                tf.keras.layers.Softmax()
            });

            print("predict: ", probability_model.predict(x_test));

            model.save(ModelPath);
        }

        private static Tensorflow.Keras.Engine.Sequential create_model()
        {
            var model = keras.Sequential(new List<ILayer>() {
                tf.keras.layers.Dense(128, activation: "relu", input_shape: new Tensorflow.Shape(1)),
                tf.keras.layers.Dense(4)
            });

            var loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits: true);

            model.compile(optimizer: new Tensorflow.Keras.Optimizers.Adam(),
                loss: loss_fn,
                metrics: new[] { "accuracy" });

            model.summary();

            return model;
        }
    }
}

Wanglongzhi2001 commented 1 year ago

Hello. For historical reasons, the developer at the time missed this problem because this API involves a lot of code, so for now you have to compile newModel again after loading it. If the API is fixed later and you are still interested, I will notify you. ^_^
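The failure and the recompile workaround can be sketched with a toy stand-in. This is plain Python, not TensorFlow.NET code: `ToyModel`, its `compiled_loss` attribute, and the `load_model` function here are hypothetical, and only mirror the observation in this thread that the loss stays unset until `compile` is called.

```python
class ToyModel:
    """Toy model whose loss is only set by compile()."""

    def __init__(self, weight):
        self.weight = weight
        self.compiled_loss = None  # unset until compile() is called

    def compile(self, loss):
        self.compiled_loss = loss

    def evaluate(self, xs, ys):
        # Raises TypeError if compiled_loss is still None, analogous to
        # the NullReferenceException in the report.
        preds = [self.weight * x for x in xs]
        total = sum(self.compiled_loss(p, y) for p, y in zip(preds, ys))
        return total / len(xs)


def squared_error(pred, target):
    return (pred - target) ** 2


def load_model(saved_weight):
    # Hypothetical loader that restores weights but not the compile state.
    return ToyModel(saved_weight)


loaded = load_model(2.0)

try:
    loaded.evaluate([1.0, 2.0], [2.0, 4.0])
    failed_before_compile = False
except TypeError:
    failed_before_compile = True

# Workaround from the thread: compile again after loading.
loaded.compile(loss=squared_error)
loss_after_compile = loaded.evaluate([1.0, 2.0], [2.0, 4.0])
```

In the C# program above, the corresponding step would be to call compile on newModel with the same optimizer, loss, and metrics used in create_model before calling evaluate.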