Dobiasd / frugally-deep

A lightweight header-only library for using Keras (TensorFlow) models in C++.
MIT License

Sudden deviations for converted neural network #422

Closed MartinTum closed 2 months ago

MartinTum commented 2 months ago

Hello! :) After using frugally-deep for the last couple of months without any issues, we found some discrepancies when comparing a Keras network and the converted .json file (more details below).

Currently, we are running on

We train a relatively simple network:

  1. 49 inputs
  2. 2 x (Dense layer with 512 nodes + PReLU activation + dropout layer)
  3. Dense layer with 6 nodes + Softmax (only in the correctly working no_deviations example was there an additional dropout layer after this Dense layer)
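In Keras terms, the deviating model looks roughly like this (a simplified sketch; the dropout rate here is a placeholder, not our actual hyperparameter):

import tensorflow as tf

# Simplified sketch of the deviating model; the dropout rate (0.2) is a
# placeholder, not the actual training configuration.
inputs = tf.keras.Input(shape=(49,))
x = inputs
for _ in range(2):
    x = tf.keras.layers.Dense(512)(x)
    x = tf.keras.layers.PReLU()(x)
    x = tf.keras.layers.Dropout(0.2)(x)
outputs = tf.keras.layers.Dense(6, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)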

The conversion (and the implementation in our analysis framework) worked perfectly fine for the first network, with deviations somewhere at the single-precision level (see no_deviations.json and no_deviations.keras).

After some updates, we converted the new model (without the dropout layer following the dense layer with 6 nodes). We came across deviations on the order of 10^-1 (see deviations.json and deviations.keras). One strange behavior is that the deviations we observe in our analysis framework become larger the longer we train our model. Retraining the model did not solve the problem.

We tested 3 different inputs:

For the first 2 examples, the deviations are small; however, for the last one, we see large deviations:

deviations.json output: [0.0, 0.7772999, 0.0, 0.0, 0.22270015, 0.0]
deviations.keras output: [0.0, 0.26894143, 0.0, 0.0, 0.73105854, 0.0]

The 2 different networks in .keras and .json format can be found here: https://syncandshare.lrz.de/getlink/fi13NA5BiRsof71omTc8Be/frugally-deep-issue

For evaluating the .keras network, we use the following code:

import numpy as np
from pathlib import Path
import tensorflow as tf

# Load the trained Keras model.
nn = Path("/path/to/neural_network")
model = tf.keras.models.load_model(nn)

# The problematic input vector (49 features):
inData = tf.convert_to_tensor([np.array([ -0.157234653830528, -0.187229260802269, -0.140185549855232, -0.546610236167908, -0.359826892614365, -0.132018625736237, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.0487694814801216, -0.108368046581745, 0.117956906557083, -0.388947695493698, -0.08791533857584, 0.26476863026619, 1, 1, 1, 1, 1, 1, -2.69170761108398, -0.47304293513298, -3.61605286598206, -0.994914174079895, -0.47304293513298, -4.58833932876587, -1.71625471115112, 1.73621737957001, 0.460233181715012, -1.0019623041153, -0.439584970474243, 0.217655450105667, 0.71698397397995, 0.281199306249619, 0.410465210676193, -0.0418498963117599, 0.210292503237724, -0.472136497497559, -0.712458789348602], dtype = np.float32)])
predict = model.predict(inData)[0]

# Reference values: the output of the converted deviations.json model.
compare = [0.0, 0.7772999, 0.0, 0.0, 0.22270015, 0.0]
comp = np.array(compare, dtype=np.float32)
print(list(predict))
print(list(comp))
for i in range(6):
    print(predict[i] - comp[i])

For the frugally-deep evaluation, we use:


#include <string>
#include <iostream>
#include <iomanip>
#include <sstream>
#include <vector>
#include <fdeep/fdeep.hpp>
#include <fplus/fplus.hpp>

template<typename T, typename T2>
std::string printVec(const std::vector<T,T2>& outputs)
{
    std::stringstream s;
    s << "[ ";
    for (const auto o: outputs) s << std::setprecision(15) << o << ", ";
    s << "]";
    return s.str();
}

int main(int argc, char* argv[]){

    // Check if exactly one argument is provided (in addition to the program name)
    if (argc != 2) {
        std::cerr << "Usage: " << argv[0] << " <json_path>" << std::endl;
        return 1; // Exit with an error code
    }

    // The input string is in argv[1]
    const std::string jsonFilePath = argv[1];

    const float verify_epsilon = 1e-4; // default
    // const float verify_epsilon = 1e-80;

    std::cout << "Testing network in json file: '" << jsonFilePath << "'" << std::endl;
    std::cout << "Verify with precision " << verify_epsilon << std::endl;

    const auto model = fdeep::load_model(jsonFilePath, true, fdeep::cout_logger, verify_epsilon);

    std::cout << "Successfully loaded" << std::endl;

    const auto n_inputs = model.get_dummy_input_shapes()[0].dimensions()[0];

    {
        std::cout << "Network output for all inputs equal to zero" << std::endl;
        std::vector<float> inputs;
        for (size_t i=0; i < n_inputs; ++i) inputs.push_back(0.0f);

        const auto results = model.predict({fdeep::tensor(model.get_dummy_input_shapes()[0], inputs)});

        const auto outputs = *results[0].as_vector();
        std::cout << "\t" << printVec(outputs) << std::endl;
    }

    {
        // std::vector<float> inputs = {1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, -1.8184667, -1.0182238, 0.15304942, 0.99804157, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0};
        std::vector<float> inputs = {-0.15723465, -0.18722926, -0.14018555, -0.54661024, -0.3598269, -0.13201863, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.04876948, -0.10836805, 0.11795691, -0.3889477, -0.08791534, 0.26476863, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, -2.6917076, -0.47304294, -3.6160529, -0.9949142, -0.47304294, -4.5883393, -1.7162547, 1.7362174, 0.46023318, -1.0019623, -0.43958497, 0.21765545, 0.716984, 0.2811993, 0.4104652, -0.041849896, 0.2102925, -0.4721365, -0.7124588};

        std::cout << "Network output for inputs equal to" << std::endl;
        std::cout << '\t' << printVec(inputs) << std::endl;

        const auto results = model.predict({fdeep::tensor(model.get_dummy_input_shapes()[0], inputs)});

        const auto outputs = *results[0].as_vector();
        std::cout << "\t" << printVec(outputs) << std::endl;
    }

    return 0;
}

Any help would be highly appreciated. Cheers, Martin

Dobiasd commented 2 months ago

Oh, that's a big deviation! Thanks for the very good report. I'll look into it and get back to you here.

Dobiasd commented 2 months ago

Oh, so many changes happened in TensorFlow during the year between version 2.13 (the version you are using) and version 2.16.1 (the version frugally-deep is tested on). I needed to make significant changes in frugally-deep to keep up with the TensorFlow changes. So I'd not be too surprised if your model no longer works with a newer frugally-deep version.

But, regarding versions: You wrote that you are using the latest frugally-deep version, which would be 0.16.0. I tried to reproduce the output-value deviation you see with it. But it can't even load any of your .json files.

Here is a Dockerfile to reproduce this (run docker build --rm --progress=plain .):

FROM python:3.12.4

RUN apt-get update
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get install -y build-essential cmake

RUN pip3 install tensorflow==2.16.1

RUN apt-get remove --purge -y cmake
RUN pip install cmake --upgrade

RUN git clone -b 'v0.2.24' --single-branch --depth 1 https://github.com/Dobiasd/FunctionalPlus && cd FunctionalPlus && mkdir -p build && cd build && cmake .. && make && make install
RUN git clone -b '3.4.0' --single-branch --depth 1 https://gitlab.com/libeigen/eigen.git && cd eigen && mkdir -p build && cd build && cmake .. && make && make install && ln -s /usr/local/include/eigen3/Eigen /usr/local/include/Eigen
RUN git clone -b 'v3.11.3' --single-branch --depth 1 https://github.com/nlohmann/json && cd json && mkdir -p build && cd build && cmake -DJSON_BuildTests=OFF .. && make && make install
RUN git clone -b 'v0.16.0' --single-branch --depth 1 https://github.com/Dobiasd/frugally-deep && cd frugally-deep && mkdir -p build && cd build && cmake .. && make && make install

WORKDIR /frugally-deep

RUN wget https://syncandshare.lrz.de/dl/fi13NA5BiRsof71omTc8Be/frugally-deep-issue.dir -q -O models.zip
RUN unzip models.zip

RUN echo '#include "fdeep/fdeep.hpp"\n\
#include <iostream>\n\
int main()\n\
{\n\
    const auto model = fdeep::load_model("no_deviations.json");\n\
}' >> main.cpp

RUN g++ main.cpp -o main

RUN ./main

Output:

0.460 Building model ... main: /usr/local/include/nlohmann/json.hpp:2147: const nlohmann::json_abi_v3_11_3::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType, CustomBaseClass>::value_type& nlohmann::json_abi_v3_11_3::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType, CustomBaseClass>::operator[](const typename object_t::key_type&) const [with ObjectType = std::map; ArrayType = std::vector; StringType = std::__cxx11::basic_string<char>; BooleanType = bool; NumberIntegerType = long int; NumberUnsignedType = long unsigned int; NumberFloatType = double; AllocatorType = std::allocator; JSONSerializer = nlohmann::json_abi_v3_11_3::adl_serializer; BinaryType = std::vector<unsigned char>; CustomBaseClass = void; const_reference = const nlohmann::json_abi_v3_11_3::basic_json<>&; typename object_t::key_type = std::__cxx11::basic_string<char>; object_t = std::map<std::__cxx11::basic_string<char>, nlohmann::json_abi_v3_11_3::basic_json<>, std::less<void>, std::allocator<std::pair<const std::__cxx11::basic_string<char>, nlohmann::json_abi_v3_11_3::basic_json<> > > >]: Assertion `it != m_data.m_value.object->end()' failed.
0.537 Aborted (core dumped)

The fact that you don't get this error shows that you are not using the latest frugally-deep version (with the matching JSON-library version, etc.).


So I looked up the last frugally-deep release that still supported TensorFlow 2.13 (this can be seen in the README.md), found that it's v0.15.30, and tested this one too.

FROM python:3.12.4

RUN apt-get update
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get install -y build-essential cmake

RUN pip3 install tensorflow==2.16.1

RUN apt-get remove --purge -y cmake
RUN pip install cmake --upgrade

RUN git clone -b 'v0.2.24' --single-branch --depth 1 https://github.com/Dobiasd/FunctionalPlus && cd FunctionalPlus && mkdir -p build && cd build && cmake .. && make && make install
RUN git clone -b '3.4.0' --single-branch --depth 1 https://gitlab.com/libeigen/eigen.git && cd eigen && mkdir -p build && cd build && cmake .. && make && make install && ln -s /usr/local/include/eigen3/Eigen /usr/local/include/Eigen
RUN git clone -b 'v3.11.3' --single-branch --depth 1 https://github.com/nlohmann/json && cd json && mkdir -p build && cd build && cmake -DJSON_BuildTests=OFF .. && make && make install
RUN git clone -b 'v0.15.30' --single-branch --depth 1 https://github.com/Dobiasd/frugally-deep && cd frugally-deep && mkdir -p build && cd build && cmake .. && make && make install

WORKDIR /frugally-deep

RUN wget https://syncandshare.lrz.de/dl/fi13NA5BiRsof71omTc8Be/frugally-deep-issue.dir -q -O models.zip
RUN unzip models.zip

RUN echo '#include "fdeep/fdeep.hpp"\n\
#include <iostream>\n\
int main()\n\
{\n\
    const auto model = fdeep::load_model("deviations.json");\n\
}' >> main.cpp

RUN g++ main.cpp -o main

RUN ./main

Output:

#19 [16/16] RUN ./main
#19 0.415 Loading json ... done. elapsed time: 0.050927 s
#19 0.466 Building model ... done. elapsed time: 0.089966 s
#19 0.556 Running test 1 of 1 ... done. elapsed time: 0.001132 s
#19 0.558 Loading, constructing, testing of deviations.json took 0.142561 s overall.
#19 DONE 0.6s

So here, deviations.json works fine, at least with the default test happening during loading, i.e., no significant deviations.

I could now try to reproduce the deviations you see with custom inputs, but as long as we're not sure about which versions you use, it would all be too fuzzy.

Can you try updating the TensorFlow version on your server (or run it somewhere else), re-train, and make sure you're testing with the latest frugally-deep (plus dependencies)? That would be ideal. Because even if we found out that some old frugally-deep release had a bug (causing these deviations) with some old TensorFlow version, I'd rather not fiddle around with outdated versions and instead make sure the latest one works correctly.

Dobiasd commented 2 months ago

I just tried the latest frugally-deep with TensorFlow 2.17 (the latest TensorFlow) instead of 2.16.1, and this works too.

So instead of TensorFlow 2.16.1, you can also use TensorFlow 2.17 if you prefer. :+1:

But I can no longer support TensorFlow 2.13. :see_no_evil:

MartinTum commented 2 months ago

Thank you so much for your effort. I looked again at which version we used, and it's 0.15.9 (we did multiple rounds of conversion and validation on different systems; sorry for mixing this up), so it is even older than the version you proposed for TensorFlow 2.13. I would even go so far as to say it's a surprise that our version combination worked at some point. I think we did some additional conversion to TensorFlow 2.16 via the .h5 format (which is kind of backward compatible) for the latest frugally-deep version and got the same output for the .json network, but I think that's not really the safe thing to do.

Since getting the right modules installed on our training server is currently a bit problematic, and other servers would take too long for the final training result, I will use version 0.15.30 for the conversion and validation and see whether this fixes it. If that doesn't work, and for future trainings, I will try to get it running on the server with TensorFlow 2.16.1/2.17 as you proposed. Either way, I'll let you know if this fixes the problems that we saw.

Until then, thank you again for your quick response, and apologies for mixing the version numbers. Cheers

Dobiasd commented 2 months ago

Sounds good! And don't worry. :relaxed: Looking forward to the results. I'm always happy to see frugally-deep being useful for real projects. So if you still run into this problem (or others), just let me know and we look for a solution together.

MartinTum commented 2 months ago

Hi again :) We have now converted with frugally-deep 0.15.30 and TensorFlow 2.13; the converted model deviations_v_0_15_30.json (also available at https://syncandshare.lrz.de/getlink/fi13NA5BiRsof71omTc8Be/frugally-deep-issue) produces the same output as the 0.15.9 version. It would be really great if you could check whether you can reproduce these deviations, to rule out bugs on our side. In addition, we get no warning during the conversion process that the converted model shows deviations; they only show up when checking with custom samples.

We also checked again the model that we "updated" to TensorFlow 2.16 via the .h5 format. This version again gives a different output, so TensorFlow v2.13 != v2.16 != .json in this regard.

Dobiasd commented 2 months ago

It would be really great if you could check whether you can reproduce these deviations, to rule out bugs on our side.

Sure, here is the check:

FROM python:3.11.9

RUN apt-get update
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get install -y build-essential cmake

RUN pip3 install tensorflow==2.13

RUN apt-get remove --purge -y cmake
RUN pip install cmake --upgrade

RUN git clone -b 'v0.2.24' --single-branch --depth 1 https://github.com/Dobiasd/FunctionalPlus && cd FunctionalPlus && mkdir -p build && cd build && cmake .. && make && make install
RUN git clone -b '3.4.0' --single-branch --depth 1 https://gitlab.com/libeigen/eigen.git && cd eigen && mkdir -p build && cd build && cmake .. && make && make install && ln -s /usr/local/include/eigen3/Eigen /usr/local/include/Eigen
RUN git clone -b 'v3.11.3' --single-branch --depth 1 https://github.com/nlohmann/json && cd json && mkdir -p build && cd build && cmake -DJSON_BuildTests=OFF .. && make && make install
RUN git clone -b 'v0.15.30' --single-branch --depth 1 https://github.com/Dobiasd/frugally-deep && cd frugally-deep && mkdir -p build && cd build && cmake .. && make && make install

WORKDIR /frugally-deep

RUN wget https://syncandshare.lrz.de/dl/fi13NA5BiRsof71omTc8Be/frugally-deep-issue.dir -q -O models.zip
RUN unzip models.zip

RUN python3 keras_export/convert_model.py deviations.keras deviations_converted.json

RUN echo 'import tensorflow as tf \n\
import numpy as np \n\
model = tf.keras.models.load_model("deviations.keras", compile=False) \n\
data = [-0.15723465, -0.18722926, -0.14018555, -0.54661024, -0.3598269, -0.13201863, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.04876948, -0.10836805, 0.11795691, -0.3889477, -0.08791534, 0.26476863, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, -2.6917076, -0.47304294, -3.6160529, -0.9949142, -0.47304294, -4.5883393, -1.7162547, 1.7362174, 0.46023318, -1.0019623, -0.43958497, 0.21765545, 0.716984, 0.2811993, 0.4104652, -0.041849896, 0.2102925, -0.4721365, -0.7124588] \n\
result = model.predict(np.array([data])) \n\
print(result)' >> main.py

ADD "https://www.random.org/cgi-bin/randbyte?nbytes=10&format=h" skipcache
RUN echo '#include "fdeep/fdeep.hpp"' > main_single.cpp
RUN echo '#include <iostream>' >> main_single.cpp
RUN echo 'int main() \n\
{ \n\
    const auto model = fdeep::load_model("deviations_converted.json"); \n\
    std::vector<fdeep::float_type> inputs = {-0.15723465, -0.18722926, -0.14018555, -0.54661024, -0.3598269, -0.13201863, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.04876948, -0.10836805, 0.11795691, -0.3889477, -0.08791534, 0.26476863, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, -2.6917076, -0.47304294, -3.6160529, -0.9949142, -0.47304294, -4.5883393, -1.7162547, 1.7362174, 0.46023318, -1.0019623, -0.43958497, 0.21765545, 0.716984, 0.2811993, 0.4104652, -0.041849896, 0.2102925, -0.4721365, -0.7124588}; \n\
    const auto results = model.predict({fdeep::tensor(model.get_dummy_input_shapes()[0], inputs)}); \n\
    std::cout << fdeep::show_tensors(results) << std::endl; \n\
}' >> main_single.cpp
RUN g++ main_single.cpp -o main_single

RUN echo '#define FDEEP_FLOAT_TYPE double' > main_double.cpp
RUN cat main_single.cpp >> main_double.cpp
RUN g++ main_double.cpp -o main_double

RUN python3 main.py
RUN ./main_single
RUN ./main_double

And indeed, I get the same results as you do:

TensorFlow:

[[0.         0.26894143 0.         0.         0.73105854 0.        ]]

frugally-deep:

[[[[[[0.0000, 0.7773, 0.0000, 0.0000, 0.2227, 0.0000]]]]]]

Additionally, switching frugally-deep from single-precision to double-precision floats changes the result:

frugally-deep with #define FDEEP_FLOAT_TYPE double:

[[[[[[0.0000, 0.6123, 0.0000, 0.0000, 0.3877, 0.0000]]]]]]

In my experience, this can be an indicator that the model is missing some regularization and thus tended toward very large or very small weights during training, which causes such numerical instability.
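For illustration (a hypothetical snippet, not code from this project), an L2 kernel regularizer on the Dense layers would penalize large weights during training:

import tensorflow as tf

# Hypothetical example: an L2 penalty on the kernel discourages large
# weights during training; the factor 1e-4 is just an assumed value.
dense = tf.keras.layers.Dense(
    512, kernel_regularizer=tf.keras.regularizers.l2(1e-4)
)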

Of course, on the other hand, it might also simply be a bug in this outdated frugally-deep version:

In addition, we get no warning during the conversion process that the converted model shows deviations; they only show up when checking with custom samples.

Yeah, same here. Conversion runs fine, and the automated test (during fdeep::load_model) seems fine. The instability only happens with the custom input.

We also checked again the model that we "updated" to TensorFlow 2.16 via the h5 format. This version gives again a different output, so Tensorflow v2.13 != v2.16 != .json in this regard.

That's a bit of a relief to me, because if different TensorFlow versions also don't agree on the output of this model-and-input combination, it's another indicator of a lack of stability in the floating-point arithmetic.

One strange behavior is that the deviations we observe in our analysis framework become larger the longer we train our model.

That's another indicator that something unwanted is happening during training. Can you iterate over the weights (and maybe even the intermediate tensors during prediction) and check if something looks suspiciously large?
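Something like this could work as a quick scan (a sketch, using the deviations.keras file from above):

import numpy as np
import tensorflow as tf

# Print the largest absolute value per weight tensor to spot
# suspiciously large weights.
model = tf.keras.models.load_model("deviations.keras", compile=False)
for layer in model.layers:
    for w in layer.get_weights():
        print(layer.name, w.shape, float(np.abs(w).max()))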

Or maybe you can try replacing the PReLU activation with ELU or sigmoid.

Dobiasd commented 2 months ago

The numeric instability might happen in the softmax layer. I checked the output of the last dense layer (before the softmax):

from keras.models import Model
import tensorflow as tf
import numpy as np
model = tf.keras.models.load_model("deviations.keras", compile=False)

# Sub-model that stops at the last dense layer, right before the softmax.
model2 = Model(inputs=model.input, outputs=model.get_layer("dense_2").output)
# print(model2.predict(np.random.rand(1, 49)))

data = [-0.15723465, -0.18722926, -0.14018555, -0.54661024, -0.3598269, -0.13201863, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.04876948, -0.10836805, 0.11795691, -0.3889477, -0.08791534, 0.26476863, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, -2.6917076, -0.47304294, -3.6160529, -0.9949142, -0.47304294, -4.5883393, -1.7162547, 1.7362174, 0.46023318, -1.0019623, -0.43958497, 0.21765545, 0.716984, 0.2811993, 0.4104652, -0.041849896, 0.2102925, -0.4721365, -0.7124588]
print(model2.predict(np.array([data])))

Output:

[[-3056833.  -3053894.5 -3058091.  -3056993.2 -3053893.5 -3062269.5]]

Softmax might not be very stable with such inputs:

import tensorflow as tf
tf.nn.softmax([-3056833., -3053894.5, -3058091., -3056993.2, -3053893.5, -3062269.5])

Output:

<tf.Tensor: shape=(6,), dtype=float32, numpy=array([0., 0.26894143, 0., 0., 0.73105854, 0.], dtype=float32)>

Compare with just a small change (relative to the absolute numbers):

tf.nn.softmax([-3056833., -3053892.5, -3058091., -3056993.2, -3053893.5, -3062269.5])

Output:

<tf.Tensor: shape=(6,), dtype=float32, numpy=array([0., 0.73105854, 0., 0., 0.26894143, 0.], dtype=float32)>
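To put numbers on this (a plain-NumPy illustration, not frugally-deep code): near 3e6, adjacent float32 values are already 0.25 apart, so the softmax outcome hinges on logit differences that are close to the representable resolution.

import numpy as np

# Near 3e6, the gap between adjacent float32 values is already 0.25.
print(np.spacing(np.float32(3e6)))  # -> 0.25

# Even a numerically stabilized softmax (subtracting the max before
# exponentiating) reproduces the observed result; the precision is
# already lost in the logits themselves.
def softmax(x):
    x = np.asarray(x, dtype=np.float32)
    e = np.exp(x - x.max())
    return e / e.sum()

print(softmax([-3056833.0, -3053894.5, -3058091.0,
               -3056993.2, -3053893.5, -3062269.5]))
# -> approximately [0.0, 0.269, 0.0, 0.0, 0.731, 0.0]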

Maybe the test input used here is a "stimulus" the model does not "experience" during training? Is this input vector part of the training set or the validation set? Do you get good results on the validation set (loss similarly low as for the training set) during training?

MartinTum commented 2 months ago

Hi again :) Thanks for trying these things out. Regarding the samples: we actually don't know whether the example is from the training or the validation sample; we just know that it comes from the sample from which we took the subsamples for training and validation. During training, the loss and AUROC are both better for the validation sample than for the training sample (due to the dropout layers, I suppose), so that should be well behaved.

Based on your investigations, we added a batch normalization layer right before the softmax, and at least after ~700 epochs there are no substantial deviations.
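In Keras terms, the change amounts to roughly the following (a sketch with the same placeholder hyperparameters as before, not our exact training script):

import tensorflow as tf

# Sketch of the fix: a BatchNormalization layer between the final Dense
# layer and the softmax keeps the logits in a numerically safe range.
inputs = tf.keras.Input(shape=(49,))
x = inputs
for _ in range(2):
    x = tf.keras.layers.Dense(512)(x)
    x = tf.keras.layers.PReLU()(x)
    x = tf.keras.layers.Dropout(0.2)(x)
x = tf.keras.layers.Dense(6)(x)
x = tf.keras.layers.BatchNormalization()(x)
outputs = tf.keras.layers.Softmax()(x)
model = tf.keras.Model(inputs, outputs)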

So thanks again, cheers Martin

Dobiasd commented 2 months ago

Nice! 🎉