Dobiasd / frugally-deep

A lightweight header-only library for using Keras (TensorFlow) models in C++.
MIT License

Cannot load InceptionV3 model #175

Closed. Terminou closed this issue 5 years ago.

Terminou commented 5 years ago

So, I successfully loaded some models and ran predictions with them.

Yet, when I tried to load the InceptionV3 model, I got an error. There were no errors when I converted the model from 'h5' to 'json', but the code below does not work.

[screenshot: the C++ code]

The error I got:

[screenshot: the error dialog]

Dobiasd commented 5 years ago

Can you please show the actual error? There is a message (msg) tied to the thrown exception.

Terminou commented 5 years ago

[screenshot] Do you mean this one?

Dobiasd commented 5 years ago

No, that's not yet the content of the msg variable.

You can hover over the variable with your mouse, or inspect its content in the debugger window (local variables).

Another option would be to look into the stack trace and check the origin of the call to raise_error.

Or, completely different, use the release mode instead of the debug mode, and let the exception bubble up the call stack, such that it is displayed in your console window.
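
Or catch the exception yourself and print its message. A minimal sketch (the file name is just a placeholder):

#include "fdeep/fdeep.hpp"
#include <iostream>

int main()
{
    try
    {
        const auto model = fdeep::load_model("vehicle.json");
    }
    catch (const std::exception& e) // frugally-deep throws std::runtime_error
    {
        std::cerr << "Error: " << e.what() << std::endl;
        return 1;
    }
}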

See, that's exactly the reason why I refused to do free consulting in our email conversation. Those are very basic C++ skills that I'd rather avoid spending time teaching. :wink:

Terminou commented 5 years ago

I handled the problem discussed in the email chat by reading the README.md from the beginning. I admit that I have studied Java, C, Python, etc.; this is almost my first time with C++.

By the way, I already use release mode; here is my call stack.

[screenshots: the call stack]

Dobiasd commented 5 years ago

You might be in release mode, but you are still running your application with the debugger attached. Otherwise, execution would not pause at the throw, and you would not see the call stack in your IDE.

But at least we now see the origin of the error. :tada:

Can you please upload your h5 file, so I can try to reproduce the issue?

Dobiasd commented 5 years ago

OK, so I converted the model you just sent me:

python3 convert_model.py vehicle.h5 vehicle.json

and tried to load it in C++:

#include "fdeep/fdeep.hpp"
int main()
{
    fdeep::load_model("vehicle.json");
}

g++ -std=c++14 -O3 ./main.cpp -o fdeep_issue_175
./fdeep_issue_175

The output is:

Loading json ... done. elapsed time: 1.679266 s
Running test 1 of 1 ... terminate called after throwing an instance of 'std::runtime_error'
  what():  Invalid axis (7) for tensor concatenation.
Aborted

So yes, I can reproduce the problem, and now we know the actual error message: "Invalid axis (7) for tensor concatenation."

Can you share the Python code that stitches together the model's architecture? Axis 7 sounds a bit off.

Terminou commented 5 years ago

I do not have the Python code with me now; I am at home and need to go to the office to get it. I will send it tomorrow.

Dobiasd commented 5 years ago

OK, from debugging the C++ part, it seems like at some point there is an attempt to concatenate 4 tensors with the following shapes along axis 3:

Dobiasd commented 5 years ago

I just pushed a commit (https://github.com/Dobiasd/frugally-deep/commit/e4179390237418bc38ce3e77f9663a2187c0e8d8) and released a new version (https://github.com/Dobiasd/frugally-deep/releases/tag/v0.9.6-p0) that fixes the concatenation error.

But now there is a Floating point exception in convolve_im2col (val_cnt % (out_height * out_width)), because out_height == 0 and out_width == 0.
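
Side note: an integer modulo by zero like this is undefined behavior in C++; on typical x86 systems it raises SIGFPE, which the OS reports as "Floating point exception" even though only integers are involved. A minimal illustration:

#include <iostream>

int main()
{
    const int out_height = 0;
    const int out_width = 0;
    const int val_cnt = 42;
    // Undefined behavior: modulo by zero; typically dies with SIGFPE.
    std::cout << (val_cnt % (out_height * out_width)) << std::endl;
}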

Let's see what the model looks like tomorrow. It seems not to be a standard InceptionV3 architecture. :slightly_smiling_face:

Terminou commented 5 years ago

I could not load the standard InceptionV3 model either. I know you have some test cases for it, but can you try loading the standard InceptionV3 model yourself?

Dobiasd commented 5 years ago

Ah! That is a very good remark, because this should always work.

So I just tested locally with the latest versions of frugally-deep (0.9.6), Keras (2.2.5) and TensorFlow (1.14.0).

import keras
model = keras.applications.inception_v3.InceptionV3(input_shape=(299, 299, 3))
model.save('inceptionv3.h5', include_optimizer=False)

python3 convert_model.py inceptionv3.h5 inceptionv3.json

#include "fdeep/fdeep.hpp"
int main()
{
    fdeep::load_model("inceptionv3.json");
}

g++ -std=c++14 -O3 ./main.cpp -o fdeep_issue_175_inceptionv3
./fdeep_issue_175_inceptionv3

Loading json ... done. elapsed time: 1.658766 s
Running test 1 of 1 ... done. elapsed time: 0.598047 s
Loading, constructing, testing of inceptionv3.json took 3.170117 s overall.

So it works here.

Thus I recommend we first try to get the default InceptionV3 (from Keras) running with frugally-deep on your machine. Hopefully, once this is fixed, your model will also work.

So, as a first step, please make sure that you are also using the latest official versions of the needed libraries, and then try with keras.applications.inception_v3.InceptionV3 again.

One way to find out which versions you are currently using is as follows (in Python):

import keras
import tensorflow as tf
print(keras.__version__)
print(tf.__version__)

Terminou commented 5 years ago

First, I tried to load InceptionV3 by following your instructions. It worked well on frugally-deep.

Then, I changed the input_shape as in the code fragment below.

m = keras.applications.inception_v3.InceptionV3(include_top=True, weights=None, input_tensor=None, input_shape=None, pooling=None, classes=1000)

The produced JSON did not work this time. I suppose it is related to the input_shape.

Dobiasd commented 5 years ago

I just tried to reproduce the problem, but did not succeed, i.e., it worked flawlessly.

This is what I did:

import keras
model = keras.applications.inception_v3.InceptionV3(include_top=True, weights=None, input_tensor=None, input_shape=None, pooling=None, classes=1000)
model.save('inceptionv3_none.h5', include_optimizer=False)

python3 convert_model.py inceptionv3_none.h5 inceptionv3_none.json

#include "fdeep/fdeep.hpp"
int main()
{
    fdeep::load_model("inceptionv3_none.json");
}

g++ -std=c++14 -O3 ./main.cpp -o fdeep_issue_175_inceptionv3_none
./fdeep_issue_175_inceptionv3_none

Loading json ... done. elapsed time: 1.685637 s
Running test 1 of 1 ... done. elapsed time: 0.596954 s
Loading, constructing, testing of inceptionv3_none.json took 3.213242 s overall.

Can you please check if this is exactly what you are doing too?

Also, can you please check if you are using frugally-deep version 0.9.6, Keras version 2.2.5 and TensorFlow version 1.14.0?

Terminou commented 5 years ago

Hello again! The problems I mentioned previously are solved. Now I have an unrelated question; I'd be glad if you could help me with that as well.

const auto res2 = model.predict({ input });

auto x = *res2[0].as_vector();
cout << "\n\nConfidence value = " << *max_element(x.begin(), x.end()) << "\n\n";

Above, the maximum confidence value is calculated. Yet I do not feel comfortable doing it this way, since I am finding the value manually via a vector. I know that the maximum value is calculated in your predict function, but there it is used to detect the label of the maximum value.

Do you have any function that works like that? If you do not, I highly suggest implementing one.

Dobiasd commented 5 years ago

the problems I mentioned previously are solved.

Could you please tell me how you solved them? This might be interesting not only for me, but also for other users with a similar problem who read this thread. Did you solve it by installing the latest versions of Keras and TensorFlow?


Do you have any function that works like that?

You mean fdeep::model::predict_class? :slightly_smiling_face:
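
For completeness, a minimal usage sketch (the model file name is a placeholder, and the dummy input only serves to illustrate the call):

#include "fdeep/fdeep.hpp"
#include <cstdint>
#include <iostream>
#include <vector>

int main()
{
    const auto model = fdeep::load_model("my_model.json");

    // Dummy gray 299x299 RGB image, scaled into the range [0, 1].
    const std::vector<std::uint8_t> pixels(299 * 299 * 3, 128);
    const auto input = fdeep::tensor5_from_bytes(pixels.data(), 299, 299, 3, 0.0f, 1.0f);

    // predict_class returns the index of the highest confidence value.
    const std::size_t class_index = model.predict_class({input});
    std::cout << "Predicted class: " << class_index << std::endl;
}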

Terminou commented 5 years ago

I loaded the model in Python and specified the input shape. I did not update or downgrade my current Keras or TF versions.

fdeep::model::predict_class finds the label of the prediction with the maximum confidence value. I need the confidence of that label.

For example, say you are testing a simple cat image. Assume you have a label list of animals (giraffe, monkey, dog, cat, etc.), with cat at index 3. When I use the predict_class function, it will return 3.

When I invoke model.predict({ input }) with the cat image, it will return the confidence value of each element in the label list, such as (0.0000, 0.0000, 0.0002, 0.9998). Here the cat has the maximum value. Finding this value is exactly what I want.

Note that I do not want to use the strategy below.

const auto res2 = model.predict({ input });

auto x = *res2[0].as_vector();
cout << "\n\nConfidence value = " << *max_element(x.begin(), x.end()) << "\n\n";

Dobiasd commented 5 years ago

With FunctionalPlus, we could make it a bit clearer:

const auto res2 = model.predict({ input });

auto x = *res2[0].as_vector();
cout << "\n\nConfidence value = " << fplus::maximum(x) << "\n\n";

But I guess you are looking for something more like the following:

const auto res2 = model.predict_class_with_confidence({ input });

cout << "Prediction = " << res2.first << std::endl;
cout << "Confidence = " << res2.second << std::endl;

with predict_class_with_confidence being declared like that, right?

std::pair<std::size_t, float_type> fdeep::model::predict_class_with_confidence(const tensor5s& inputs) const;

Terminou commented 5 years ago

Yes, that is exactly what I am looking for. 👍

Dobiasd commented 5 years ago

Nice idea. :+1: Here it is. :heavy_check_mark:
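
Roughly, it does something like the following (a simplified sketch, not the exact code in the library):

#include "fdeep/fdeep.hpp"
#include <algorithm>
#include <iterator>
#include <utility>

// Run predict, then return the index and the value of the highest
// confidence in the (single) output tensor.
std::pair<std::size_t, fdeep::float_type> predict_class_with_confidence_sketch(
    const fdeep::model& model, const fdeep::tensor5s& inputs)
{
    const auto confidences = *model.predict(inputs)[0].as_vector();
    const auto max_it = std::max_element(confidences.begin(), confidences.end());
    return std::make_pair(
        static_cast<std::size_t>(std::distance(confidences.begin(), max_it)),
        *max_it);
}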

Dobiasd commented 5 years ago

To come back once more to the initial problem:

the problems I mentioned previously are solved.

Could you please tell me how you solved them?

I loaded the model in Python and specified the input shape. I did not update or downgrade my current Keras or TF versions.

So we don't know why keras.applications.inception_v3.InceptionV3(include_top=True, weights=None, input_tensor=None, input_shape=None, pooling=None, classes=1000) poses a problem on your machine, but not on mine, right?

Terminou commented 5 years ago

Right. 🉑

Dobiasd commented 5 years ago

OK, but at least the important thing, i.e., your actual use-case, is working now. :tada:

Terminou commented 5 years ago

It does indeed. One more question: I ran predictions with a model 1000 times in a loop with frugally-deep. The average prediction time is roughly 1 second. However, it took 0.55 seconds in Python. Is that normal? Do you expect these results?

Dobiasd commented 5 years ago

Yes, TensorFlow might use the GPU or multiple CPU cores for one prediction. Frugally-deep does not do that. However, you can run multiple predictions in parallel. This will probably speed up your prediction throughput almost proportionally to the number of CPU cores in your machine.

Dobiasd commented 5 years ago

the problems I mentioned previously are solved.

Could you please tell me how you solved them?

I loaded the model in Python and specified the input shape. I did not update or downgrade my current Keras or TF versions.

So we don't know why keras.applications.inception_v3.InceptionV3(include_top=True, weights=None, input_tensor=None, input_shape=None, pooling=None, classes=1000) poses a problem on your machine, but not on mine, right?

Right.

OK, but I would appreciate it if you could at least quickly tell me the library versions you are using. Just run the following Python code on your machine and copy-paste the output:

import keras
import tensorflow as tf
print(keras.__version__)
print(tf.__version__)

If you are doing prediction-performance measurements to compare C++ and Python runtimes, check that you are actually measuring the same thing in both, and that one version does not have other time-consuming work inside the 1000-image loop, like loading images, etc.
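
For example, something along these lines times only the predict calls (a sketch; the model file name and input size are placeholders):

#include "fdeep/fdeep.hpp"
#include <chrono>
#include <cstdint>
#include <iostream>
#include <vector>

int main()
{
    const auto model = fdeep::load_model("my_model.json");

    // Prepare the input once, outside the timed loop.
    const std::vector<std::uint8_t> pixels(299 * 299 * 3, 128);
    const auto input = fdeep::tensor5_from_bytes(pixels.data(), 299, 299, 3, 0.0f, 1.0f);

    const std::size_t runs = 1000;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < runs; ++i)
    {
        const auto result = model.predict({input});
    }
    const auto stop = std::chrono::steady_clock::now();
    const double seconds = std::chrono::duration<double>(stop - start).count();
    std::cout << "Average prediction time: " << seconds / runs << " s" << std::endl;
}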


I hope the implementation of fdeep::model::predict_class_with_confidence is working for you.


Regarding parallel predictions for more throughput: You could try to use fplus::transform_parallelly for that purpose:

#include <fdeep/fdeep.hpp>
#include <opencv2/opencv.hpp>

fdeep::tensor5 load_and_convert_image(const std::string& path)
{
    // Load the image (OpenCV reads BGR) and convert it to RGB in place.
    // `image` must be non-const so cv::cvtColor can write to it.
    cv::Mat image = cv::imread(path);
    cv::cvtColor(image, image, cv::COLOR_BGR2RGB);
    // Scale the uint8 pixel values into the range [0, 1].
    return fdeep::tensor5_from_bytes(image.ptr(),
        static_cast<std::size_t>(image.rows),
        static_cast<std::size_t>(image.cols),
        static_cast<std::size_t>(image.channels()),
        0.0f, 1.0f);
}

int main()
{
    const auto model = fdeep::load_model("my_model.json");
    std::vector<std::string> image_paths = {
        "image_0001.jpg",
        "image_0002.jpg"
        // ...
    };

    // Distribute the per-image predictions over the available CPU cores.
    const auto results = fplus::transform_parallelly([&](const auto& image_path)
    {
        const auto input = load_and_convert_image(image_path);
        return model.predict_class_with_confidence({input});
    }, image_paths);
}