Dobiasd / frugally-deep

A lightweight header-only library for using Keras (TensorFlow) models in C++.
MIT License

Image to Image Regression problem #119

Closed martinuray closed 5 years ago

martinuray commented 5 years ago

I have a Python application in which I use a derivative of U-Net (image in, image out). When I try to run a prediction with frugally-deep, execution always terminates during prediction with:

terminate called after throwing an instance of 'std::runtime_error'
  what():  invalid output shape

My C++ code for importing the model is derived from the CImg example:

// Example code for how to:
// - load an image using CImg
// - convert it to a fdeep::tensor5
// - use it as input for a forward pass on an image classification model
// - print the class number

// compile with:
// g++ -std=c++14 -O3 cimg_example.cpp -L/usr/X11R6/lib -lm -lpthread -lX11 -o cimg_example -I /usr/local/include/eigen3

#include <fdeep/fdeep.hpp>

#include "CImg.h"

#include <cstdio>   // printf
#include <iostream> // std::cout
#include <vector>   // std::vector

fdeep::tensor5 cimg_to_tensor5(const cimg_library::CImg<unsigned char>& image,
    fdeep::float_type low = 0.0f, fdeep::float_type high = 1.0f)
{
    const int width = image.width();
    const int height = image.height();
    const int channels = image.spectrum();
    printf("Image Input Shape: %d %d %d\n", image.width(), image.height(), image.spectrum());

    std::vector<unsigned char> pixels;
    pixels.reserve(height * width * channels);

    // CImg stores the pixels of an image non-interleaved:
    // http://cimg.eu/reference/group__cimg__storage.html
    // This loop changes the order to interleaved,
    // e.g. RRRGGGBBB to RGBRGBRGB for 3-channel images.
    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++)
        {
            for (int c = 0; c < channels; c++)
            {
                pixels.push_back(image(x, y, 0, c));
            }
        }
    }

    return fdeep::tensor5_from_bytes(pixels.data(), height, width, channels,
        low, high);
}

int main()
{
    cimg_library::CImg<unsigned char> image;
    image.load_raw("image.raw");
    image.resize(256,256, 1);
    const auto model = fdeep::load_model("foo.json");
    const auto input = cimg_to_tensor5(image, 0.0f, 1.0f);
    const auto result = model.predict_single_output({input});
    std::cout << result << std::endl;
    return 0;
}
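
The reordering done by the nested loops in cimg_to_tensor5 can be looked at in isolation. The following is a minimal sketch that uses a plain std::vector in place of a CImg object. The helper name planar_to_interleaved is hypothetical and is not part of CImg or frugally-deep, and the planar layout (all values of channel 0, then all of channel 1, and so on) is assumed from the comment in the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of CImg or frugally-deep.
// Input layout (planar):       all of channel 0, then all of channel 1, ...
// Output layout (interleaved): all channels of pixel 0, then of pixel 1, ...
std::vector<unsigned char> planar_to_interleaved(
    const std::vector<unsigned char>& planar,
    std::size_t height, std::size_t width, std::size_t channels)
{
    assert(planar.size() == height * width * channels);

    std::vector<unsigned char> interleaved;
    interleaved.reserve(planar.size());

    for (std::size_t y = 0; y < height; ++y)
    {
        for (std::size_t x = 0; x < width; ++x)
        {
            for (std::size_t c = 0; c < channels; ++c)
            {
                // Each channel plane holds height * width values,
                // stored row by row.
                interleaved.push_back(planar[c * height * width + y * width + x]);
            }
        }
    }
    return interleaved;
}
```

For a single row of three RGB pixels, calling planar_to_interleaved(planar, 1, 3, 3) turns the order R R R G G G B B B into R G B R G B R G B.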

Looking into the code, I noticed an assertion that requires the volume of a prediction's output to be equal to one. Doesn't that interfere with a lot of regression tasks?
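
To make the observation concrete, here is a minimal sketch of the kind of check described. All names in it (toy_tensor, toy_predict, toy_predict_single_output) are hypothetical stand-ins and this is not frugally-deep's actual implementation. It only assumes, as stated above, that the single-output call insists on an output volume of one.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a tensor; NOT a frugally-deep type.
struct toy_tensor
{
    std::size_t height;
    std::size_t width;
    std::size_t depth;
    std::vector<float> values;

    std::size_t volume() const { return height * width * depth; }
};

// General variant: hands back every output tensor unchanged.
// The "model" here is just the identity (image in, image out).
std::vector<toy_tensor> toy_predict(const toy_tensor& input)
{
    return std::vector<toy_tensor>(1, input);
}

// Convenience variant: only meaningful when the model produces
// exactly one tensor that holds exactly one value.
float toy_predict_single_output(const toy_tensor& input)
{
    const std::vector<toy_tensor> outputs = toy_predict(input);
    if (outputs.size() != 1 || outputs.front().volume() != 1)
    {
        throw std::runtime_error("invalid output shape");
    }
    return outputs.front().values.front();
}
```

With a 256x256x1 input, toy_predict_single_output throws because the output volume is 65536, while toy_predict returns the full tensor. Under this assumption, image-to-image models need the general call, and the single-value call fits models that regress one scalar.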

Dobiasd commented 5 years ago

Hi Martin,

thanks for the clear report.

Please try fdeep::model::predict instead of fdeep::model::predict_single_output, i.e.:

int main()
{
    cimg_library::CImg<unsigned char> image;
    image.load_raw("image.raw");
    image.resize(256, 256, 1);
    const auto model = fdeep::load_model("foo.json");
    const auto input = cimg_to_tensor5(image, 0.0f, 1.0f);
    const auto result = model.predict({input});
    std::cout << fdeep::show_tensor5s(result) << std::endl;
}

martinuray commented 5 years ago

Hi, thanks. That solved my issue!