jolibrain / deepdetect

Deep Learning API and Server in C++14 with support for Caffe, PyTorch, TensorRT, Dlib, NCNN, Tensorflow, XGBoost and TSNE
https://www.deepdetect.com/

Absolute paths, std as int launch errors with torchlib (best -1 retrieves 0 results) #656

Closed YaYaB closed 5 years ago

YaYaB commented 5 years ago

Configuration

Your question / the problem you're facing:

I have seen some odd handling of certain inputs in requests for PyTorch models.

Error message (if any) / steps to reproduce the problem:

Let's do as #611 suggests. Download the model:

wget https://www.deepdetect.com/dd/examples/torch/resnet50_torch.tar.gz
tar xvf resnet50_torch.tar.gz

Run dede, then start the service:

curl -X PUT "http://localhost:8080/services/torch_resnet" -d '{
    "description": "image classification service",
    "mllib": "torch",
    "model": {
        "repository": "./resnet50_torch/"
    },
    "parameters": {
        "input": {
            "connector": "image"
        }
    },
    "type": "supervised"
}
'

It will fail with the following error:

{"status":{"code":500,"msg":"InternalError","dd_code":1007,"dd_msg":"open file failed, file path:  (FileAdapter at /home/yassine.bezza/deepdetect/build_torch/pytorch/src/pytorch/caffe2/serialize/file_adapter.cc:11)\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x66 (0x7fd092aa48f6 in /home/yassine.bezza/deepdetect/build_torch/pytorch/src/pytorch/torch/lib/libc10.so)\nframe #1: caffe2::serialize::FileAdapter::FileAdapter(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x222 (0x7fd094415192 in /home/yassine.bezza/deepdetect/build_torch/pytorch/src/pytorch/torch/lib/libcaffe2.so)\nframe #2: torch::jit::load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, c10::optional<c10::Device>, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&) + 0x40 (0x7fd096e4cd90 in /home/yassine.bezza/deepdetect/build_torch/pytorch/src/pytorch/torch/lib/libtorch.so.1)\nframe #3: dd::TorchLib<dd::ImgTorchInputFileConn, dd::SupervisedOutput, dd::TorchModel>::init_mllib(dd::APIData const&) + 0x1da (0x5af94a in ./dede)\nframe #4: dd::MLService<dd::TorchLib, dd::ImgTorchInputFileConn, dd::SupervisedOutput, dd::TorchModel>::init(dd::APIData const&) + 0x228 (0x590458 in ./dede)\nframe #5: dd::Services::add_service(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, 
mapbox::util::variant<dd::MLService<dd::CaffeLib, dd::ImgCaffeInputFileConn, dd::SupervisedOutput, dd::CaffeModel>, dd::MLService<dd::CaffeLib, dd::CSVCaffeInputFileConn, dd::SupervisedOutput, dd::CaffeModel>, dd::MLService<dd::CaffeLib, dd::CSVTSCaffeInputFileConn, dd::SupervisedOutput, dd::CaffeModel>, dd::MLService<dd::CaffeLib, dd::TxtCaffeInputFileConn, dd::SupervisedOutput, dd::CaffeModel>, dd::MLService<dd::CaffeLib, dd::SVMCaffeInputFileConn, dd::SupervisedOutput, dd::CaffeModel>, dd::MLService<dd::CaffeLib, dd::ImgCaffeInputFileConn, dd::UnsupervisedOutput, dd::CaffeModel>, dd::MLService<dd::CaffeLib, dd::CSVCaffeInputFileConn, dd::UnsupervisedOutput, dd::CaffeModel>, dd::MLService<dd::CaffeLib, dd::CSVTSCaffeInputFileConn, dd::UnsupervisedOutput, dd::CaffeModel>, dd::MLService<dd::CaffeLib, dd::TxtCaffeInputFileConn, dd::UnsupervisedOutput, dd::CaffeModel>, dd::MLService<dd::CaffeLib, dd::SVMCaffeInputFileConn, dd::UnsupervisedOutput, dd::CaffeModel>, dd::MLService<dd::TorchLib, dd::ImgTorchInputFileConn, dd::SupervisedOutput, dd::TorchModel>, dd::MLService<dd::TorchLib, dd::TxtTorchInputFileConn, dd::SupervisedOutput, dd::TorchModel> >&&, dd::APIData const&) + 0xce (0x5913ce in ./dede)\nframe #6: dd::JsonAPI::service_create(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xdad (0x5f5ebd in ./dede)\nframe #7: APIHandler::operator()(boost::network::http::basic_request<boost::network::http::tags::http_server> const&, boost::network::http::basic_response<boost::network::http::tags::http_server>&) + 0xf00 (0x562c80 in ./dede)\nframe #8: boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler>::handle_read_headers(boost::system::error_code const&, unsigned long) + 0xa93 (0x5645e3 in ./dede)\nframe #9: void 
boost::asio::detail::strand_service::dispatch<boost::asio::detail::binder2<boost::_bi::bind_t<void, boost::_mfi::mf2<void, boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler>, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<boost::shared_ptr<boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler> > >, boost::arg<1> (*)(), boost::arg<2> (*)()> >, boost::system::error_code, unsigned long> >(boost::asio::detail::strand_service::strand_impl*&, boost::asio::detail::binder2<boost::_bi::bind_t<void, boost::_mfi::mf2<void, boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler>, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<boost::shared_ptr<boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler> > >, boost::arg<1> (*)(), boost::arg<2> (*)()> >, boost::system::error_code, unsigned long>&) + 0x7a (0x55c03a in ./dede)\nframe #10: void boost::asio::detail::wrapped_handler<boost::asio::io_service::strand, boost::_bi::bind_t<void, boost::_mfi::mf2<void, boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler>, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<boost::shared_ptr<boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler> > >, boost::arg<1> (*)(), boost::arg<2> (*)()> >, boost::asio::detail::is_continuation_if_running>::operator()<boost::system::error_code, unsigned long>(boost::system::error_code const&, unsigned long const&) + 0x69 (0x55c2a9 in ./dede)\nframe #11: boost::asio::detail::completion_handler<boost::asio::detail::rewrapped_handler<boost::asio::detail::binder2<boost::asio::detail::wrapped_handler<boost::asio::io_service::strand, boost::_bi::bind_t<void, boost::_mfi::mf2<void, 
boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler>, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<boost::shared_ptr<boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler> > >, boost::arg<1> (*)(), boost::arg<2> (*)()> >, boost::asio::detail::is_continuation_if_running>, boost::system::error_code, unsigned long>, boost::_bi::bind_t<void, boost::_mfi::mf2<void, boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler>, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<boost::shared_ptr<boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler> > >, boost::arg<1> (*)(), boost::arg<2> (*)()> > > >::do_complete(boost::asio::detail::task_io_service*, boost::asio::detail::task_io_service_operation*, boost::system::error_code const&, unsigned long) + 0x13a (0x55c45a in ./dede)\nframe #12: void boost::asio::detail::strand_service::dispatch<boost::asio::detail::rewrapped_handler<boost::asio::detail::binder2<boost::asio::detail::wrapped_handler<boost::asio::io_service::strand, boost::_bi::bind_t<void, boost::_mfi::mf2<void, boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler>, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<boost::shared_ptr<boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler> > >, boost::arg<1> (*)(), boost::arg<2> (*)()> >, boost::asio::detail::is_continuation_if_running>, boost::system::error_code, unsigned long>, boost::_bi::bind_t<void, boost::_mfi::mf2<void, boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler>, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<boost::shared_ptr<boost::network::http::sync_connection<boost::network::http::tags::http_server, 
APIHandler> > >, boost::arg<1> (*)(), boost::arg<2> (*)()> > > >(boost::asio::detail::strand_service::strand_impl*&, boost::asio::detail::rewrapped_handler<boost::asio::detail::binder2<boost::asio::detail::wrapped_handler<boost::asio::io_service::strand, boost::_bi::bind_t<void, boost::_mfi::mf2<void, boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler>, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<boost::shared_ptr<boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler> > >, boost::arg<1> (*)(), boost::arg<2> (*)()> >, boost::asio::detail::is_continuation_if_running>, boost::system::error_code, unsigned long>, boost::_bi::bind_t<void, boost::_mfi::mf2<void, boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler>, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<boost::shared_ptr<boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler> > >, boost::arg<1> (*)(), boost::arg<2> (*)()> > >&) + 0x226 (0x55c756 in ./dede)\nframe #13: boost::asio::detail::reactive_socket_recv_op<boost::asio::mutable_buffers_1, boost::asio::detail::wrapped_handler<boost::asio::io_service::strand, boost::_bi::bind_t<void, boost::_mfi::mf2<void, boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler>, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<boost::shared_ptr<boost::network::http::sync_connection<boost::network::http::tags::http_server, APIHandler> > >, boost::arg<1> (*)(), boost::arg<2> (*)()> >, boost::asio::detail::is_continuation_if_running> >::do_complete(boost::asio::detail::task_io_service*, boost::asio::detail::task_io_service_operation*, boost::system::error_code const&, unsigned long) + 0x243 (0x55ca23 in ./dede)\nframe #14: 
boost::network::http::sync_server_base<boost::network::http::tags::http_server, APIHandler>::run() + 0x495 (0x5578c5 in ./dede)\nframe #15: <unknown function> + 0xb8c80 (0x7fd091a64c80 in /usr/lib/x86_64-linux-gnu/libstdc++.so.6)\nframe #16: <unknown function> + 0x76ba (0x7fd09a5c76ba in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #17: clone + 0x6d (0x7fd090fa841d in /lib/x86_64-linux-gnu/libc.so.6)\n"}}

However, it works well if I give the absolute path to the model.

Once the service is launched correctly, I have the same issue with the data to predict. Run predict to guess the class of an image:

curl -X POST "http://localhost:8080/predict" -d '{
    "service": "torch_resnet",
    "parameters": {
        "input": {
            "width":224,
            "height":224,
            "rgb":true,
            "std":255.0
        },
        "output": {
            "best":3
        }
    },
    "data":["./resnet50_torch/cat.jpg"]
}
'

It will fail with the following error if the path is not absolute:

{"status":{"code":400,"msg":"BadRequest","dd_code":1005,"dd_msg":"Service Input Error: /build/opencv-ys8xiq/opencv-2.4.9.1+dfsg/modules/imgproc/src/color.cpp:3648: error: (-215) scn == 3 || scn == 4 in function cvtColor\n"}}

When the path is absolute it works well and I get the following result:

{"status":{"code":200,"msg":"OK"},"head":{"method":"/predict","service":"torch_resnet","time":355.0},"body":{"predictions":[{"classes":[{"prob":0.3312813341617584,"cat":"n02123045 tabby, tabby cat"},{"prob":0.25223401188850405,"cat":"n02123597 Siamese cat, Siamese"},{"prob":0.23331482708454133,"last":true,"cat":"n02120505 grey fox, gray fox, Urocyon cinereoargenteus"}],"uri":"/home/yassine.bezza/resnet50_torch/cat.jpg"}]}}
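As a client-side workaround for the path behavior described above, local paths can be made absolute before the request is sent. This is a minimal sketch, not part of DeepDetect: `absolutize` and `build_predict_body` are hypothetical helper names, and it assumes the client and dede run on the same machine so that a path resolved against the client's working directory is also valid for the server. The same helper could be applied to the `repository` field of the service creation call.

```python
import json
import os


def absolutize(path: str) -> str:
    """Return an absolute path for a local file; leave URLs unchanged."""
    if "://" in path:  # e.g. http://... or https://...
        return path
    return os.path.abspath(os.path.expanduser(path))


def build_predict_body(service: str, data: list) -> str:
    """Build a /predict JSON body whose local data entries are absolute."""
    body = {
        "service": service,
        "parameters": {
            "input": {"width": 224, "height": 224, "rgb": True, "std": 255.0},
            "output": {"best": 3},
        },
        "data": [absolutize(p) for p in data],
    }
    return json.dumps(body)


if __name__ == "__main__":
    # The printed JSON can be passed to curl with -d, as in the requests above.
    print(build_predict_body("torch_resnet", ["./resnet50_torch/cat.jpg"]))
```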

Now, if I modify the request again and change, for instance, std from a float to an int, like the following:

curl -X POST "http://localhost:8080/predict" -d '{
    "service": "torch_resnet",
    "parameters": {
        "input": {
            "width":224,
            "height":224,
            "rgb":true,
            "std":255
        },
        "output": {
            "best":3
        }
    },
    "data":["./resnet50_torch/cat.jpg"]
}
'

It fails with the following error:

{"status":{"code":500,"msg":"InternalError","dd_code":1007,"dd_msg":"in get()"}}
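Until the server accepts both forms, a client can make sure std is always serialized as a JSON float. This is a small sketch under the assumption that the server distinguishes the JSON number 255 from 255.0, which is what the two requests above suggest but which is not confirmed here. `input_params` is a hypothetical helper name.

```python
import json


def input_params(width: int, height: int, std: float, rgb: bool = True) -> dict:
    """Build the "input" parameters, forcing std to a float."""
    return {
        "width": int(width),
        "height": int(height),
        "rgb": bool(rgb),
        "std": float(std),  # json.dumps writes a Python float with a decimal point
    }


if __name__ == "__main__":
    # Passing the int 255 still produces "std": 255.0 in the JSON text.
    print(json.dumps({"input": input_params(224, 224, 255)}))
```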


Finally, when everything else works and I change the best parameter, I can get problems as well.
- By setting it to -1 I get no error but empty predictions. I am concerned about this because I would rather have an error, or the same behavior as TensorRT or Caffe in the API, where -1 means get all results.

{"status":{"code":200,"msg":"OK"},"head":{"method":"/predict","service":"torch_resnet","time":125.0},"body":{"predictions":[{"classes":[],"uri":"/PATHTO/resnet50_torch/cat.jpg"}]}}

- By setting it to 0 I get the same thing. It would be nice to have an error message saying that best must be > 0 or equal to -1 (with -1 returning all results).
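The behavior requested in the two points above can be sketched as follows. This illustrates the proposed semantics only and is not DeepDetect's implementation; `validate_best` and `top_classes` are hypothetical names.

```python
def validate_best(best: int) -> int:
    """Accept best > 0 or best == -1; reject everything else."""
    if isinstance(best, bool) or not isinstance(best, int):
        raise TypeError("best must be an integer")
    if best == -1 or best > 0:
        return best
    raise ValueError("best must be > 0 or equal to -1 (-1 returns all results)")


def top_classes(probs: dict, best: int) -> list:
    """Return (category, probability) pairs sorted by decreasing probability."""
    best = validate_best(best)
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    return ranked if best == -1 else ranked[:best]


if __name__ == "__main__":
    scores = {"tabby": 0.33, "Siamese cat": 0.25, "grey fox": 0.23, "lynx": 0.05}
    print(top_classes(scores, 3))   # three highest-scoring classes
    print(top_classes(scores, -1))  # every class
```

With this rule, a request with best set to 0 would be rejected with an explicit message instead of silently returning an empty list.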

TL;DR:

beniz commented 5 years ago

Hi, thanks.

YaYaB commented 5 years ago

Hello, thanks for the quick reply.

beniz commented 5 years ago

* [here](https://pytorch.org/docs/stable/torchvision/models.html) you can see that the std vector is composed of floats

Note that these values assume input values are in [0,1] before normalization, which is not the case here, so they need to be scaled back by a factor of 255.
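The scaling mentioned above can be checked with a few lines of arithmetic. This sketch uses the ImageNet mean and std values listed on the linked torchvision page and compares normalizing a pixel in [0,1] with normalizing the raw pixel in [0,255] using values multiplied by 255. The function names are illustrative only.

```python
import math

# Per-channel values from the torchvision models page, valid for inputs in [0, 1].
MEAN_01 = (0.485, 0.456, 0.406)
STD_01 = (0.229, 0.224, 0.225)

# The same statistics expressed for raw pixel values in [0, 255].
MEAN_255 = tuple(m * 255.0 for m in MEAN_01)
STD_255 = tuple(s * 255.0 for s in STD_01)  # roughly (58.395, 57.12, 57.375)


def normalize_01(pixel: float, channel: int) -> float:
    """Scale the pixel to [0, 1] first, then normalize."""
    return (pixel / 255.0 - MEAN_01[channel]) / STD_01[channel]


def normalize_255(pixel: float, channel: int) -> float:
    """Normalize the raw pixel with the rescaled mean and std."""
    return (pixel - MEAN_255[channel]) / STD_255[channel]


if __name__ == "__main__":
    for channel in range(3):
        a = normalize_01(128.0, channel)
        b = normalize_255(128.0, channel)
        print(channel, a, b, math.isclose(a, b, rel_tol=1e-9))
```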

YaYaB commented 5 years ago

Yes, no problem with that. However, when you normalize using std, your int becomes a float anyway, so I don't see the point of keeping it as an int. Moreover, even in Caffe's deploy.prototxt you have float means ^^ Here https://github.com/jolibrain/deepdetect/blob/master/src/backends/torch/torchinputconns.h#L108 you divide the image by the std, obtaining an image of floats. Forcing std to int (meaning that you truncate or round the real value of the std) only makes you lose precision in your normalization.
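The precision loss can be estimated with a short calculation. This sketch assumes per-channel std values of 0.229, 0.224 and 0.225 (the torchvision ImageNet values) scaled by 255, and measures the relative error introduced when each is truncated to an int before the division. `relative_error` is a hypothetical helper.

```python
# Per-channel std for raw pixels in [0, 255]: the torchvision values times 255.
STD_255 = (0.229 * 255.0, 0.224 * 255.0, 0.225 * 255.0)


def relative_error(pixel: float, std: float) -> float:
    """Relative error of pixel / int(std) compared with pixel / std."""
    exact = pixel / std
    truncated = pixel / int(std)  # int() truncates toward zero
    return abs(truncated - exact) / abs(exact)


if __name__ == "__main__":
    for std in STD_255:
        print(f"std={std:.3f} int(std)={int(std)} "
              f"relative error={relative_error(200.0, std):.4%}")
```

The error is the same for every non-zero pixel value, since it only depends on the ratio std / int(std); for these values it stays below one percent per channel.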

beniz commented 5 years ago

See the changes in #611, which introduce scale and move std to a vector of float/double.

Closing for now as solved on our side. Thanks for digging into these issues!

YaYaB commented 4 years ago

It was #661 and not #611. Moreover, it is not yet merged, is it?