tensorlayer / HyperPose

Library for Fast and Flexible Human Pose Estimation
https://hyperpose.readthedocs.io

Pretrained models/newest_model.npz #314

Open sulgwyn opened 4 years ago

sulgwyn commented 4 years ago

Hi, I am unable to find the file newest_model.npz for the Resnet50 backbone architecture. Have the pretrained models been released? If so, where can I find the .npz file? If not, when can I expect the pretrained models to be released for inference?

stubbb commented 4 years ago

Is this the one you are looking for:

https://drive.google.com/drive/folders/1w9EjMkrjxOmMw3Rf6fXXkiv_ge7M99jR

Gyx-One commented 4 years ago

Hello! Thanks for using our library! Many models will be released soon, each in .npz, .pb and .onnx format. We have already trained and evaluated them, and they will be uploaded together with their evaluation accuracy metrics!

sulgwyn commented 4 years ago

Hi,

Looking forward to the release!

I tried to run the inference code using lw openpose model but I am getting this error: ValueError: Training / inference mode not defined. Argument is_train should be set as True / False. I noticed that the is_train argument is set to False in Model/../lw_openpose.py file. Can you tell me how to fix this?

Gyx-One commented 4 years ago

@sulgwyn sorry, I've been busy developing pifpaf for the past few weeks. I'll look into this problem soon :)

lengyuner commented 4 years ago

I tried to run infer.py with lightweight_openpose_resnet50.npz and set the parameters like this:

model_type ="LightweightOpenpose"
model_backbone ="Resnet50"
model_name ="default_name"

but it comes out:

\Hyperpose\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 1117, in assert_is_compatible_with
    raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (1, 1, 64, 256) and (64,) are incompatible

bhumikasinghrk commented 3 years ago

Hey @lengyuner ! I am also facing the same issue. Have you found any solution to this problem? I'll be very thankful if you could share the solution with me.

lengyuner commented 3 years ago

@bhumikasinghrk Because I am using this model on other datasets (smaller than COCO), I retrained the model from randomly initialized weights.

bhumikasinghrk commented 3 years ago

@lengyuner thanks for your quick response. I am not trying to train the model, just running infer.py on an image with

Config.set_model_type(Config.MODEL.LightweightOpenpose)
Config.set_model_backbone(Config.BACKBONE.Resnet50)

but it raises an error at line 15 of infer.py, model.load_weights(weight_path):

ValueError:Cannot assign to variable block_1_1_ds_conv1/filters:0 due to variable shape (1, 1, 64, 256) and value shape (64,) are incompatible
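
A shape error like this can be narrowed down by listing the array shapes stored in the weight file and comparing them with the shapes the model reports. The following is a minimal sketch, assuming the file is a NumPy .npz archive readable with numpy.load; the helper names list_npz_shapes and find_mismatches are hypothetical and are not part of HyperPose.

```python
import numpy as np


def list_npz_shapes(path):
    """Return {name: shape} for every array stored in an .npz archive."""
    shapes = {}
    # allow_pickle=True is an assumption: some archives store Python objects.
    with np.load(path, allow_pickle=True) as archive:
        for name in archive.files:
            shapes[name] = archive[name].shape
    return shapes


def find_mismatches(stored, expected):
    """Return {name: (stored_shape, expected_shape)} for names whose shapes differ."""
    return {
        name: (tuple(stored[name]), tuple(expected[name]))
        for name in expected
        if name in stored and tuple(stored[name]) != tuple(expected[name])
    }
```

Calling find_mismatches(list_npz_shapes("newest_model.npz"), expected), where expected maps variable names to the shapes the model expects, lists every variable whose stored shape disagrees with the model.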

lengyuner commented 3 years ago

@bhumikasinghrk If you just want to run inference on a picture, try the uff or onnx model. See https://hyperpose.readthedocs.io/en/latest/markdown/quick_start/prediction.html

bhumikasinghrk commented 3 years ago

@lengyuner thanks for replying. I tried the method given in the hyperpose documentation. I wanted to know whether we can get results by passing the onnx and uff models to infer.py, but it throws an error saying only .npz and .hdf5 files are accepted. Also, when I get results on a video or picture by following the hyperpose documentation, it only produces an output video or image. I expected another file, such as a txt or json file, containing the coordinates of the keypoints and detections in the video. Do you have any idea how to get that file?

lengyuner commented 3 years ago

@bhumikasinghrk

Try this if you have the onnx file (the document is written in Chinese; if you have difficulties, let me know):

https://www.jianshu.com/p/3a51f7d3357f

Then change the code for your purpose.

(You may need 'Netron' while changing the code.)

AndreyStille commented 3 years ago

Hi. I faced the same issue. Have you found a solution to load the weights properly?

lengyuner commented 3 years ago

@AndreyStille

haven't

orestis-z commented 3 years ago

Dear @Gyx-One & @ganler, we can't reproduce any of the paper's results since the .npz files are missing and we can't load the only .npz file that was made available (lightweight OpenPose) as there's some shape mismatch as described in https://github.com/tensorlayer/hyperpose/issues/314#issuecomment-720447489 and https://github.com/tensorlayer/hyperpose/issues/348

Gyx-One commented 3 years ago

Hello @orestis-z, sorry for the late response. I've checked the issues you mentioned and realized that the shape mismatch of the .npz file may be due to a past change to the model channel format during one update. I'm now checking all the .npz files for shape mismatches. All the speed performance results were obtained with the C++ inference engine, which takes in the .onnx file and should not have the shape mismatch problem. To reproduce the performance results, build and load the .onnx model using the C++ inference engine. Notes:

  1. The .npz file is produced and can be loaded by the Python code, while the .onnx file and the .pb file are converted from the .npz file and should be loaded by the C++ inference code. The Python code is used to produce the model and run the accuracy test, while the .onnx file is used by the high-speed C++ inference engine to run the speed test.
  2. infer.py is only meant for checking the model inference result in Python and should not be used for performance tests, and the APIs used in infer.py are going to be updated. :)

The .npz, .pb and .onnx files of the openpose model have already been sent to @ganler to upload to Google Drive, and the shape mismatch problem will be fixed as soon as possible. Thanks

orestis-z commented 3 years ago

Thanks for your quick reply @Gyx-One

I've evaluated the C++ OpenPose version on COCO and achieved an accuracy of only ~30 AP. In addition, I've edited the .onnx graph to accept images at the resolution of the COCO dataset, which increased the accuracy to ~42 AP. This version, however, runs slower than the original OpenPose implementation from CMU. With the same settings, I achieve ~60 AP with the original OpenPose from CMU.

Given this, I wanted to check if something is wrong with my evaluation script or if the Python version with 1 scale and small input resolution would lead to the same result.

Given your provided weights, I can't reproduce the accuracy that you've reported for OpenPose. Neither with the C++ nor with Python version.

I believe it would be fair to provide the .npz weights for all reported methods, so that we can reproduce them.

If we can reproduce them, I'll be very thankful for your great work and the amazing results, as you'd have engineered a system that can run in real-time and still perform well.

Gyx-One commented 3 years ago

Hello @orestis-z , Thank you for your work!

  1. Accuracy results: the accuracy we report in the paper is obtained in the Python code using "eval.py". Its process differs from the C++ inference code, which uses the inference engine directly. In the Python code we follow the evaluation implementation of the official lightweight openpose, which first pads and scales the image to the standard input size and then uses multi-scale search over [0.5, 1.0, 1.5, 2.0], while the inference engine just resizes the image to the model input size. I think that is the main reason for the 30-40 AP obtained with the inference code.
    I had assumed that openpose, like lightweight openpose, is evaluated with multi-scale search, but if openpose achieves ~60 AP in a single pass, I should convert the original openpose model weights to check our post-processing procedure, and compare the training process between hyperpose and openpose to find the reason.
  2. Performance results: the performance result is obtained using the .onnx model with a fixed input shape; we take images in a fixed shape of 368 x 656.
    After fixing the shape mismatch problem, I'll upload all the .npz models with the correct shapes so that the results can be reproduced using eval.py.
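
The difference between the two preprocessing paths described above can be sketched as follows. This is an illustration only, not the code of eval.py or of the inference engine: the function names are hypothetical, the 368 x 656 target is borrowed from the fixed input shape mentioned above, and applying the scale factors to the target size is an assumption about how the multi-scale search chooses its input sizes.

```python
SCALES = [0.5, 1.0, 1.5, 2.0]  # multi-scale search factors quoted above


def letterbox(h, w, target_h, target_w):
    """Scale (h, w) to fit inside the target with one factor, then pad the rest."""
    scale = min(target_h / h, target_w / w)
    new_h, new_w = round(h * scale), round(w * scale)
    padding = (target_h - new_h, target_w - new_w)
    return scale, (new_h, new_w), padding


def plain_resize(h, w, target_h, target_w):
    """Return the separate (vertical, horizontal) factors of a direct resize."""
    return target_h / h, target_w / w


def multi_scale_inputs(h, w, target_h, target_w, scales=SCALES):
    """Letterbox the image once per search scale (assumed sizing rule)."""
    return [
        letterbox(h, w, round(target_h * s), round(target_w * s)) for s in scales
    ]
```

For a 480 x 640 image, the plain resize uses different vertical and horizontal factors, so it distorts the aspect ratio, whereas the letterbox keeps one factor and pads the remaining columns.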

Gyx-One commented 3 years ago

Clicked the wrong place accidentally, sorry.

orestis-z commented 3 years ago

@Gyx-One thanks for elaborating. I'm going to open a PR in the next few days with some C++ python bindings so you can also reproduce the results.

Gyx-One commented 3 years ago

@orestis-z Sure! Thanks again for your contribution! :) Sorry for the late response; I've been handling other deadlines this week. The shape mismatch problem has been fixed for several models, and the fix should be finished and merged next week.

Gyx-One commented 3 years ago

Hello! @orestis-z The newest merged commit fixes the shape mismatch issue. It also provides a python_demo.py demonstrating how to load the pretrained model weights and how to use the modularized processors.

I have uploaded a series of npz_dict model weights to Google Drive for download. The npz_dict model weights are fully tested to make sure they can be successfully loaded and run by eval.py and python_demo.py.

The evaluation command lines and results follow. Notes:

  1. Remember to put the model weight file at the path ./save_dir/$model_name$/model_dir/newest_model.npz :)
  2. The small accuracy difference (~1 AP) between the results below and the table on the homepage may be due to fine-tuning of the post-processing hyperparameters (here I just use the default hyperparameter settings).
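
Placing a downloaded weight file at the path from note 1 can be scripted. This is a minimal sketch using only the Python standard library; install_weights is a hypothetical helper, not part of HyperPose, and it assumes the default ./save_dir root shown in the note.

```python
import shutil
from pathlib import Path


def install_weights(weight_file, model_name, save_dir="./save_dir"):
    """Copy a weight file to <save_dir>/<model_name>/model_dir/newest_model.npz."""
    target = Path(save_dir) / model_name / "model_dir" / "newest_model.npz"
    # Create the intermediate directories if they do not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(weight_file, target)
    return target
```

For example, install_weights("openpose.npz", "new_opps") copies the file to ./save_dir/new_opps/model_dir/newest_model.npz, matching --model_name=new_opps in the first command.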

1. model_name: new_opps
   model weight file name: openpose.npz
   evaluation command line: CUDA_VISIBLE_DEVICES=0 python eval.py --model_name=new_opps --model_type=Openpose --dataset_type=MSCOCO --dataset_version=2014 --eval_num=5000 --vis_num=50
   evaluation results (5000 images of 2014 mscoco val): [image]
   Notes: We do evaluation over mscoco 2014 dataset to follow the original openpose paper setting.

2. model_name: new_lopps
   model weight file name: lightweight_openpose.npz
   evaluation command line: CUDA_VISIBLE_DEVICES=0 python eval.py --model_name=new_lopps --model_type=LightweightOpenpose --dataset_type=MSCOCO --vis_num=50
   evaluation results (all 5000 images of 2017 mscoco val): [image]

3. model_name: new_lopps_resnet50
   model weight file name: lightweight_openpose_resnet50.npz
   evaluation command line: CUDA_VISIBLE_DEVICES=0 python eval.py --model_name=new_lopps_resnet50 --model_type=LightweightOpenpose --model_backbone=Resnet50 --dataset_type=MSCOCO --vis_num=50
   evaluation results (all 5000 images of 2017 mscoco val): [image]

4. model_name: new_lopps_vggtiny
   model weight file name: lightweight_openpose_vggtiny.npz
   evaluation command line: CUDA_VISIBLE_DEVICES=0 python eval.py --model_name=new_lopps_vggtiny --model_type=LightweightOpenpose --model_backbone=Vggtiny --dataset_type=MSCOCO --vis_num=50
   evaluation results (all 5000 images of 2017 mscoco val): [image]

5. model_name: new_mbopps
   model weight file name: lightweight_openpose_mobilenetthin.npz
   evaluation command line: CUDA_VISIBLE_DEVICES=0 python eval.py --model_name=new_mbopps --model_type=MobilenetThinOpenpose --dataset_type=MSCOCO --vis_num=50
   evaluation results (all 5000 images of 2017 mscoco val): [image]

In the next few days, I will try to convert the original openpose weights to the npz_dict format and figure out the difference between our training procedure and the original openpose training procedure.

orestis-z commented 2 years ago

@Gyx-One Thanks for the update!

I didn't open an MR as I didn't have time to familiarize myself with your build system. But please find the Python wrapper for hyperpose below:

// Adapted from https://github.com/tensorlayer/hyperpose/blob/master/examples/cli.cpp
#include <pybind11/numpy.h>
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>
#include <algorithm>    // std::equal
#include <array>
#include <cassert>
#include <hyperpose/hyperpose.hpp>
#include <hyperpose/logging.hpp>
#include <iostream>     // std::cout
#include <ostream>
#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>      // std::move
#include <variant>
#include <vector>
#include "opencv2/core/mat.hpp"

namespace py = pybind11;

inline constexpr auto log_hp = []() -> std::ostream& {
  std::cout << "[HyperPose] ";
  return std::cout;
};

class ParserVariant {
 public:
  using var_t = std::variant<hyperpose::parser::pose_proposal, hyperpose::parser::paf, hyperpose::parser::pifpaf>;

  ParserVariant(var_t v) : parser_(std::move(v)) {}

  template<typename Container>
  const std::vector<hyperpose::human_t> process(Container&& featureMapContainers) {
    return std::visit([&featureMapContainers](auto& arg) { return arg.process(featureMapContainers); }, parser_);
  }

 private:
  var_t parser_;
};

class HyperPose {
  const bool keepRatio_;
  hyperpose::dnn::tensorrt engine_;
  ParserVariant parser_;

  static constexpr int MAX_BATCH_SIZE = 1;

  static hyperpose::dnn::tensorrt getEngine(const std::string& modelPath,
                                            const cv::Size& networkResolution,
                                            const bool keepRatio,
                                            const bool enableLogging) {
    if (enableLogging) hyperpose::enable_logging();

    hyperpose::info("Model: ", modelPath, "\n");

    constexpr std::string_view ONNX_SUFFIX = ".onnx";
    constexpr std::string_view UFF_SUFFIX = ".uff";

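    // Check the length first: std::equal would otherwise read past the start of a path shorter than the suffix.
    if (modelPath.size() >= ONNX_SUFFIX.size() &&
        std::equal(ONNX_SUFFIX.crbegin(), ONNX_SUFFIX.crend(), modelPath.crbegin()))
      return hyperpose::dnn::tensorrt(hyperpose::dnn::onnx{modelPath}, networkResolution, MAX_BATCH_SIZE, keepRatio);

    if (modelPath.size() >= UFF_SUFFIX.size() &&
        std::equal(UFF_SUFFIX.crbegin(), UFF_SUFFIX.crend(), modelPath.crbegin())) {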
      hyperpose::warning(
          "For .uff model, the program only takes 'image' as input node, and "
          "'outputs/conf,outputs/paf' as output nodes.\n");
      return hyperpose::dnn::tensorrt(hyperpose::dnn::uff{modelPath, "image", {"outputs/conf", "outputs/paf"}},
                                      networkResolution,
                                      MAX_BATCH_SIZE,
                                      keepRatio);
    }

    hyperpose::warning("Your model file's suffix is not [.onnx | .uff]. Your model file path: ", modelPath, "\n");
    hyperpose::warning("We assume this is a serialized TensorRT model, and we'll evaluate it in this way.\n");

    return hyperpose::dnn::tensorrt(hyperpose::dnn::tensorrt_serialized{modelPath}, networkResolution, MAX_BATCH_SIZE, keepRatio);
  }

  static ParserVariant::var_t getParser(const std::string& postProcessingMethod, const cv::Size& inputSize) {
    if (postProcessingMethod == "paf") return hyperpose::parser::paf{};
    if (postProcessingMethod == "ppn") return hyperpose::parser::pose_proposal(inputSize);
    if (postProcessingMethod == "pifpaf") return hyperpose::parser::pifpaf(inputSize.height, inputSize.width);

    throw std::invalid_argument("Unknown post-processing method '" + postProcessingMethod +
                                "'. Use 'paf', 'ppn' or 'pifpaf'.");
  }

 public:
  HyperPose(const std::string& modelPath,
            const cv::Size& networkResolution,
            const bool keepRatio = true,
            const std::string& postProcessingMethod = "paf",
            const bool enableLogging = false) :
      keepRatio_(keepRatio),
      engine_{getEngine(modelPath, networkResolution, keepRatio, enableLogging)},
      parser_{getParser(postProcessingMethod, engine_.input_size())} {}

  HyperPose(const py::object& config) :
      HyperPose(config.attr("model_path").cast<std::string>(),
                {config.attr("network_resolution").attr("__getitem__")("width").cast<int>(),
                 config.attr("network_resolution").attr("__getitem__")("height").cast<int>()},
                config.attr("keep_ratio").cast<bool>(),
                config.attr("post_processing_method").cast<std::string>(),
                config.attr("enable_logging").cast<bool>()) {}

  const std::vector<hyperpose::human_t> infer(const cv::Mat& mat) {
    // * TensorRT Inference.
    const std::vector featureMaps = engine_.inference({mat});
    assert(featureMaps.size() == 1);

    // * Post-Processing.
    std::vector poses = parser_.process(featureMaps[0]);

    for (auto&& pose : poses) {
      if (keepRatio_) hyperpose::resume_ratio(pose, mat.size(), engine_.input_size());
      pose.score /= 100;  // convert from percentage to fraction format
    }

    return poses;
  }

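  const std::vector<hyperpose::human_t> infer(const py::array_t<uint8_t>& mat) {
    // The cv::Mat below wraps the buffer as 3-channel 8-bit data, so reject anything else.
    if (mat.ndim() != 3 || mat.shape(2) != 3)
      throw std::invalid_argument("Expected an image array of shape (height, width, 3).");

    const auto rows = static_cast<int>(mat.shape(0));
    const auto cols = static_cast<int>(mat.shape(1));
    const auto type = CV_8UC3;

    // Wraps the NumPy buffer without copying; the array must be C-contiguous.
    const cv::Mat cvMat(rows, cols, type, const_cast<unsigned char*>(mat.data()));

    return infer(cvMat);
  }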
};

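// The module name must match the name of the built library (CMake target hyperpose_wrapper), otherwise the import fails.
PYBIND11_MODULE(hyperpose_wrapper, module) {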
  py::class_<HyperPose>(module, "HyperPose")
      .def(py::init<const py::object&>())
      .def("infer", py::overload_cast<const py::array_t<uint8_t>&>(&HyperPose::infer));

  py::class_<hyperpose::human_t>(module, "human_t")
      .def_readwrite("score", &hyperpose::human_t::score)
      .def_readwrite("parts", &hyperpose::human_t::parts);

  py::class_<hyperpose::body_part_t>(module, "body_part_t")
      .def_readwrite("has_value", &hyperpose::body_part_t::has_value)
      .def_readwrite("x", &hyperpose::body_part_t::x)
      .def_readwrite("y", &hyperpose::body_part_t::y)
      .def_readwrite("score", &hyperpose::body_part_t::score);
}

And the CMake file:

# CMAKE_CXX_STANDARD 17 requires CMake 3.8 or newer.
cmake_minimum_required(VERSION 3.8)
project(hyperpose_wrapper)

set(CMAKE_CXX_STANDARD 17)
set(Python_DISTLIB /usr/local/lib/python3.6/dist-packages/)
set(pybind11_DIR ${Python_DISTLIB}/pybind11/share/cmake/pybind11/)

find_package(
  OpenCV
  REQUIRED
)
find_package(
  pybind11
  CONFIG
  REQUIRED
)

include_directories(
  path/to/hyperpose/include
  ${Python_DISTLIB}/pybind11/include
)

pybind11_add_module(
  hyperpose_wrapper
  path/to/src/hyperpose_wrapper.cpp
)
target_link_libraries(
  hyperpose_wrapper
  PRIVATE path/to/hyperpose/lib/libhyperpose.so
          ${OpenCV_LIBS}
)

set_target_properties(
  hyperpose_wrapper
  PROPERTIES PREFIX
             ""
             LIBRARY_OUTPUT_DIRECTORY
             <shared lib output path>
)

Pybind11: you can install it with pip
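
On the Python side, the wrapper expects a config object with the attributes read in the constructor above (model_path, network_resolution, keep_ratio, post_processing_method, enable_logging), and infer returns objects exposing score and parts. The following is a minimal sketch of building such a config and dumping the detected keypoints to JSON. make_config and humans_to_json are hypothetical helpers, and the import name in the comments is an assumption that depends on how the module was built.

```python
import json
from types import SimpleNamespace


def make_config(model_path, width, height):
    """Build an object with the attributes the wrapper's constructor reads."""
    return SimpleNamespace(
        model_path=model_path,
        network_resolution={"width": width, "height": height},
        keep_ratio=True,
        post_processing_method="paf",
        enable_logging=False,
    )


def humans_to_json(humans):
    """Serialize detections to JSON, keeping only parts flagged as detected."""
    records = []
    for human in humans:
        parts = [
            {"index": i, "x": p.x, "y": p.y, "score": p.score}
            for i, p in enumerate(human.parts)
            if p.has_value
        ]
        records.append({"score": human.score, "parts": parts})
    return json.dumps(records)


# Usage sketch (requires the compiled module; the import name is an assumption):
# import hyperpose_wrapper
# pose = hyperpose_wrapper.HyperPose(make_config("path/to/model.onnx", 656, 368))
# print(humans_to_json(pose.infer(image)))  # image: (height, width, 3) uint8 array
```

humans_to_json only relies on the attribute names exposed by the bindings, so it can be checked with stand-in objects before the extension is built.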