jiahaoLjh / HumanDepth

Code for "HDNet: Human Depth Estimation for Multi-Person Camera-Space Localization"
MIT License

Difference from the previous paper? #3

Open YangJae96 opened 4 years ago

YangJae96 commented 4 years ago

Hello. Thank you for your great work!!

Compared to the ICCV 2019 paper by Moon et al., did you only change the RootNet part to your approach, HDNet, which estimates (X_R, Y_R, Z_R)?

Are the detection part (Mask R-CNN), the root-relative 3D pose estimation (PoseNet), and the 3D pose visualization all the same as in the ICCV 2019 paper?

jiahaoLjh commented 4 years ago

@YangJae96,

Yes. The bounding box detector and the root-relative 3D pose estimator are the same as in the ICCV paper.

YangJae96 commented 4 years ago

Thanks!!

Is there demo code to obtain the roots from my own custom images?

[image: root]

I want to see the roots like this! Should I change the code?

jiahaoLjh commented 4 years ago

@YangJae96,

For inference on custom images, you will have to edit the data loader in data/dataset.py accordingly, so that it provides both the image and the bounding box. See #2.

Both the 2D pose and the root joint depth are produced as outputs of the model (https://github.com/jiahaoLjh/HumanDepth/blob/fba1c6669d09418b1a4bd648a9f4021821ca4037/test.py#L99-L100), which you may consider visualizing with your own code.
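For example, something along these lines could work for visualization (this is not code from this repo, just a minimal sketch; the helper names are mine, and it assumes the 2D root is in pixel coordinates and the depth is in millimetres):

```python
import cv2
import numpy as np

def backproject_root(root_xy, root_depth, fx, fy, cx, cy):
    # Back-project the 2D root joint (pixels) and its depth into camera space (X_R, Y_R, Z_R).
    x, y = root_xy
    X = (x - cx) / fx * root_depth
    Y = (y - cy) / fy * root_depth
    return np.array([X, Y, root_depth], dtype=np.float32)

def draw_root(image, root_xy, root_depth):
    # Mark the root joint and annotate its estimated depth (assumed in millimetres).
    x, y = int(round(root_xy[0])), int(round(root_xy[1]))
    cv2.circle(image, (x, y), 6, (0, 0, 255), -1)
    cv2.putText(image, "%.2f m" % (root_depth / 1000.0), (x + 8, y),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
    return image
```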

YangJae96 commented 4 years ago

@jiahaoLjh ,

Sorry, I could not understand the reason to modify dataset.py.

  1. Isn't dataset.py only for Human36M preprocessing? If I want to run inference on my own image (where I already have the human bounding boxes from Detectron2), could I just feed my image and bounding box into the model and get the 2D joints and root joint depth as outputs?

[image: model]

I checked the model's input part, but it's difficult to construct the inputs in general.

  1. Is bbox_mask the bounding box coordinates (x, y) of one person in an image?

  2. I can see that coord_map is the "normalized image coordinates, with the focal lengths fx, fy divided out of the original image coordinates". But if I don't know the focal length of an image, is there a way to construct this input to the model?

Thanks in advance!!

jiahaoLjh commented 4 years ago

@YangJae96

dataset.py is for preparing the input data for the model. You could edit this file to replace the Human36M data with your own image samples. To do that, you need to provide both the image and one bounding box at a time (one box per person in the multi-person case). bbox_masks is simply a binary mask indicating the region of a bounding box.
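As a concrete (but hypothetical) illustration of what such a mask could look like; the exact shape and bounding box convention in dataset.py may differ, so please check the code:

```python
import numpy as np

def make_bbox_mask(img_height, img_width, bbox):
    # Binary mask: 1 inside the bounding box (x, y, w, h), 0 elsewhere.
    x, y, w, h = [int(round(v)) for v in bbox]
    mask = np.zeros((img_height, img_width), dtype=np.float32)
    mask[max(y, 0):y + h, max(x, 0):x + w] = 1.0
    return mask
```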

If you don't know the focal length, you can simply set a reasonable one yourself. The coord_map is taken care of by dataset.py, so you don't have to prepare it yourself.
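If you do want to prepare a coord_map manually anyway, it is conceptually just the per-pixel normalized coordinates. A rough sketch (the exact layout used in dataset.py may differ):

```python
import numpy as np

def make_coord_map(img_height, img_width, fx, fy, cx=None, cy=None):
    # Per-pixel normalized image coordinates: (u - cx) / fx and (v - cy) / fy.
    # If the principal point is unknown, the image center is a common default.
    cx = img_width / 2.0 if cx is None else cx
    cy = img_height / 2.0 if cy is None else cy
    us, vs = np.meshgrid(np.arange(img_width), np.arange(img_height))
    return np.stack([(us - cx) / fx, (vs - cy) / fy], axis=0).astype(np.float32)

# With an unknown camera, fx = fy ≈ max(img_height, img_width) is a common guess
# (roughly a 53-degree field of view along the longer image side).
```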