Is it possible to predict 6D Pose from one single Image? I am a little confused

chensong1995 / HybridPose

HybridPose: 6D Object Pose Estimation under Hybrid Representation (CVPR 2020)

MIT License

412 stars 64 forks

Is it possible to predict 6D Pose from one single Image? I am a little confused #49

Open 1208overlord opened 3 years ago

1208overlord commented 3 years ago

Hello, I want to predict the pose of my object (a pole). Many methods rely on depth images or point cloud data, but unfortunately I cannot get that data; I only have single images captured by an ordinary camera. I tried 3D object detection first, but it failed several times. 6D pose estimation looks like it might work, so I would like to know whether it is possible to predict the 6D pose of my object (a pole) from a single image. If so, how can I train a model? I have more than 10,000 images. Thank you in advance.

chensong1995 commented 3 years ago

Hello 1208overlord,

Thank you for your interest in our work. Most (if not all) RGB-based 6D pose estimation pipelines assume that we have a precise 3D model of the object we are interested in, as well as the camera intrinsic matrix K. If these are not available in your problem setup, then you cannot easily apply HybridPose.
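To illustrate why both the 3D model and K are needed, here is a minimal sketch of the pinhole projection that links a model point to a pixel once a pose (R, t) is known. It is not taken from the HybridPose code: the function name `project_point` and all numeric values are hypothetical, chosen only for illustration.

```python
def project_point(K, R, t, X):
    """Project a 3D model point X (object frame) to pixel coordinates (u, v).

    K is a 3x3 intrinsic matrix, R a 3x3 rotation, t a 3-vector translation,
    all given as nested Python lists. Hypothetical helper for illustration.
    """
    # Move the point into the camera frame: X_cam = R @ X + t
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    if Xc[2] <= 0:
        raise ValueError("point is behind the camera")
    # Perspective division, then apply focal lengths, skew and principal point
    x, y = Xc[0] / Xc[2], Xc[1] / Xc[2]
    u = K[0][0] * x + K[0][1] * y + K[0][2]
    v = K[1][1] * y + K[1][2]
    return u, v


# Illustrative values only (not from any real camera or dataset)
K = [[600.0, 0.0, 320.0], [0.0, 600.0, 240.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # no rotation
t = [0.0, 0.0, 2.0]                                      # 2 units in front of the camera
u, v = project_point(K, R, t, [0.1, -0.05, 0.0])
print(u, v)
```

Pose estimation runs this relationship in reverse: given pixels and the matching model points, it solves for R and t, which is only well defined when the model geometry and K are known.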

If you do have that information, you will need to create labels on your images. That includes ground-truth object poses, segmentation masks, keypoints, and symmetry correspondences. I'd invite you to play around with our code on Linemod first to obtain a general idea of what each representation is like, and then try to label your own dataset.

I hope this helps.

1208overlord commented 3 years ago

@chensong1995 Thank you for your answer. I have a few more questions. I attached 2 images here. Both images contain a pole, but they are not the same pole; the poles are in different locations. In this case, the 3D models of the 2 poles are different.

Does HybridPose work in this case, too?
And if I don't know the camera intrinsic matrix, can I use a rough matrix, for example, image width for fx, image height for fy, etc.?
When building the training dataset, are images alone sufficient?
At prediction time, can I predict the 6D pose from a single image without any other information?
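To make the second question concrete, a rough matrix of this kind could be built as in the sketch below. The function name `approximate_intrinsics` and the 60-degree default field of view are assumptions, not values from HybridPose or from any calibration; square pixels and a principal point at the image centre are assumed as well.

```python
import math


def approximate_intrinsics(width, height, hfov_deg=60.0):
    """Build a rough 3x3 intrinsic matrix from the image size.

    Assumes square pixels (fx == fy), zero skew, and a principal point at
    the image centre. hfov_deg is a guessed horizontal field of view.
    """
    # Pinhole relation: tan(hfov / 2) = (width / 2) / f
    f = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    return [
        [f, 0.0, width / 2.0],
        [0.0, f, height / 2.0],
        [0.0, 0.0, 1.0],
    ]


K_rough = approximate_intrinsics(1920, 1080)
for row in K_rough:
    print(row)
```

Under the same pinhole relation, setting fx equal to the image width corresponds to assuming a horizontal field of view of about 53 degrees, so the quality of such a guess depends on how close the real lens is to that.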

I hope you can answer these questions. Thanks again in advance.

chensong1995 commented 3 years ago

Hello 1208overlord,

I am unable to see your images for some reason. Can you double-check if you have attached them?

1208overlord commented 3 years ago

[Nine images attached]

1208overlord commented 3 years ago

Sorry for sending them like this. Please see these images. I want to apply the method to the poles in these images. Is that possible? How can I build a dataset? I have only images.

1208overlord commented 3 years ago

Hello, how is it going? Will it be possible to get the 6D pose of this pole from only one image? If so, how can I train a model with only images?

chensong1995 commented 3 years ago

Hello 1208overlord,

You cannot apply 6D pose estimation to your problem. 6D pose estimation requires a precise 3D model of the object, which is not available here. There are some 3D detection papers on the Pascal 3D+ dataset (e.g. https://github.com/xingyizhou/StarMap), which may be helpful to you.

Let's think about this problem carefully. There is a clear rotational symmetry about the vertical axis, and the object is almost certainly sitting on a flat surface. So estimating the rotation matrix does not make much sense to me. We can try to estimate the object scale and offset in 3D. As a starting point, you can try constructing a model with Faster-RCNN, which first locates the RoI around the object, and then directly regresses the scale and offset. Since the object is rather simple, this method may work pretty well. In case it does not, we can discuss further once you have some initial results.
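The symmetry argument can be checked numerically. The sketch below idealises the pole as a perfect cylinder (an assumption) and samples its surface at evenly spaced angles; all function names are hypothetical. Rotating the samples about the vertical axis by one angular step gives back exactly the same set of points, so an image cannot distinguish the two orientations.

```python
import math


def cylinder_points(radius, height, n_angles=12, n_levels=3):
    """Sample points on the side of a cylinder whose axis is the y axis."""
    pts = []
    for k in range(n_levels):
        y = height * k / (n_levels - 1)
        for i in range(n_angles):
            a = 2.0 * math.pi * i / n_angles
            pts.append((radius * math.cos(a), y, radius * math.sin(a)))
    return pts


def rotate_about_vertical(points, angle):
    """Rotate points about the y (vertical) axis by the given angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in points]


def as_rounded_set(points, digits=6):
    """Round coordinates so that floating-point noise does not break set equality."""
    return {tuple(round(v, digits) for v in p) for p in points}


pts = cylinder_points(radius=0.15, height=10.0)
step = 2.0 * math.pi / 12  # one angular sampling step
same = as_rounded_set(rotate_about_vertical(pts, step)) == as_rounded_set(pts)
print("shape unchanged by rotation about the vertical axis:", same)
```

The discrete sampling only allows checking multiples of the step angle; for a continuous cylinder the same holds for every angle, which is why only scale and offset remain meaningful to estimate.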

I hope this helps.

1208overlord commented 3 years ago

My final aim is to measure the correct height of a utility pole. I tried 2D object detection and segmentation, but they did not account for the angle of the pole. That is why I tried to calculate the angle of the pole and use it to correct the measurement.
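As a rough sketch of the correction I have in mind: if the base and top of the pole are located in the image, the lean within the image plane and the full length follow from the two endpoints rather than from the vertical extent of a bounding box. This assumes a pinhole camera with square pixels, a known focal length in pixels, a known distance to the pole, and a pole lying in a plane parallel to the image plane. The function name and all numbers are hypothetical.

```python
import math


def pole_length_from_endpoints(base_px, top_px, depth_m, focal_px):
    """Estimate pole length (metres) and in-plane lean (degrees) from two pixels.

    base_px and top_px are (u, v) pixel coordinates; depth_m is the assumed
    distance from the camera to the pole; focal_px is the focal length in pixels.
    """
    du = top_px[0] - base_px[0]
    dv = base_px[1] - top_px[1]  # image v axis points down, so flip the sign
    pixel_length = math.hypot(du, dv)
    # Lean measured from the image vertical, in the image plane only
    lean_deg = math.degrees(math.atan2(abs(du), dv))
    # Pinhole scaling: metric size = pixel size * depth / focal length
    length_m = pixel_length * depth_m / focal_px
    return length_m, lean_deg


length_m, lean_deg = pole_length_from_endpoints(
    base_px=(300, 700), top_px=(600, 300), depth_m=10.0, focal_px=1000.0
)
vertical_only_m = (700 - 300) * 10.0 / 1000.0  # what a box height alone would give
print(length_m, lean_deg, vertical_only_m)
```

A lean toward or away from the camera cannot be recovered this way, because it changes depth along the pole and breaks the parallel-plane assumption.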

Could we perhaps talk over Skype so that you can help me?

chensong1995 commented 3 years ago

In this case, I don't think you need a pose estimation pipeline. You can try direct regression of the object height.

Please send me an email if you would like to set up an appointment and discuss your project in a private setting.

1208overlord commented 3 years ago

akulov.eugen@gmail.com