Personally, I collected my data synthetically (manually annotating poses on real images is troublesome, even in AR with toolkits like ARCore), using Blender to get the object poses and intrinsics. You can also wait for the original authors to release their OnePose Cap app if you're patient enough.
For dataset collection:
For (2), it seems that you want to do real-time inference. You can try ripping the pipeline out of inference.py or inference_demo.py and putting it inside a cv2.VideoCapture while loop.
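As an illustration, here is a minimal sketch of such a loop. The `my_onepose_wrapper` module and `estimate_pose` helper are hypothetical placeholders for whatever you rip out of inference.py; they are not part of this repo.

```python
import cv2

# Hypothetical helper wrapping the matching + PnP steps extracted from inference.py;
# assumed to return a 4x4 object pose matrix, or None if estimation fails.
from my_onepose_wrapper import estimate_pose

cap = cv2.VideoCapture(0)  # webcam; replace with a video file path if needed
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    pose = estimate_pose(frame)   # 4x4 homogeneous pose
    if pose is not None:
        print(pose)               # or draw the projected 3D box onto `frame`
    cv2.imshow("OnePose realtime", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```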
For (3), the paper obtains the object pose by running solvePnP on the matched 2D-3D correspondences. Just look for the line with eval_utils.ransac_PnP(...); I believe line 155 of inference.py gives you the object pose as pose_pred_homo.
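As a rough sketch of what that step does, here is the same idea using OpenCV's cv2.solvePnPRansac directly rather than the repo's eval_utils.ransac_PnP; the synthetic points below exist only to make the snippet self-contained.

```python
import cv2
import numpy as np

# Illustrative inputs: in OnePose these come from the 2D-3D matches produced by GATsSPG.
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])
pts_3d = np.random.rand(50, 3)                        # matched SfM points, shape (N, 3)
rvec_gt = np.array([0.1, -0.2, 0.05])
tvec_gt = np.array([0.0, 0.0, 2.0])
pts_2d, _ = cv2.projectPoints(pts_3d, rvec_gt, tvec_gt, K, None)
pts_2d = pts_2d.reshape(-1, 2)                        # corresponding keypoints, shape (N, 2)

ok, rvec, tvec, inliers = cv2.solvePnPRansac(pts_3d, pts_2d, K, distCoeffs=None)

R, _ = cv2.Rodrigues(rvec)                            # 3x3 rotation: the 3 orientation DoF
pose_homo = np.eye(4)
pose_homo[:3, :3] = R
pose_homo[:3, 3] = tvec.ravel()                       # translation: the 3 translational DoF
print(pose_homo)
```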
Edit: I must also add that there is no further need to train the GATsSPG model; it is ready to be used with your point-cloud 3D model.
(1) I want to train and test this on a custom dataset, but I couldn't find sufficient information on how to make a custom dataset myself. Can anyone who has done it before help me by giving some hints?
Hi, thanks for your interest in our work. For Q(1) and Q(2), please stay tuned for the code on custom datasets and online tracking. For Q(3), please refer to the reply from @siatheindochinese.
@siatheindochinese Can you please elaborate on how you collected the data synthetically? Any particular library you used? And where did you find the 3D assets?
@aditya1709 I generated all my images, poses, and re-projected bounding-box coordinates in Blender. It allows Python scripting, so a lot of the work can be automated.
For 3D models, you can either 3D-scan your objects (with a 3D scanner or other photogrammetry tools, e.g. NVIDIA MoMa) or model them manually yourself.
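For reference, here is a minimal bpy sketch of reading the per-frame camera pose and pinhole intrinsics inside Blender. It is only a sketch under simplifying assumptions (square pixels, centered principal point); the Blender-to-OpenCV axis flip and OnePose's exact file layout are left as comments.

```python
import bpy
import numpy as np

scene = bpy.context.scene
cam = scene.camera

# Camera-to-world transform in Blender's convention (camera looks down -Z, +Y up).
# For an OpenCV-style pose you still need to flip the camera's Y and Z axes.
cam2world = np.array(cam.matrix_world)

# Pinhole intrinsics from focal length / sensor width (square pixels assumed).
scale = scene.render.resolution_percentage / 100.0
w = scene.render.resolution_x * scale
h = scene.render.resolution_y * scale
fx = cam.data.lens / cam.data.sensor_width * w
K = np.array([[fx, 0.0, w / 2.0],
              [0.0, fx, h / 2.0],
              [0.0, 0.0, 1.0]])

np.savetxt("intrinsics.txt", K)
np.savetxt(f"pose_{scene.frame_current:04d}.txt", cam2world)
```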
Hi @siatheindochinese, thank you for your detailed response! :) I have one more quick question: how well does the model perform when it is trained only on the synthetic dataset and tested on real pictures?
@bibekyess as long as the synthetic images are photorealistic enough, the results should be sufficient.
I must reiterate that the synthetic images are only used to construct the SfM model, not used to train the GATsSPG.
You can check out my realtime result here: https://www.linkedin.com/posts/sia-zhen-hao_opencv-computervision-machinelearning-activity-6965918457116733440-IAXE
For the video above, I did not implement optical-flow tracking or object detection, so the result is a bit shaky. Object detection should help you crop out unnecessary SuperPoints; just use an off-the-shelf detector like YOLOv5. The feature-matching object detector provided in this repo is too slow for realtime inference.
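A minimal sketch of that cropping idea, assuming YOLOv5 loaded from torch.hub; the image path is a placeholder, and class filtering plus the exact hand-off to SuperPoint/OnePose are up to you.

```python
import cv2
import torch

# Off-the-shelf detector; any detector that returns a 2D box works the same way.
detector = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)

frame = cv2.imread("query.png")                   # placeholder; could be a cv2.VideoCapture frame
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)      # YOLOv5's wrapper expects RGB images
det = detector(rgb).xyxy[0]                       # (num_det, 6): x1, y1, x2, y2, conf, cls

if len(det) > 0:
    x1, y1, x2, y2 = det[0, :4].int().tolist()    # first detection (highest confidence)
    crop = frame[y1:y2, x1:x2]
    # Run SuperPoint / OnePose matching on `crop` only, then add (x1, y1) back to the
    # keypoint coordinates before PnP so they stay consistent with the camera intrinsics.
```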
@siatheindochinese Wow! Thanks for sharing the demo link and your explanation! I was wondering if you have tried OnePose to detect the 6D pose of multiple objects in the same frame? For example, I want to show 3D bounding boxes for 5 objects in a single video. Is that reasonably feasible with this framework?
@siatheindochinese Have you by any chance put the Blender scripts on your GitHub where I can take a look? They might be a good starting point for me.
@aditya1709 you can check out BlenderProc2; it is tailored to generating synthetic data for computer vision tasks. I would not recommend manually writing out the entire rendering pipeline in vanilla Blender.
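For example, a minimal BlenderProc2 sketch of rendering views around an object; the object path, light settings, and pose-sampling parameters below are placeholders, not values from this thread.

```python
import blenderproc as bproc  # must be imported first in a BlenderProc script
import numpy as np

bproc.init()

objs = bproc.loader.load_obj("my_object.obj")   # placeholder path to your 3D model

light = bproc.types.Light()
light.set_location([2.0, -2.0, 2.0])
light.set_energy(300)

bproc.camera.set_resolution(640, 480)

# Sample camera poses on a sphere around the object and render a few views.
poi = bproc.object.compute_poi(objs)
for _ in range(20):
    location = bproc.sampler.sphere(center=poi, radius=1.0, mode="SURFACE")
    rotation = bproc.camera.rotation_from_forward_vec(poi - location)
    bproc.camera.add_camera_pose(bproc.math.build_transformation_mat(location, rotation))

data = bproc.renderer.render()
bproc.writer.write_hdf5("output/", data)
# Intrinsics can be read back via bproc.camera.get_intrinsics_as_K_matrix().
```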
Hello @aditya1709, besides the folders you mentioned above, there is also a file called "box3dcorners.txt" for every object. What parameters does it contain?
Hi @bibekyess, did you get an answer about running it on multiple objects? I want to run it for multiple objects in a single frame as well. It would be super helpful to know if you tried it. Thanks!
Hello, thanks for the awesome work. I have a couple of questions:

(1) I want to train and test this on a custom dataset, but I couldn't find sufficient information on how to make a custom dataset myself. Can anyone who has done it before help me by giving some hints?

(2) I ran the code on the 'sample_data' and found that we first do pose estimation, save the results to one folder, and then run another script to do the visualization. Can't I do pose estimation and visualization concurrently? I saw that

"Demo pipeline for running OnePose with custom-captured data including the online tracking module."

will be updated soon, but I am wondering whether the code for online tracking is already available in this repo.

(3) When viewing the results, I get 3D bounding boxes but I couldn't see any orientation information. The paper describes 6D pose estimation, which means 3 translation and 3 orientation components, doesn't it? Does the existing code give information about orientation or not?

Thank you for your time and help!! 🙂
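For reference on question (3): the 4x4 pose returned by the PnP step (pose_pred_homo in inference.py) already contains the orientation as its top-left 3x3 rotation block. A minimal sketch of pulling out the translation and the orientation, with SciPy used purely for illustration and a placeholder identity pose standing in for the real prediction:

```python
import numpy as np
from scipy.spatial.transform import Rotation

# pose_pred_homo: 4x4 homogeneous object pose, e.g. as produced around line 155 of inference.py.
pose_pred_homo = np.eye(4)            # placeholder; use the real prediction here

R = pose_pred_homo[:3, :3]            # 3x3 rotation  -> the 3 orientation DoF
t = pose_pred_homo[:3, 3]             # translation   -> the 3 translational DoF

euler_xyz = Rotation.from_matrix(R).as_euler("xyz", degrees=True)
print("translation:", t)
print("orientation (deg):", euler_xyz)
```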