andrefdre closed this issue 1 year ago
While researching this, we found that we need the camera's intrinsic parameters. We tried reading them through Open3D, but it returned 0. We followed this documentation: http://www.open3d.org/docs/0.12.0/python_api/open3d.camera.PinholeCameraIntrinsic.html#open3d.camera.PinholeCameraIntrinsic.get_focal_length.
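As a fallback while the real values are unknown, here is a minimal sketch of building the pinhole intrinsic matrix by hand. The numbers are the commonly cited PrimeSense/Kinect v1 defaults (fx = fy = 525.0, cx = 319.5, cy = 239.5 for 640x480) and are an assumption, not values confirmed for this dataset:

```python
import numpy as np

# Assumed PrimeSense / Kinect v1 default intrinsics (NOT confirmed for this dataset)
fx, fy = 525.0, 525.0   # focal lengths in pixels
cx, cy = 319.5, 239.5   # principal point for a 640x480 image

# 3x3 pinhole intrinsic matrix K
K = np.array([
    [fx, 0.0, cx],
    [0.0, fy, cy],
    [0.0, 0.0, 1.0],
])

# With Open3D installed, the equivalent object can be built from its
# bundled defaults instead of hand-typed numbers:
# import open3d as o3d
# intr = o3d.camera.PinholeCameraIntrinsic(
#     o3d.camera.PinholeCameraIntrinsicParameters.PrimeSenseDefault)
# intr.get_focal_length()  # (fx, fy)
print(K)
```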
I found that MeshLab stores the camera values, and we are able to export them. However, using these values did not give the expected results, and since we don't have the physical camera, we can't ascertain whether the problem is in the code or in the stored values.
After contacting the professor, we found that our approach was correct, but we don't have the camera parameters. The teacher's suggestion was:
Hi @andrefdre ,
I have contacted the Washington RGB-D dataset people as well. Let's give them some time to answer.
In the meantime I think you can improve each of the separate modules, i.e., object detection from point clouds, and object classification from RGB images.
If all goes wrong you will present these two modules separately ... but I still have hope we can find a way.
So right now our next approach is to use a Kinect camera, for which we can get the camera parameters, and either use a CNN for object detection to isolate the objects in the scene images or simply take screenshots.
This is in standby while #28 is being worked upon.
Now that the messages were sent, we need to figure out why the bounding boxes are offset from the center point.
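One common cause of off-center boxes is projecting a 3D point with the intrinsics alone, without first applying the extrinsic (world-to-camera) transform. A minimal sketch of the full projection chain, using assumed PrimeSense-default intrinsics and a placeholder identity extrinsic (both are illustrative values, not the project's actual calibration):

```python
import numpy as np

def project_point(p_world, T_cam_world, K):
    """Project a 3D world point to pixel coordinates (u, v).

    p_world:      (3,) point in world/scene coordinates
    T_cam_world:  (4, 4) extrinsic matrix mapping world -> camera frame
    K:            (3, 3) pinhole intrinsic matrix
    """
    p_h = np.append(p_world, 1.0)      # homogeneous coordinates
    x, y, z = (T_cam_world @ p_h)[:3]  # transform into the camera frame first
    u = K[0, 0] * x / z + K[0, 2]      # then apply the pinhole projection
    v = K[1, 1] * y / z + K[1, 2]
    return u, v

# Assumed intrinsics (PrimeSense defaults) and a placeholder extrinsic
K = np.array([[525.0, 0.0, 319.5],
              [0.0, 525.0, 239.5],
              [0.0, 0.0, 1.0]])
T = np.eye(4)  # camera at the world origin, looking down +Z

# A point on the optical axis should land on the principal point
u, v = project_point(np.array([0.0, 0.0, 1.0]), T, K)
print(u, v)  # 319.5 239.5
```

If the boxes are consistently shifted by a fixed amount, checking whether the extrinsic step was skipped (or applied in the wrong direction) is a cheap first test.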
This was implemented, but the results for the network weren't promising, confirming the expectations mentioned in #2 and showing the need to add depth images to the model. The image below shows the results from one scene.
Now that we are able to process the point cloud and extract the objects, we need a way to locate them in the corresponding image of the scene.
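One way to locate an extracted object in the image is to project all of its 3D points with the intrinsics and take the min/max of the resulting pixel coordinates as a 2D bounding box. A sketch of that idea, assuming the object points are already expressed in the camera frame and using the (unconfirmed) PrimeSense-default intrinsics:

```python
import numpy as np

def bbox_from_points(points_cam, K):
    """Project object points (already in the camera frame) and return
    the enclosing 2D box as (u_min, v_min, u_max, v_max)."""
    pts = np.asarray(points_cam, dtype=float)
    u = K[0, 0] * pts[:, 0] / pts[:, 2] + K[0, 2]
    v = K[1, 1] * pts[:, 1] / pts[:, 2] + K[1, 2]
    return u.min(), v.min(), u.max(), v.max()

# Assumed intrinsics; toy object ~20 cm wide, 1 m in front of the camera
K = np.array([[525.0, 0.0, 319.5],
              [0.0, 525.0, 239.5],
              [0.0, 0.0, 1.0]])
obj = np.array([[-0.1, -0.1, 1.0],
                [0.1, 0.1, 1.0]])
print(bbox_from_points(obj, K))  # (267.0, 187.0, 372.0, 292.0)
```

The resulting box can then be used directly as a crop region for the RGB classifier, instead of relying on a separate 2D detector.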