andrefdre / Dora_the_mug_finder_SAVI

Dora The Mug Finder: Detection and classification of items placed on top of a table using point cloud processing and neural networks.
GNU General Public License v3.0

Finding object pose in the images to then send them to the model #22

Closed andrefdre closed 1 year ago

andrefdre commented 1 year ago

Now that we are able to process the point cloud and extract the objects, we need a way to locate them in the corresponding image of the scene.

andrefdre commented 1 year ago

While researching this, we found that we need the camera intrinsics. We tried reading the values through Open3D, but it returned 0. We used the following link as a reference: http://www.open3d.org/docs/0.12.0/python_api/open3d.camera.PinholeCameraIntrinsic.html#open3d.camera.PinholeCameraIntrinsic.get_focal_length.

I found that MeshLab has the camera values, and we are able to export them. But using these values didn't give the expected results, and since we don't have the camera, we can't tell whether the problem is in the code or in the stored values.
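
For reference, the projection we are trying to do can be sketched as below. This is a minimal pinhole-model sketch, not the code in the repository. `project_points` is a hypothetical helper, and the intrinsics are placeholder values often quoted as defaults for 640x480 Kinect-style sensors; they are not confirmed for this dataset. It also assumes the points are already expressed in the camera frame (z pointing forward), so a scene cloud stored in another frame would first need the camera pose applied.

```python
import numpy as np


def project_points(points_cam, fx, fy, cx, cy):
    """Project Nx3 camera-frame points (z forward) to pixel coordinates (u, v)."""
    points_cam = np.asarray(points_cam, dtype=float).reshape(-1, 3)
    z = points_cam[:, 2]
    if np.any(z <= 0):
        raise ValueError("all points must be in front of the camera (z > 0)")
    # Pinhole model: u = fx * x / z + cx, v = fy * y / z + cy
    u = fx * points_cam[:, 0] / z + cx
    v = fy * points_cam[:, 1] / z + cy
    return np.stack([u, v], axis=1)


# ASSUMPTION: placeholder intrinsics, not the real parameters of the dataset camera.
FX, FY, CX, CY = 525.0, 525.0, 319.5, 239.5

# Hypothetical object centroid, in metres, in the camera frame.
centroid = np.array([[0.1, -0.05, 1.0]])
pixels = project_points(centroid, FX, FY, CX, CY)
print(pixels)
```

With a focal length of 0, every point collapses onto the principal point (cx, cy), so a zero returned by the intrinsics object makes the projection unusable regardless of whether the rest of the code is correct.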

andrefdre commented 1 year ago

After contacting the professor, we found that our approach was correct, but we don't have the camera parameters. The professor's suggestion was:

Hi @andrefdre ,

I have also contacted the Washington RGB-D dataset people. Let's give them some time to answer.

In the meantime I think you can improve each of the separate modules, i.e., object detection from point clouds, and object classification from RGB images.

If all goes wrong you will present these two modules separately ... but I still have hope we can find a way.

So our next approach is to use a Kinect camera, for which we can get the camera parameters, and either use a CNN for object detection to isolate the object images from the scenes or just take screenshots.

andrefdre commented 1 year ago

This is on hold while #28 is being worked on.

andrefdre commented 1 year ago

Now that the messages are being sent, we need to figure out why the bounding boxes are offset from the center point.
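
One way to narrow down where the offset comes from is to project all eight corners of the object's 3D bounding box, take the min/max in pixel space, and compare the center of that 2D box with the projected 3D center. The sketch below is only an illustration under assumptions: `box_corners`, `project` and `bbox_from_corners` are hypothetical helpers, the box is axis-aligned in the camera frame, and the intrinsics are placeholders rather than the real camera values.

```python
import itertools

import numpy as np


def box_corners(center, extent):
    """Return the 8 corners of an axis-aligned 3D box (center and full extent)."""
    center = np.asarray(center, dtype=float)
    half = np.asarray(extent, dtype=float) / 2.0
    signs = np.array(list(itertools.product([-1.0, 1.0], repeat=3)))
    return center + signs * half


def project(points_cam, fx, fy, cx, cy):
    """Pinhole projection of camera-frame points (z forward) to pixels."""
    pts = np.asarray(points_cam, dtype=float).reshape(-1, 3)
    u = fx * pts[:, 0] / pts[:, 2] + cx
    v = fy * pts[:, 1] / pts[:, 2] + cy
    return np.stack([u, v], axis=1)


def bbox_from_corners(corners_cam, fx, fy, cx, cy):
    """2D box (u_min, v_min, u_max, v_max) enclosing the projected corners."""
    px = project(corners_cam, fx, fy, cx, cy)
    u_min, v_min = px.min(axis=0)
    u_max, v_max = px.max(axis=0)
    return u_min, v_min, u_max, v_max


# ASSUMPTION: placeholder intrinsics and a made-up object, for illustration only.
FX, FY, CX, CY = 525.0, 525.0, 319.5, 239.5
center = np.array([0.0, 0.0, 1.0])
extent = np.array([0.2, 0.2, 0.2])

u_min, v_min, u_max, v_max = bbox_from_corners(
    box_corners(center, extent), FX, FY, CX, CY
)
box_center = np.array([(u_min + u_max) / 2.0, (v_min + v_max) / 2.0])
offset = box_center - project(center, FX, FY, CX, CY)[0]
print("offset in pixels:", offset)
```

If the offset is small in a synthetic case like this but large on real scenes, the mismatch more likely comes from the intrinsics or from the point cloud and image being in different frames than from the box construction itself.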

andrefdre commented 1 year ago

This was implemented, and the results weren't promising for the network. This confirms the expectation mentioned in #2 that depth images need to be added to the model. The image below shows the results from one scene.

Screenshot from 2023-01-17 20-51-04