IntelRealSense / librealsense

Intel® RealSense™ SDK
https://www.intelrealsense.com/
Apache License 2.0

Is it possible to detect an empty shelf with depth detection or some feature of D435i? #5607

Closed · suraneti closed this issue 4 years ago

suraneti commented 4 years ago

Issue Description

I've searched the documentation and code examples, but found nothing related to my ongoing empty-shelf detection project.

Required Info
Camera Model: D435i
Firmware Version: 5.11.6.250
Operating System & Version: Linux (Ubuntu 16)
Kernel Version (Linux Only): 4.4
Platform: PC
SDK Version: 2.22.0
Language: Python
Segment: Image processing
dorodnic commented 4 years ago

The D435i is an RGB-D camera providing real-time point-cloud data. It can be used as a component in robotics, warehouse-automation and measurement products. There is no "detect shelf" function in the SDK; these types of queries are too generic, with every OEM having its own set of constraints. The SDK only offers access to camera capabilities (point clouds and other types of 3D data).
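For context, getting at that depth and point-cloud data from the Python wrapper (pyrealsense2) can look roughly like the sketch below; the stream resolution and frame rate are assumptions.

```python
# Minimal sketch: grab one depth frame and compute its point cloud with pyrealsense2.
# Stream settings (640x480 @ 30 fps) are illustrative assumptions.
import numpy as np
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
pipeline.start(config)

try:
    frames = pipeline.wait_for_frames()
    depth_frame = frames.get_depth_frame()

    # Raw depth as a NumPy array (uint16, in depth units)
    depth_image = np.asanyarray(depth_frame.get_data())

    # Point cloud: one (x, y, z) vertex per pixel, in metres
    pc = rs.pointcloud()
    points = pc.calculate(depth_frame)
    vertices = np.asanyarray(points.get_vertices()).view(np.float32).reshape(-1, 3)
    print(depth_image.shape, vertices.shape)
finally:
    pipeline.stop()
```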

MartyG-RealSense commented 4 years ago

@suraneti A way to detect an empty shelf may be to put image-based tags (also known as fiducial markers) at the back of the shelf, where they will normally be covered by products. If the camera can see and register a tag when scanning the shelf, this would indicate that the shelf is empty.

The T265 Tracking Camera can detect a type of tag called Apriltags. If you would like to continue using the D435i in your project, an option may be an 'AR marker' called Aruco.

In the link below is an example of an Aruco program designed to work with the D435.

https://github.com/Jphartogi/ipa_marker_detection
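As a rough illustration of the marker idea, a sketch along these lines could check whether ArUco tags are visible in the D435i colour stream; the dictionary choice, stream settings, and the pre-4.7 OpenCV contrib aruco API are assumptions.

```python
# Sketch: ArUco tag detection on the D435i colour stream.
# Assumes opencv-contrib-python with the pre-4.7 cv2.aruco interface.
import numpy as np
import cv2
import cv2.aruco as aruco
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipeline.start(config)

aruco_dict = aruco.Dictionary_get(aruco.DICT_4X4_50)   # hypothetical dictionary choice
params = aruco.DetectorParameters_create()

try:
    while True:
        frames = pipeline.wait_for_frames()
        color_frame = frames.get_color_frame()
        if not color_frame:
            continue
        image = np.asanyarray(color_frame.get_data())
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

        corners, ids, _ = aruco.detectMarkers(gray, aruco_dict, parameters=params)
        if ids is not None:
            # A visible tag means the product that normally hides it is gone
            print("Visible tags (shelf slots likely empty):", ids.flatten())
finally:
    pipeline.stop()
```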

suraneti commented 4 years ago

@dorodnic, @MartyG-RealSense thanks for the comment. Can you please guide me?

I've bought a D435i, so I need to find a way to use it to detect an empty shelf.

What if I install the D435i in front of a small shelf and detect a depth change at each level of the shelf? For example, a full shelf would show as blue in the colorized depth and an empty one as orange/red; if orange covers more than 70% of the area, raise an alert.

Can this idea work through the Python wrapper?

MartyG-RealSense commented 4 years ago

The first step of the idea that you have in mind may be to get a colorized depth image, like in the Python code in the tutorial linked to below.

https://github.com/IntelRealSense/librealsense/blob/jupyter/notebooks/distance_to_object.ipynb
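A minimal sketch of that first step with the SDK's built-in colorizer (stream settings and the output file name are assumptions) might be:

```python
# Sketch: save one colorized depth frame using pyrealsense2's colorizer.
import numpy as np
import cv2
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
pipeline.start(config)

colorizer = rs.colorizer()

try:
    frames = pipeline.wait_for_frames()
    depth_frame = frames.get_depth_frame()

    # The colorizer outputs an 8-bit RGB image in which colour encodes distance;
    # convert to BGR before saving with OpenCV.
    colorized = np.asanyarray(colorizer.colorize(depth_frame).get_data())
    cv2.imwrite("colorized_depth.png", cv2.cvtColor(colorized, cv2.COLOR_RGB2BGR))
finally:
    pipeline.stop()
```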


The part about detecting the color percentage (or the 'dominant color', which is what you really need) is more complex and may involve incorporating vision software such as OpenCV into the project. I got some useful leads by googling for 'dominant color python', such as these examples:

https://stackoverflow.com/questions/52398237/how-to-find-the-dominant-color-in-images-using-python

https://adamspannbauer.github.io/2018/03/02/app-icon-dominant-colors/
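As a rough illustration of the percentage check on top of that, using OpenCV: the ROI, HSV range and 70% threshold below are purely illustrative and would need tuning against your own colorized output.

```python
# Sketch: fraction of a shelf ROI that falls in a given colour range of the
# colorized depth image. ROI, HSV range and threshold are illustrative assumptions.
import cv2

colorized = cv2.imread("colorized_depth.png")      # BGR image saved in the previous sketch
x, y, w, h = 100, 200, 300, 120                    # hypothetical shelf ROI
roi = colorized[y:y + h, x:x + w]

hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv, (0, 80, 80), (25, 255, 255))   # rough orange/red hue band

fraction = cv2.countNonZero(mask) / float(mask.size)
if fraction > 0.70:
    print("Shelf looks empty ({:.0%} of ROI in target colour range)".format(fraction))
```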

kafan1986 commented 4 years ago

> What if I install the D435i in front of a small shelf and detect a depth change at each level of the shelf? ... Can this idea work through the Python wrapper?

Can you tell me how you are planning to get the image? I mean, is the camera mounted at a constant, static location (e.g. near the ceiling), or is it supposed to be attached to a moving vehicle/robot? A suggested solution would depend on this information.

MartyG-RealSense commented 4 years ago

@kafan1986 Thanks for your response! As suraneti said "If I install a D435i camera in front of a little shelf", I inferred from this that the camera will be static and not on a mobile shelf scanning robot.

kafan1986 commented 4 years ago

> What if I install the D435i in front of a small shelf and detect a depth change at each level of the shelf? ... Can this idea work through the Python wrapper?

Colourizing depth is just for visualization; you can work with the depth values directly. Your approach is OK, but having done some work with the D415 and other depth cameras, here are some of the issues you will likely face:

A) Background subtraction needs you to store a depth image of the empty shelf for future use. Your final algorithm will also need to store an ROI for each rack, since in the real world the camera FOV will capture other things as well, such as people moving around. Once you store this depth data, make sure the rack and the camera don't move and stay fixed in position; otherwise you will need to capture the empty-rack depth again.

B) The accuracy of the depth distance measured by a depth camera is inversely proportional to the distance from the camera. Given the cost of the camera and other hardware, you will probably want a setup in which a single camera captures at least one full rack with a decent ROI. A single rack in a store has multiple shelves; if the camera is fixed somewhere near the ceiling, the depth accuracy will be good near the top shelves but not as good for the bottom ones.

C) The depth data itself fluctuates by a small amount across frames, even when there is no real physical change. This can be handled in the background-subtraction algorithm by tuning a "variable threshold" value, but that brings the new problems discussed in point D.

D) The depth change obtained through background subtraction will be fine if the product dimensions are large (say, big cereal boxes): even with a high "variable threshold" (to handle false positives due to noise), the difference in depth values will be large enough to distinguish an empty shelf from a non-empty one. Products with smaller dimensions (a toothpaste box, a single soap, etc.) will be a problem, since their depth change compared to the empty shelf may not be large enough, especially if they are kept on a lower shelf, i.e. at a longer Z distance from the camera.

So, all in all, you will end up tweaking threshold values, and it will always be a trade-off between reducing false positives and false negatives.
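A minimal sketch of the background-subtraction comparison described in point A, working directly on depth values: the stored reference file, the ROI and both thresholds are illustrative assumptions (the default 1 mm depth unit of the D400 series is also assumed).

```python
# Sketch: per-ROI depth comparison against a stored empty-shelf reference.
# File name, ROI and thresholds are illustrative assumptions.
import numpy as np

# Depth image (uint16, default 1 mm units) of the EMPTY shelf, captured once beforehand
empty_depth = np.load("empty_shelf_depth.npy")

def shelf_is_empty(current_depth, roi, depth_tol_mm=30, empty_fraction=0.9):
    """Return True if the ROI's depth matches the empty-shelf reference closely enough.

    roi: (x, y, w, h); depth_tol_mm: per-pixel "variable threshold";
    empty_fraction: share of valid pixels that must match the empty reference.
    """
    x, y, w, h = roi
    ref = empty_depth[y:y + h, x:x + w].astype(np.int32)
    cur = current_depth[y:y + h, x:x + w].astype(np.int32)

    valid = (ref > 0) & (cur > 0)          # ignore pixels with no depth data
    if not valid.any():
        return False

    # Pixels whose depth stays close to the empty reference, i.e. nothing sits in front of the back wall
    close_to_reference = np.abs(cur - ref)[valid] < depth_tol_mm
    return close_to_reference.mean() > empty_fraction
```

Averaging the result over several consecutive frames helps with the frame-to-frame fluctuation described in point C.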

suraneti commented 4 years ago

@MartyG-RealSense @kafan1986 First of all, thanks for your support. I will try the "dominant color" and background-subtraction approaches.

I will post updates on my progress.

P.S. I've found an empty-shelf detection project that uses only a camera with object detection; would that be a better way to go?

MartyG-RealSense commented 4 years ago

I think you intended to link to this project by the same developer?

https://github.com/thom1178/supermarket-shelf-detection

Using Deep Neural Networks (DNN) for this purpose may add more complexity than the task needs, compared to doing it with methods such as dominant color. That said, if you are prepared to put in the effort of using a DNN and it provides the result that you need, there is no reason not to do it.

Using a DNN for empty-shelf detection may be good practice if you plan to expand the scope of the project at some future point to use machine learning for tasks such as recognizing and checking the stock items remaining on the shelves.

An example of such shelf checking is the mobile retail robot Tally, which can navigate a store and scan the shelves. Tally uses a combination of a RealSense depth sensor and a lidar sensor.

https://www.simberobotics.com/

MartyG-RealSense commented 4 years ago

A case posted very recently demonstrated 3D maze navigation (similar in concept to a retail store layout) using a RealSense D435 instead of lidar, though.

https://github.com/IntelRealSense/librealsense/issues/5625

kafan1986 commented 4 years ago

@suraneti As @MartyG-RealSense already mentioned, you can implement any solution you like, but it depends on who you are developing it for and what accuracy is deemed good enough. If it is a college project, a deep-learning-based approach might be fine as a proof of concept, but if it is a commercial product then, based on my experience, companies expect good accuracy. In that case I cannot recommend going with an "only deep learning" solution.

My experience has taught me that in variable conditions, where the products, the lighting and the product positions all vary, a deep-learning-only project won't achieve more than 80-85% accuracy, and that is the best-case scenario. The Simbe Robotics link shared by @MartyG-RealSense seems to use multiple data sources, such as depth, computer vision and RFID. It is really difficult, and sometimes impossible, to achieve high accuracy without using sensor-based data.

For your use case, a lidar-based moving robot that can roam the store, using a single lidar sensor moved vertically to different heights to match the different shelf levels and then comparing depth to find empty product positions, would probably be a good approach; adding a further layer of computer-vision-based deep learning might increase accuracy. You also have IMU data that can help with creating a pre-layout of the store, which will help with in-store navigation. I am just throwing in some ideas; all the best to you.