stefan-ainetter / grasp_det_seg_cnn

Code for ICRA21 paper "End-to-end Trainable Deep Neural Network for Robotic Grasp Detection and Semantic Segmentation from RGB".
BSD 3-Clause "New" or "Revised" License

model result #19

Closed uuu686 closed 1 year ago

uuu686 commented 2 years ago

In the results, there is only one prediction for each category. Is there any way to solve this problem?

stefan-ainetter commented 2 years ago

Hi @uuu686, I am not sure if I understand your question correctly. The grasp detection branch predicts multiple grasp candidates for the objects in the scene. For visualization and evaluation, we take only the grasp candidate with the highest confidence score for each category. If there are multiple objects of the same category in the scene (e.g. two bananas), the final output will be a grasp candidate for only one of them, as both belong to the same category. If you think about grasping in practice, after successfully removing that banana, the network will again predict grasp candidates for the one that is left.
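
To illustrate the per-category selection, here is a minimal sketch (not code from this repository; the `candidates` list and its tuple layout are just assumptions for illustration):

```python
def best_grasp_per_category(candidates):
    """Keep only the highest-confidence grasp candidate per category.

    `candidates` is assumed to be a list of tuples
    (category_id, confidence, rect), e.g. as produced by a grasp branch.
    Returns a dict mapping category_id -> (confidence, rect).
    """
    best = {}
    for category_id, confidence, rect in candidates:
        if category_id not in best or confidence > best[category_id][0]:
            best[category_id] = (confidence, rect)
    return best
```

With this kind of selection, two bananas in the same image yield only one output, since both candidates share the same category id.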

However, if you want to predict grasp candidates and segmentation masks for all objects in the scene regardless of the object category, you should look at our BMVC paper, where we perform class-agnostic (category-agnostic) object instance segmentation and grasp detection for all graspable objects in the scene.

Let me know if this answers your question.

uuu686 commented 2 years ago

Thanks for your reply, that answers my question.
I have another question: if I want to train the model on my own dataset, I should prepare three folders: "Annotation", "rgb" and "seg_mask_labeled_combi". The "rgb" folder contains the original images; the "seg_mask_labeled_combi" folder contains the segmentation labels; the "Annotation" folder contains the grasp candidates, and each grasp candidate consists of four coordinates. Is there any requirement for the order of these four points? I also want to ask which software you used for this annotation. Thank you, and I look forward to your reply.

stefan-ainetter commented 1 year ago

Yes, you are right, you need the rgb images, as well as the annotations for grasp candidates and segmentation. If you use the same folder structure as OCID_grasp, you can re-use most parts of the provided OCID_grasp dataloader.
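
For reference, the layout discussed above would look roughly like this (a sketch based on the folder names mentioned in this thread, not an exact copy of the OCID_grasp tree):

```
my_dataset/
├── rgb/                      # original RGB images
├── seg_mask_labeled_combi/   # per-pixel segmentation label images
└── Annotation/               # one grasp annotation file per image
```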

The four coordinates represent the corners of the grasp candidate. The order is important: the first point is the top-left corner, and the remaining points continue in clockwise order (same convention as the Cornell grasp dataset).
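
To make the ordering concrete, here is a rough sketch of how the four corners could be generated, assuming a grasp parameterized by center, rotation angle, and rectangle width/height (this helper is hypothetical and not part of the repository):

```python
import math

def grasp_to_corners(cx, cy, theta, width, height):
    """Return the four rectangle corners in image coordinates,
    starting at the top-left and continuing clockwise.

    Assumptions: (cx, cy) is the rectangle center, theta is the rotation
    angle in radians, width/height are the rectangle dimensions.
    """
    dx, dy = width / 2.0, height / 2.0
    # corners in the local (unrotated) frame:
    # top-left, top-right, bottom-right, bottom-left
    local = [(-dx, -dy), (dx, -dy), (dx, dy), (-dx, dy)]
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return [(cx + x * cos_t - y * sin_t,
             cy + x * sin_t + y * cos_t) for x, y in local]
```

Each corner can then be written as one "x y" line in the annotation file, four lines per grasp candidate, if you follow the Cornell-style format.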

We implemented our own annotation tool, which is not part of this repository. However, there is software available which might be interesting for you, e.g. here.

uuu686 commented 1 year ago

Thank you very much. I found a tool, https://github.com/cgvict/roLabelImg, which can be used for this task. I also want to ask: when I annotate a grasp candidate, how should I choose the rectangle, e.g. its width, height, and borders? For example, a pen has many grasp candidates, and these rectangles will have overlapping parts. How should I control the width, the height, and the size of the overlapping area?

stefan-ainetter commented 1 year ago

It is up to you how you annotate your data. Overlapping grasp candidates are not a problem, as each grasp candidate is independent. The height of the rectangle depends on the graspable object; for the width you should keep in mind that it represents the parallel plates of your gripper.

Hope this information helps. I am closing this issue now, feel free to reopen if needed, or create a new issue for other questions.