Open z7r7y7 opened 8 months ago
Yes, currently PoET does not support the training of the backbone. We intended PoET to be an extension to any pre-trained backbone.
If you want to use Mask R-CNN as the backbone for training on your custom dataset, you should first pre-train it separately on your dataset for object detection. Once you have the network trained, you can include the pre-trained weights in the PoET training with the argument --backbone_weights.
Hope this helps you!
Best, Thomas
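To make the second step of that workflow concrete, here is a minimal sketch of how a training script might accept and validate the pre-trained checkpoint. Only the --backbone_weights flag comes from this thread; the parser itself and the helper name parse_backbone_weights are hypothetical stand-ins, not PoET's actual code.

```python
import argparse
from pathlib import Path


def parse_backbone_weights(argv):
    """Parse --backbone_weights and make sure the checkpoint file exists."""
    # Hypothetical parser: the real training script defines its own options.
    parser = argparse.ArgumentParser(description="illustrative stand-in parser")
    parser.add_argument(
        "--backbone_weights",
        type=Path,
        default=None,
        help="path to separately pre-trained detector weights",
    )
    args = parser.parse_args(argv)

    # The backbone is not trained during PoET training, so weights are required.
    if args.backbone_weights is None:
        raise ValueError("backbone is not trainable here; pass --backbone_weights")
    # Fail early instead of after the dataset has been loaded.
    if not args.backbone_weights.is_file():
        raise FileNotFoundError(f"no checkpoint at {args.backbone_weights}")
    return args.backbone_weights
```

Checking the path up front is only a convenience; it surfaces a wrong file name before any expensive setup runs.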
Thank you for your reply! I noticed that there are many versions of Mask R-CNN on GitHub. May I ask which version's weights can be directly loaded with the argument --backbone_weights?
You can check it out in the backbone_maskrcnn.py file. As of now, you can use the model as it is provided by PyTorch with a ResNet-50 backbone.
However, you can extend the code to use any object detector backbone you want, as long as it returns the necessary feature maps and detected objects.
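As a rough illustration of that contract (feature maps plus detected objects), the sketch below defines a container a custom backbone could return and a small sanity check on it. The names Detection, BackboneOutput, and check_backbone_output, as well as the box layout, are assumptions made for illustration; the format PoET actually expects is the one defined in backbone_maskrcnn.py.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Detection:
    """One detected object (hypothetical layout)."""
    box: Sequence[float]  # assumed (x_min, y_min, x_max, y_max) in pixels
    label: int            # class index
    score: float          # detector confidence


@dataclass
class BackboneOutput:
    """What a custom backbone hands on: feature maps and detections."""
    feature_maps: List[object]   # one entry per feature scale
    detections: List[Detection]  # objects found in the image


def check_backbone_output(out, min_score=0.0):
    """Validate the output and keep detections scoring at least min_score."""
    if not out.feature_maps:
        raise ValueError("backbone must return at least one feature map")
    for det in out.detections:
        if len(det.box) != 4:
            raise ValueError("each box needs four coordinates")
    return [det for det in out.detections if det.score >= min_score]
```

Because the two fields are independent, the detections can be filled by one network and the feature maps by another.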
Thank you for your reply! I want to incorporate depth information into the input. Can I use the detection results of an object detection model and fuse them with the output of another backbone network that contains depth information?
Does the backbone network also contain RGB information? In general, you can do that. The object detections do not have to come from the same network that provides the feature maps.
However, I think 6D relative object pose estimation based purely on depth images might be difficult.
On the other hand, combining RGB information with depth information should improve the performance.
You're right. Due to the presence of objects with similar shapes but varying sizes in my custom dataset, and the uncertainty of scale in monocular RGB images, relying solely on RGB images may yield suboptimal results. Therefore, I intend to fuse depth information with RGB information as input to the network.
I don't see anything in the transformer part that would prevent it from processing a combination of RGB and depth feature maps. Therefore, if you have a backbone network that produces such feature maps, it should work out!
Let me know how it goes!
Best, Thomas
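One simple way to combine RGB and depth feature maps is to concatenate them along the channel axis. The sketch below shows the idea on plain nested lists shaped [channels][height][width]; the function name concat_channels is hypothetical, and channel concatenation is just one possible fusion strategy, not something PoET prescribes.

```python
def concat_channels(rgb_features, depth_features):
    """Concatenate two feature maps along the channel axis.

    Each map is a nested list shaped [channels][height][width].
    """
    def spatial_shape(fmap):
        # Height and width are read from the first channel.
        return len(fmap[0]), len(fmap[0][0])

    # Channel counts may differ, but the spatial grid must match.
    if spatial_shape(rgb_features) != spatial_shape(depth_features):
        raise ValueError("feature maps must share height and width")

    # Stacking the channel lists yields [c_rgb + c_depth][height][width].
    return rgb_features + depth_features
```

With PyTorch tensors shaped (batch, channels, height, width), the equivalent call would be torch.cat([rgb, depth], dim=1). The fused map has more channels than either input, so whatever consumes it must be set up for that channel count.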
Thank you so much for your response. I will definitely try incorporating the RGB and depth feature maps into the Transformer model and see how it performs. I'm excited about the possibilities! If I make any progress or have any further questions, I would be delighted to continue the conversation with you. Your support and interest mean a lot to me. Best, Ruiyun
Definitely, I would be happy to continue the discussion and help you out whenever needed!
Best, Thomas
Thank you for providing this excellent project! I would like to train on my custom dataset. In the backbone.py file, I noticed that setting self[0].train_backbone to True results in a NotImplementedError. Does this mean that the backbone is currently not trainable? If I want to use Mask R-CNN as the backbone for training on my custom dataset, what steps should I follow?