Closed yijingru closed 4 years ago
Everything stated on https://detectron2.readthedocs.io/tutorials/datasets.html is quite clear for how to register a dataset. The colab tutorial for training on a custom dataset is straightforward too. No problem in training on custom datasets for other detectron2 models, but pointrend does not want to train! I have checked the colab tutorial for pointrend too, yet there must be something I did not get since no matter how hard I try my colab attempt of training pointrend fails all the time. I believe it has to do with pointrend cfg settings. Would you be so kind and provide us an example of those settings? Here's mine:
cfg = get_cfg()
point_rend.add_pointrend_config(cfg)
cfg.merge_from_file("detectron2_repo/projects/PointRend/configs/InstanceSegmentation/pointrend_rcnn_R_50_FPN_3x_coco.yaml")
cfg.DATASETS.TRAIN = ("dataset_train",)
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = "detectron2://PointRend/InstanceSegmentation/pointrend_rcnn_R_50_FPN_3x_coco/164955410/model_final_3c3198.pkl"
cfg.SOLVER.IMS_PER_BATCH = 4
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 300
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 64
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2 #nr classes + 1
cfg.MODEL.POINT_HEAD.NUM_CLASSES = 2
Update: while reading my own comment I saw this:
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2 #nr classes + 1
cfg.MODEL.POINT_HEAD.NUM_CLASSES = 2
Which is obviously the mistake. I was so focused on finding the problem elsewhere that my eyes never noticed it.
I finally trained pointrend on a custom dataset :')
Hi, can you please point out the mistake? While training the model with the PointRend cfg and weights, I am getting an error: RuntimeError: grid_sampler(): expected input and grid to have same dtype, but input has c10::Half and grid has float
The same cfg and dataset trained successfully using the COCO-Segmentation MaskRCNN weights.
If you're using AMP, PointRend was not tested against float16 and likely will need some extra work to support float16.
Yup, that seems to be the issue. Thanks for the clarification.
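For anyone hitting the same dtype error, a minimal config fragment that keeps training in full precision might look like the following. It assumes a detectron2 version that has the SOLVER.AMP.ENABLED key (the default value is False, so this only matters if AMP was switched on elsewhere); the rest of the cfg is assumed to be set up as in the snippets in this thread.

```python
from detectron2.config import get_cfg
from detectron2.projects import point_rend

cfg = get_cfg()
point_rend.add_pointrend_config(cfg)
# ... merge the PointRend YAML and set datasets/weights as in the posts above ...

# DefaultTrainer only uses mixed precision (float16) when this flag is True.
# Keeping it False avoids feeding half-precision tensors to the point sampling code.
cfg.SOLVER.AMP.ENABLED = False
```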
Hi, I wanted to ask whether PointRend is currently supported only with the default trainer. I wrote a custom trainer for custom transformation of input data, which trains perfectly fine with other detectron2 models. However, when I try to train PointRend with the custom trainer (which extends the default trainer), it throws an error about the input polygon masks (even though the mask format was set to bitmask in the cfg). The PointRend model trains perfectly fine with the same code when using the default trainer. Does PointRend support custom trainers? If yes, what am I doing wrong?
Code for Custom Trainer
class customTrainer(DefaultTrainer):
    @classmethod
    def build_test_loader(cls, cfg, dataset_name):
        return build_detection_test_loader(cfg, dataset_name, mapper=DatasetMapper(cfg, False))

    @classmethod
    def build_train_loader(cls, cfg):
        return build_detection_train_loader(cfg, mapper=custom_datasetMapper)
Code for Cfg File
cfg = get_cfg()
point_rend.add_pointrend_config(cfg)
cfg.INPUT.MAX_SIZE_TRAIN = 1440
cfg.INPUT.MIN_SIZE_TRAIN = (800,)
cfg.INPUT.MASK_FORMAT= 'bitmask'
cfg.merge_from_file("detectron2_repo/projects/PointRend/configs/InstanceSegmentation/pointrend_rcnn_R_50_FPN_3x_coco.yaml")
cfg.DATASETS.TRAIN = ("cpd_final",)
cfg.DATASETS.TEST = ()
cfg.DATALOADER.NUM_WORKERS = 4
cfg.MODEL.WEIGHTS = "detectron2://PointRend/InstanceSegmentation/pointrend_rcnn_R_50_FPN_3x_coco/164955410/model_final_edd263.pkl"
cfg.MODEL.ROI_HEADS.NUM_CLASSES = len(cpd_metadata.thing_classes)
cfg.MODEL.POINT_HEAD.NUM_CLASSES = len(cpd_metadata.thing_classes)
cfg.OUTPUT_DIR = './cpd'
cfg.SOLVER.MAX_ITER = 4000
cfg.SOLVER.STEPS = (2000,)
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.0025
cfg.SOLVER.WARMUP_ITERS = 500
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
Code for Custom Training which throws error
trainer = customTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
Code for Training which works fine
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
Error which is thrown when using custom trainer
ERROR [07/23 09:18:31 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
  File "/opt/tljh/user/envs/pytorch/lib/python3.8/site-packages/detectron2/engine/train_loop.py", line 149, in train
    self.run_step()
  File "/opt/tljh/user/envs/pytorch/lib/python3.8/site-packages/detectron2/engine/defaults.py", line 497, in run_step
    self._trainer.run_step()
  File "/opt/tljh/user/envs/pytorch/lib/python3.8/site-packages/detectron2/engine/train_loop.py", line 273, in run_step
    loss_dict = self.model(data)
  File "/opt/tljh/user/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/tljh/user/envs/pytorch/lib/python3.8/site-packages/detectron2/modeling/meta_arch/rcnn.py", line 163, in forward
    _, detector_losses = self.roi_heads(images, features, proposals, gt_instances)
  File "/opt/tljh/user/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/tljh/user/envs/pytorch/lib/python3.8/site-packages/detectron2/modeling/roi_heads/roi_heads.py", line 735, in forward
    losses.update(self._forward_mask(features, proposals))
  File "/opt/tljh/user/envs/pytorch/lib/python3.8/site-packages/detectron2/modeling/roi_heads/roi_heads.py", line 838, in _forward_mask
    return self.mask_head(features, instances)
  File "/opt/tljh/user/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/tljh/user/envs/pytorch/lib/python3.8/site-packages/detectron2/projects/point_rend/mask_head.py", line 233, in forward
    point_coords, point_labels = self._sample_train_points(coarse_mask, instances)
  File "/opt/tljh/user/envs/pytorch/lib/python3.8/site-packages/detectron2/projects/point_rend/mask_head.py", line 285, in _sample_train_points
    point_labels = sample_point_labels(instances, point_coords_wrt_image)
  File "/opt/tljh/user/envs/pytorch/lib/python3.8/site-packages/detectron2/projects/point_rend/point_features.py", line 251, in sample_point_labels
    assert isinstance(
AssertionError: Point head works with GT in 'bitmask' format. Set INPUT.MASK_FORMAT to 'bitmask'.
Have you solved it? I'm having the same issue with a custom trainer.
@Jain-Archit @yasindagasan The issue that leads to this error is not with custom trainers, but with custom mappers. DefaultTrainer(cfg) calls the default mapper DatasetMapper(cfg, is_train=True), and DatasetMapper in turn (among other things) calls utils.annotations_to_instances(annos, image_shape, mask_format=cfg.INPUT.MASK_FORMAT) to convert segmentation masks into either BitMasks or PolygonMasks (see detectron2/structures) instances, depending on INPUT.MASK_FORMAT. PointRend requires BitMasks, but if DatasetMapper is not called in a custom trainer, masks are passed as polygons, the AssertionError is thrown, and changing INPUT.MASK_FORMAT does not do anything.
In my case, my custom mapper did not do anything important besides adding augmentations, so I was able to replace it with:
DatasetMapper(cfg, is_train=True, augmentations = custom_transform_list)
and was able to train a PointRend model using a custom trainer. If your custom mapper does do something important, you may be able to rewrite it so that it reformats your training data using utils.annotations_to_instances. Hope this is helpful!
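For the case where a custom mapper really is needed, a rough sketch of one that still produces BitMasks is shown here. It is adapted from the pattern in the detectron2 data loading tutorial rather than taken from this thread: the name bitmask_mapper and the chosen augmentations are hypothetical, it assumes a detectron2 version that provides T.AugInput and T.AugmentationList, and it assumes dataset dicts in the standard format with "file_name" and "annotations" keys. The detectron2 imports sit inside the function only so that the sketch can be loaded on a machine without detectron2.

```python
import copy


def bitmask_mapper(dataset_dict, mask_format="bitmask"):
    """Hypothetical custom mapper that hands PointRend BitMasks ground truth."""
    # Imported lazily so this module can be imported without detectron2 installed.
    import numpy as np
    import torch
    from detectron2.data import detection_utils as utils
    from detectron2.data import transforms as T

    dataset_dict = copy.deepcopy(dataset_dict)  # never mutate the registered record
    image = utils.read_image(dataset_dict["file_name"], format="BGR")

    # Apply the augmentations to the image and keep the resulting transform
    # so the same geometry can be applied to the annotations.
    aug_input = T.AugInput(image)
    transforms = T.AugmentationList([
        T.Resize((800, 800)),
        T.RandomFlip(prob=0.5, horizontal=True, vertical=False),
    ])(aug_input)
    image = aug_input.image
    image_shape = image.shape[:2]  # (height, width)

    annos = [
        utils.transform_instance_annotations(obj, transforms, image_shape)
        for obj in dataset_dict.pop("annotations")
        if obj.get("iscrowd", 0) == 0
    ]
    # The key step: without mask_format="bitmask" this call defaults to polygons,
    # which is what triggers the PointRend AssertionError.
    instances = utils.annotations_to_instances(
        annos, image_shape, mask_format=mask_format
    )

    dataset_dict["image"] = torch.as_tensor(
        np.ascontiguousarray(image.transpose(2, 0, 1))
    )
    dataset_dict["instances"] = utils.filter_empty_instances(instances)
    return dataset_dict
```

It would be passed to the loader as build_detection_train_loader(cfg, mapper=bitmask_mapper) inside the trainer's build_train_loader.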
Same issue for me: the DefaultTrainer with PointRend works well, but I ran into problems when adding augmentation. Can you give the full solution? You suggested DatasetMapper(cfg, is_train=True, augmentations = custom_transform_list). What I have is:
def custom_mapper_pointrend(dataset_dict):
    transform_list = [T.Resize((800,800)),
                      T.RandomFlip(prob=0.5, horizontal=False, vertical=True),
                      T.RandomFlip(prob=0.5, horizontal=True, vertical=False),
                      ]
    mapper = DatasetMapper(cfg, is_train=True, augmentations=transform_list)
    return mapper
class CustomTrainerPointrend(DefaultTrainer):
    @classmethod
    def build_train_loader(cls, cfg):
        return build_detection_train_loader(cfg, mapper=custom_mapper_pointrend)
trainer = CustomTrainerPointrend(cfg)
The rest of the code is the same. Now I am getting this error:
w, h = d["width"], d["height"]
TypeError: 'DatasetMapper' object is not subscriptable
@anirbankonar123 From the code that you provided, it seems as though your custom mapper only adds augmentations (this was also the case for my project!). If this is the case, you don't need to define a custom mapper at all - you can just define a custom transform list globally and pass it in args for build_detection_train_loader within your custom trainer. See below for my example:
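from detectron2.data import DatasetMapper, build_detection_train_loader
from detectron2.data import transforms as T
from detectron2.engine import DefaultTrainer

custom_transform_list = [T.Resize((800,800)),
                         T.RandomFlip(prob=0.25, horizontal=False, vertical=True),
                         T.RandomFlip(prob=0.25, horizontal=True, vertical=False)]
class MyPointRendTrainer(DefaultTrainer):
    @classmethod
    def build_train_loader(cls, cfg):
        # The default DatasetMapper honors cfg.INPUT.MASK_FORMAT, so masks stay bitmasks.
        return build_detection_train_loader(
            cfg,
            mapper=DatasetMapper(cfg, is_train=True, recompute_boxes=True,
                                 augmentations=custom_transform_list),
        )
    # Any other custom trainer methods here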
Also, a possible source for your error: I believe that a custom mapper function should return a formatted dataset dict rather than another DatasetMapper instance. See the Dataloader tutorial: https://detectron2.readthedocs.io/en/latest/tutorials/data_loading.html
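To illustrate that last point without needing detectron2, here is a small pure-Python sketch of the mapper contract. StandInMapper and consume are hypothetical stand-ins (not detectron2 classes): the first plays the role of a DatasetMapper instance, the second plays the role of downstream code that indexes the mapped record.

```python
class StandInMapper:
    """Hypothetical stand-in for a DatasetMapper instance: a callable object."""

    def __call__(self, dataset_dict):
        # A real mapper would load the image and build Instances here.
        return {"width": dataset_dict["width"], "height": dataset_dict["height"]}


def broken_mapper(dataset_dict):
    # Mirrors the failing pattern: builds a mapper and returns the mapper itself.
    return StandInMapper()


def fixed_mapper(dataset_dict):
    # Calls the mapper on the record, so the caller receives a dict.
    return StandInMapper()(dataset_dict)


def consume(mapped):
    # Stand-in for code that expects a dict-like mapped record.
    return mapped["width"], mapped["height"]


record = {"width": 640, "height": 480}

try:
    consume(broken_mapper(record))
    broken_error = None
except TypeError as exc:
    broken_error = str(exc)

fixed_size = consume(fixed_mapper(record))
```

The broken variant fails with the same kind of "object is not subscriptable" TypeError as above, because an object is returned where a dict is expected. Passing the DatasetMapper instance itself as mapper=..., as in the trainer above, avoids the wrapper function entirely.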
Thanks for the answer. It got resolved soon after with code similar to what you have shown, and it is working fine now.
One more question: what FPS does Detectron2's PointRend reach on real-time video segmentation? Do we have a figure?
Thanks
Not sure about Detectron2 PointRend video FPS; I am not aware of any public real-time D2 video segmentation implementations, although somebody has probably done it before. PixelLib seems to be optimized for real-time video segmentation with PointRend and might be worth checking out if Detectron2 processing speed is lacking...
Thanks, that's true. Is there sample code for training a custom PixelLib PointRend model? The sample shown on their site with the Nature dataset does not seem to work properly.
@anirbankonar123 You are right that the PixelLib custom training demo is broken (it seems like a combination of dependency and Nature dataset integrity issues). The PixelLib PointRend video segmentation tutorial does still seem to work for me, and issues with the custom Mask-RCNN training demo might not be relevant because training custom PointRend models in PixelLib does not seem to be possible at all at present. The PointRend models implemented for PixelLib video segmentation are pre-trained Detectron2 PointRend models, so using custom-trained Detectron2 models in PixelLib seems viable. My very speculative idea for how you might be able to do this:
__init__ (currently only supports loading COCO classes and default Detectron2 pretrained models) to fit your new model(s) and custom dataset classes.
Unfortunately, this is a more complicated solution (possibly more so than just using Detectron2) than a quick look at the demos and documentation would suggest. Best of luck with your project!
How to train the PointRend on a local dataset?