Open RAOMMA opened 2 years ago
As a guess, you haven't registered your custom dataset and set the stuff_classes. Something like:
for d in ['train', 'val']:
    DatasetCatalog.register(f'floor_Images_data6_separated_{d}', lambda d=d: load_data(json_dir, d))
    MetadataCatalog.get(f'floor_Images_data6_separated_{d}').set(stuff_classes=['floor'])
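Once stuff_classes is set, it may also be worth checking that the semantic masks agree with it. Below is a minimal sketch, assuming each semantic ground-truth mask has already been loaded as a 2D grid of integer labels and that 255 is the ignore label (the Detectron2 default for MODEL.SEM_SEG_HEAD.IGNORE_VALUE). The function name and the toy mask are made up for illustration.

```python
def find_bad_sem_seg_labels(mask, num_classes, ignore_value=255):
    """Return the sorted label values in `mask` that are neither in
    [0, num_classes) nor equal to `ignore_value`."""
    bad = set()
    for row in mask:
        for value in row:
            if value != ignore_value and not (0 <= value < num_classes):
                bad.add(value)
    return sorted(bad)


# Toy example: one stuff class, so only label 0 (and the ignore label) is valid.
stuff_classes = ["floor"]
mask = [
    [0, 0, 255],
    [0, 1, 255],
]
print(find_bad_sem_seg_labels(mask, num_classes=len(stuff_classes)))
```

Any value reported here would index past the end of the semantic head's output, which is one plausible way to hit an assert inside the loss.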
Hi, I am training the Panoptic model on my custom dataset, which I registered in the panoptic separated format. When I run training on a GPU, it throws "RuntimeError: CUDA error: device-side assert triggered". When I run training on the CPU, it completes successfully, but evaluation then fails with the same error.
Instructions To Reproduce the 🐛 Bug:
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-PanopticSegmentation/panoptic_fpn_R_101_3x.yaml"))
cfg.DATASETS.TRAIN = ("floor_Images_data6_separated",)
cfg.MODEL.DEVICE = "cpu"
cfg.DATASETS.TEST = ()
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-PanopticSegmentation/panoptic_fpn_R_101_3x.yaml")  # Let training initialize from model zoo
cfg.SOLVER.IMS_PER_BATCH = 2  # This is the real "batch size" commonly known to deep learning people
cfg.SOLVER.BASE_LR = 0.00025  # pick a good LR
cfg.SOLVER.MAX_ITER = 300  # 300 iterations seems good enough for this toy dataset; you will need to train longer for a practical dataset
cfg.SOLVER.STEPS = []  # do not decay learning rate
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128  # The "RoIHead batch size". 128 is faster, and good enough for this toy dataset (default: 512)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 189
cfg.MODEL.SEM_SEG_HEAD.NUM_CLASSES = 189  # only has one class (balloon). (see https://detectron2.readthedocs.io/tutorials/datasets.html#update-the-config-for-new-datasets)
# NOTE: this config means the number of classes, but a few popular unofficial tutorials incorrectly use num_classes+1 here.
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
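A frequent cause of a device-side assert during training is a label that falls outside the range the head was built for. The sketch below scans for such labels, assuming the dataset is in Detectron2's standard dict format (a list of records, each with an "annotations" list whose entries carry a "category_id"). The helper name and the toy records are hypothetical.

```python
def find_bad_category_ids(dataset_dicts, num_classes):
    """Return (file_name, category_id) pairs whose category_id is
    outside the valid range [0, num_classes)."""
    bad = []
    for record in dataset_dicts:
        for ann in record.get("annotations", []):
            cid = ann["category_id"]
            if not (0 <= cid < num_classes):
                bad.append((record.get("file_name"), cid))
    return bad


# Toy records standing in for the output of a dataset loading function.
toy_dicts = [
    {"file_name": "a.jpg", "annotations": [{"category_id": 0}, {"category_id": 1}]},
    {"file_name": "b.jpg", "annotations": [{"category_id": 189}]},
]
print(find_bad_category_ids(toy_dicts, num_classes=189))
```

With NUM_CLASSES = 189 the valid ids are 0 through 188, so the id 189 in the toy data is reported. Running a check like this on the real records before training is cheaper than decoding the CUDA error afterwards.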
Evaluation Code
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "/content/output/model_final.pth")
cfg.DATASETS.TEST = ("floor_Images_data6_separated")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7  # set the testing threshold for this model
cfg.MODEL.DEVICE = "cpu"
predictor = DefaultPredictor(cfg)
test_metadata = MetadataCatalog.get("floor_Images_data6_separated")
im = cv2.imread("/content/0E48EFF9-3A4F-4B64-BDCC-F6F6E180B3FB.jpeg")
panoptic_seg, segments_info = predictor(im)["panoptic_seg"]
v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = v.draw_panoptic_seg_predictions(panoptic_seg.to("cpu"), segments_info)
cv2_imshow(out.get_image()[:, :, ::-1])