facebookresearch / detectron2

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
https://detectron2.readthedocs.io/en/latest/
Apache License 2.0
30.41k stars 7.47k forks

Custom Dataset raises Cannot create polygons: Expect a list of polygons per instance #1553

Closed deepaksinghcv closed 4 years ago

deepaksinghcv commented 4 years ago

If you do not know the root cause of the problem, and wish someone to help you, please post according to this template:

Instructions To Reproduce the Issue:

  1. what code you wrote or what changes you made (git diff)
    NO CHANGES TO CODE BASE
  2. what exact command you run:
    python idd_trainer.py
  3. what you observed (including full logs):
    
    (dev) dksingh@gnode16:~/inseg3/detectron2$ python idd_trainer.py
    <Model architecture of MASK-RCNN GOES HERE>
    [06/07 17:18:55 d2.data.datasets.coco]: Loading /ssd_scratch/cvit/dksingh/idd/annotations/instancesonly_filtered_gtFine_train.json takes 1.08 seconds.
    WARNING [06/07 17:18:55 d2.data.datasets.coco]:
    Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.
    [06/07 17:18:55 d2.data.datasets.coco]: Loaded 6993 images in COCO format from /ssd_scratch/cvit/dksingh/idd/annotations/instancesonly_filtered_gtFine_train.json
    [06/07 17:18:56 d2.data.build]: Removed 49 images with no usable annotations. 6944 images left.
    [06/07 17:18:56 d2.data.build]: Distribution of instances among all 9 categories:
    |   category   | #instances   |   category   | #instances   |   category    | #instances   |
    |:------------:|:-------------|:------------:|:-------------|:-------------:|:-------------|
    |    person    | 23117        |    rider     | 27888        |  motorcycle   | 31852        |
    |   bicycle    | 749          | autorickshaw | 9085         |      car      | 34451        |
    |    truck     | 7474         |     bus      | 5192         | vehicle fal.. | 11411        |
    |    total     | 151219       |              |              |               |              |

    [06/07 17:18:56 d2.data.common]: Serializing 6944 elements to byte tensors and concatenating them all ...
    [06/07 17:18:56 d2.data.common]: Serialized dataset takes 45.41 MiB
    [06/07 17:18:56 d2.data.detection_utils]: TransformGens used in training: [ResizeShortestEdge(short_edge_length=(800, 832, 864, 896, 928, 960, 992, 1024), max_size=2048, sample_style='choice'), RandomFlip()]
    [06/07 17:18:56 d2.data.build]: Using training sampler TrainingSampler
    Unable to load 'roi_heads.box_predictor.cls_score.weight' to the model due to incompatible shapes: (81, 1024) in the checkpoint but (10, 1024) in the model!
    Unable to load 'roi_heads.box_predictor.cls_score.bias' to the model due to incompatible shapes: (81,) in the checkpoint but (10,) in the model!
    Unable to load 'roi_heads.box_predictor.bbox_pred.weight' to the model due to incompatible shapes: (320, 1024) in the checkpoint but (36, 1024) in the model!
    Unable to load 'roi_heads.box_predictor.bbox_pred.bias' to the model due to incompatible shapes: (320,) in the checkpoint but (36,) in the model!
    Unable to load 'roi_heads.mask_head.predictor.weight' to the model due to incompatible shapes: (80, 256, 1, 1) in the checkpoint but (9, 256, 1, 1) in the model!
    Unable to load 'roi_heads.mask_head.predictor.bias' to the model due to incompatible shapes: (80,) in the checkpoint but (9,) in the model!
    [06/07 17:18:57 d2.engine.train_loop]: Starting training from iteration 0
    ERROR [06/07 17:18:57 d2.engine.train_loop]: Exception during training:
    Traceback (most recent call last):
      File "/home/dksingh/inseg3/detectron2/detectron2/engine/train_loop.py", line 132, in train
        self.run_step()
      File "/home/dksingh/inseg3/detectron2/detectron2/engine/train_loop.py", line 209, in run_step
        data = next(self._data_loader_iter)
      File "/home/dksingh/inseg3/detectron2/detectron2/data/common.py", line 142, in __iter__
        for d in self.dataset:
      File "/home/dksingh/anaconda3/envs/dev/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
        data = self._next_data()
      File "/home/dksingh/anaconda3/envs/dev/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
        return self._process_data(data)
      File "/home/dksingh/anaconda3/envs/dev/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
        data.reraise()
      File "/home/dksingh/anaconda3/envs/dev/lib/python3.7/site-packages/torch/_utils.py", line 394, in reraise
        raise self.exc_type(msg)
    AssertionError: Caught AssertionError in DataLoader worker process 0.
    Original Traceback (most recent call last):
      File "/home/dksingh/anaconda3/envs/dev/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
        data = fetcher.fetch(index)
      File "/home/dksingh/anaconda3/envs/dev/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/home/dksingh/anaconda3/envs/dev/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/home/dksingh/inseg3/detectron2/detectron2/data/common.py", line 41, in __getitem__
        data = self._map_func(self._dataset[cur_idx])
      File "/home/dksingh/inseg3/detectron2/detectron2/utils/serialize.py", line 23, in __call__
        return self._obj(*args, **kwargs)
      File "/home/dksingh/inseg3/detectron2/detectron2/data/dataset_mapper.py", line 139, in __call__
        annos, image_shape, mask_format=self.mask_format
      File "/home/dksingh/inseg3/detectron2/detectron2/data/detection_utils.py", line 315, in annotations_to_instances
        masks = PolygonMasks(segms)
      File "/home/dksingh/inseg3/detectron2/detectron2/structures/masks.py", line 271, in __init__
        process_polygons(polygons_per_instance) for polygons_per_instance in polygons
      File "/home/dksingh/inseg3/detectron2/detectron2/structures/masks.py", line 271, in <listcomp>
        process_polygons(polygons_per_instance) for polygons_per_instance in polygons
      File "/home/dksingh/inseg3/detectron2/detectron2/structures/masks.py", line 262, in process_polygons
        "Got '{}' instead.".format(type(polygons_per_instance))
    AssertionError: Cannot create polygons: Expect a list of polygons per instance. Got '<class 'numpy.ndarray'>' instead.

    [06/07 17:18:57 d2.engine.hooks]: Total training time: 0:00:00 (0:00:00 on hooks)
    Traceback (most recent call last):
      File "idd_trainer.py", line 33, in <module>
        trainer.train()
      File "/home/dksingh/inseg3/detectron2/detectron2/engine/defaults.py", line 402, in train
        super().train(self.start_iter, self.max_iter)
      File "/home/dksingh/inseg3/detectron2/detectron2/engine/train_loop.py", line 132, in train
        self.run_step()
      File "/home/dksingh/inseg3/detectron2/detectron2/engine/train_loop.py", line 209, in run_step
        data = next(self._data_loader_iter)
      File "/home/dksingh/inseg3/detectron2/detectron2/data/common.py", line 142, in __iter__
        for d in self.dataset:
      File "/home/dksingh/anaconda3/envs/dev/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
        data = self._next_data()
      File "/home/dksingh/anaconda3/envs/dev/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
        return self._process_data(data)
      File "/home/dksingh/anaconda3/envs/dev/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
        data.reraise()
      File "/home/dksingh/anaconda3/envs/dev/lib/python3.7/site-packages/torch/_utils.py", line 394, in reraise
        raise self.exc_type(msg)
    AssertionError: Caught AssertionError in DataLoader worker process 0.
    Original Traceback (most recent call last):
      File "/home/dksingh/anaconda3/envs/dev/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
        data = fetcher.fetch(index)
      File "/home/dksingh/anaconda3/envs/dev/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/home/dksingh/anaconda3/envs/dev/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/home/dksingh/inseg3/detectron2/detectron2/data/common.py", line 41, in __getitem__
        data = self._map_func(self._dataset[cur_idx])
      File "/home/dksingh/inseg3/detectron2/detectron2/utils/serialize.py", line 23, in __call__
        return self._obj(*args, **kwargs)
      File "/home/dksingh/inseg3/detectron2/detectron2/data/dataset_mapper.py", line 139, in __call__
        annos, image_shape, mask_format=self.mask_format
      File "/home/dksingh/inseg3/detectron2/detectron2/data/detection_utils.py", line 315, in annotations_to_instances
        masks = PolygonMasks(segms)
      File "/home/dksingh/inseg3/detectron2/detectron2/structures/masks.py", line 271, in __init__
        process_polygons(polygons_per_instance) for polygons_per_instance in polygons
      File "/home/dksingh/inseg3/detectron2/detectron2/structures/masks.py", line 271, in <listcomp>
        process_polygons(polygons_per_instance) for polygons_per_instance in polygons
      File "/home/dksingh/inseg3/detectron2/detectron2/structures/masks.py", line 262, in process_polygons
        "Got '{}' instead.".format(type(polygons_per_instance))
    AssertionError: Cannot create polygons: Expect a list of polygons per instance. Got '<class 'numpy.ndarray'>' instead.

4. please simplify the steps as much as possible so they do not require additional resources to
     run, such as a private dataset.

1. I went through https://detectron2.readthedocs.io/tutorials/datasets.html to register a custom dataset.
2. My custom dataset is in COCO format with 9 classes.
3. I created a file called idd_trainer.py in detectron2's main directory, with the following content:
```python
import os

from detectron2.data.datasets import register_coco_instances
register_coco_instances("idd_dataset_train", {}, "/ssd_scratch/cvit/dksingh/idd/annotations/instancesonly_filtered_gtFine_train.json", "/ssd_scratch/cvit/dksingh/idd/leftImg8bit/train")
register_coco_instances("idd_dataset_val", {}, "/ssd_scratch/cvit/dksingh/idd/annotations/instancesonly_filtered_gtFine_val.json", "/ssd_scratch/cvit/dksingh/idd/leftImg8bit/val")

from detectron2.engine import DefaultTrainer
from detectron2.config import get_cfg
from detectron2 import model_zoo

cfg = get_cfg()

cfg.merge_from_file("/home/dksingh/inseg3/detectron2/configs/IDD/mask_rcnn_R_50_FPN.yaml")
cfg.OUTPUT_DIR = "/home/dksingh/inseg3/detectron2/tools/idd_output"
# cfg.merge_from_file(model_zoo.get_config_file("/home/dksingh/inseg/detectron2/configs/IDD/mask_rcnn_R_50_FPN.yaml"))

cfg.DATALOADER.NUM_WORKERS = 2
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg) 
trainer.resume_or_load(resume=False)
trainer.train()
```

The config file for the model is as follows. It is similar to Cityscapes/mask_rcnn_R_50_FPN.yaml; I have only changed NUM_CLASSES, DATASETS.TRAIN, and DATASETS.TEST.

```yaml
_BASE_: "../Base-RCNN-FPN.yaml"
MODEL:
  # WEIGHTS: "detectron2://ImageNetPretrained/MSRA/R-50.pkl"
  # For better, more stable performance initialize from COCO
  WEIGHTS: "detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl"
  MASK_ON: True
  ROI_HEADS:
    NUM_CLASSES: 9
# This is similar to the setting used in Mask R-CNN paper, Appendix A
# But there are some differences, e.g., we did not initialize the output
# layer using the corresponding classes from COCO
INPUT:
  MIN_SIZE_TRAIN: (800, 832, 864, 896, 928, 960, 992, 1024)
  MIN_SIZE_TRAIN_SAMPLING: "choice"
  MIN_SIZE_TEST: 1024
  MAX_SIZE_TRAIN: 2048
  MAX_SIZE_TEST: 2048
DATASETS:
  TRAIN: ("idd_dataset_train",)
  TEST: ("idd_dataset_val",)
SOLVER:
  BASE_LR: 0.01
  STEPS: (18000,)
  MAX_ITER: 24000
  IMS_PER_BATCH: 8
TEST:
  EVAL_PERIOD: 8000
```
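
The "Unable to load ..." messages in the log look like a consequence of NUM_CLASSES being 9 here while the COCO checkpoint was trained with 80 classes, and are probably unrelated to the polygon assertion. A minimal sketch of the arithmetic, assuming the classification layer has one extra background output and the box layer predicts 4 values per class (this reading is inferred from the shapes in the log, not stated in the thread); the helper name and its default dimensions are hypothetical:

```python
def expected_head_shapes(num_classes, feature_dim=1024, mask_channels=256):
    """Weight shapes implied by a given class count (hypothetical helper).

    feature_dim and mask_channels default to the values printed in the log.
    """
    return {
        # One score per class plus one for background (assumed).
        "cls_score.weight": (num_classes + 1, feature_dim),
        # Four box regression values per class (assumed class-specific).
        "bbox_pred.weight": (4 * num_classes, feature_dim),
        # One 1x1 mask output channel per class.
        "mask_head.predictor.weight": (num_classes, mask_channels, 1, 1),
    }


checkpoint_shapes = expected_head_shapes(80)  # COCO checkpoint
model_shapes = expected_head_shapes(9)        # this config

for name in checkpoint_shapes:
    print(name, checkpoint_shapes[name], "in the checkpoint but", model_shapes[name], "in the model")
```

Under these assumptions the computed shapes match the six warnings, which suggests those layers are simply left at their fresh initialization when fine-tuning on a different label set.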

Expected behavior:

I expected the code to crash since I haven't specified any GPUs, but it crashes before training even starts. Could you kindly tell me the right procedure for training, or where the mistake is?

Environment:

----------------------  -------------------------------------------------------------------------------
sys.platform            linux
Python                  3.7.3 | packaged by conda-forge | (default, Jul  1 2019, 21:52:21) [GCC 7.3.0]
numpy                   1.16.4
detectron2              0.1.3 @/home/dksingh/inseg3/detectron2/detectron2
Compiler                GCC 5.5
CUDA compiler           CUDA 10.1
detectron2 arch flags   sm_61
DETECTRON2_ENV_MODULE   <not set>
PyTorch                 1.4.0 @/home/dksingh/anaconda3/envs/dev/lib/python3.7/site-packages/torch
PyTorch debug build     False
GPU available           True
GPU 0,1,2,3             GeForce GTX 1080 Ti
CUDA_HOME               /opt/cuda
Pillow                  5.4.1
torchvision             0.5.0 @/home/dksingh/anaconda3/envs/dev/lib/python3.7/site-packages/torchvision
torchvision arch flags  sm_35, sm_50, sm_60, sm_70, sm_75
fvcore                  0.1.1.post20200607
cv2                     4.1.0
----------------------  -------------------------------------------------------------------------------
PyTorch built with:
  - GCC 7.3
  - Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CUDA Runtime 10.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  - CuDNN 7.6.3
  - Magma 2.5.1
  - Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

ppwwyyxx commented 4 years ago

If your mask annotations are not polygons, but binary masks, cfg.INPUT.MASK_FORMAT should be "bitmask" instead of "polygon". We'll improve the documentation about it.
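
Which format an annotation file uses can be checked before training. The following is a minimal sketch, assuming the usual COCO convention that a polygon segmentation is a list of flat coordinate lists and a binary mask is stored as a run-length-encoded dict; the helper names are hypothetical and not part of detectron2:

```python
from collections import Counter


def segmentation_kind(segmentation):
    """Classify one COCO `segmentation` field (hypothetical helper)."""
    if isinstance(segmentation, dict):
        # Assumed COCO convention: RLE masks are dicts with "size" and "counts".
        return "rle"
    if isinstance(segmentation, list):
        # Assumed COCO convention: polygons are [[x0, y0, x1, y1, ...], ...].
        return "polygon"
    return "unknown"


def summarize_segmentations(coco):
    """Count segmentation kinds in an already-loaded COCO-style dict."""
    return Counter(
        segmentation_kind(ann.get("segmentation"))
        for ann in coco.get("annotations", [])
    )


def suggest_mask_format(counts):
    """Suggest a MASK_FORMAT value: any RLE mask calls for "bitmask"."""
    return "bitmask" if counts.get("rle", 0) > 0 else "polygon"


# Tiny in-memory stand-in for an annotation file; a real file would be
# loaded with json.load() first.
sample = {
    "annotations": [
        {"id": 1, "segmentation": [[0, 0, 10, 0, 10, 10]]},
        {"id": 2, "segmentation": {"size": [4, 4], "counts": [0, 16]}},
    ]
}
counts = summarize_segmentations(sample)
print(dict(counts), suggest_mask_format(counts))
```

If the suggestion is "bitmask", cfg.INPUT.MASK_FORMAT would be set to "bitmask" after cfg.merge_from_file(...) and before the trainer is constructed.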

deepaksinghcv commented 4 years ago

I changed to "bitmask". It worked. Thank you.

OzgurMertEmir commented 1 year ago

> If your mask annotations are not polygons, but binary masks, cfg.INPUT.MASK_FORMAT should be "bitmask" instead of "polygon". We'll improve the documentation about it.

I have the same issue but changing the mask format didn't fix it. Is there anything else I can do to fix it?