No usable annotations with COCO format, problem with "iscrowd" attribute

matejfric commented 1 year ago

I'm trying to apply instance segmentation to images of wood piles:

[Image: Screenshot from 2023-07-11 12-54-19]

The dataset I'm using was annotated in CVAT and then exported to COCO 1.0 format. The problem is that in most cases the "iscrowd" attribute is set to 1 (true), and when no annotation has "iscrowd" set to 0, the training fails with an error. Please see the logs and the Google Colab notebook in the sections below.

Note that when some annotations in the export have "iscrowd": 0, the training finishes successfully, but the number of instances reported in the training log equals the number of annotations with "iscrowd" set to 0.

So the question is: Is this the intended behaviour? Can I do something about it? I would be grateful for any suggestions. Given the nature of my data and the difficulty of annotating it, I would like to train the model on as many instances as possible.
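One possible workaround, sketched here under the assumption that every annotation flagged as a crowd actually outlines a single log, is to rewrite the exported JSON so that "iscrowd" is 0 before registering the dataset. The helper names `clear_iscrowd` and `rewrite_file` are hypothetical; the code relies only on the standard COCO keys "annotations" and "iscrowd". It does not change how the segmentation itself is encoded.

```python
import json


def clear_iscrowd(coco: dict) -> int:
    """Set "iscrowd" to 0 on every annotation; return how many were changed."""
    changed = 0
    for ann in coco.get("annotations", []):
        if ann.get("iscrowd", 0) != 0:
            ann["iscrowd"] = 0
            changed += 1
    return changed


def rewrite_file(src_path: str, dst_path: str) -> int:
    """Load a COCO JSON file, clear the crowd flags, and save a copy."""
    with open(src_path) as f:
        coco = json.load(f)
    changed = clear_iscrowd(coco)
    with open(dst_path, "w") as f:
        json.dump(coco, f)
    return changed
```

For example, `rewrite_file("annotations/sample_annotation.json", "annotations/sample_annotation_nocrowd.json")` would write a modified copy (the output path is made up for illustration) and leave the original export untouched.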

Instructions To Reproduce the Issue:

Please see this notebook on Google Colab

Logs or other relevant observations:

This is the output when there are no cases with the "iscrowd" parameter set to 0.

[07/12 14:25:31 d2.data.datasets.coco]: Loaded 1 images in COCO format from annotations/sample_annotation.json
[07/12 14:25:31 d2.data.build]: Removed 1 images with no usable annotations. 0 images left.
[07/12 14:25:31 d2.data.build]: Distribution of instances among all 1 categories:
|  category  | #instances   |
|:----------:|:-------------|
|    log     | 0            |
|            |              |
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
[<ipython-input-8-a39d262f9c11>](https://localhost:8080/#) in <cell line: 25>()
     23 
     24 os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
---> 25 trainer = DefaultTrainer(cfg)
     26 trainer.resume_or_load(resume=False)
     27 trainer.train()

5 frames
[/content/detectron2/detectron2/data/build.py](https://localhost:8080/#) in get_detection_dataset_dicts(names, filter_empty, min_keypoints, proposal_files, check_consistency)
    277             pass
    278 
--> 279     assert len(dataset_dicts), "No valid data found in {}.".format(",".join(names))
    280     return dataset_dicts
    281 

AssertionError: No valid data found in oli_train.
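The "Removed 1 images with no usable annotations" line is consistent with the loader discarding images whose annotations are all flagged as crowds. A minimal sketch for checking this before training, using only the standard library, is shown here. The helper name `images_with_only_crowd` is hypothetical, and the check only approximates detectron2's own filtering.

```python
from collections import defaultdict


def images_with_only_crowd(coco: dict) -> list:
    """Return ids of images that have no annotation with iscrowd == 0.

    Images without any annotation at all are included as well.
    """
    has_non_crowd = defaultdict(bool)
    for ann in coco.get("annotations", []):
        if ann.get("iscrowd", 0) == 0:
            has_non_crowd[ann["image_id"]] = True
    return [
        img["id"]
        for img in coco.get("images", [])
        if not has_non_crowd[img["id"]]
    ]
```

If this returns the id of every image in the file, an empty dataset after filtering would be the expected outcome.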

If I manually change at least one of the "iscrowd" attributes to 0 (false), another error is raised:

[07/12 14:30:49 d2.data.datasets.coco]: Loaded 1 images in COCO format from annotations/sample_annotation.json
[07/12 14:30:49 d2.data.build]: Removed 0 images with no usable annotations. 1 images left.
[07/12 14:30:49 d2.data.build]: Distribution of instances among all 1 categories:
|  category  | #instances   |
|:----------:|:-------------|
|    log     | 1            |
|            |              |
[07/12 14:30:49 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in training: [ResizeShortestEdge(short_edge_length=(640, 672, 704, 736, 768, 800), max_size=1333, sample_style='choice'), RandomFlip()]
[07/12 14:30:49 d2.data.build]: Using training sampler TrainingSampler
[07/12 14:30:49 d2.data.common]: Serializing the dataset using: <class 'detectron2.data.common._TorchSerializedList'>
[07/12 14:30:49 d2.data.common]: Serializing 1 elements to byte tensors and concatenating them all ...
[07/12 14:30:49 d2.data.common]: Serialized dataset takes 0.06 MiB
[07/12 14:30:49 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl ...
model_final_f10217.pkl: 178MB [00:00, 209MB/s]                           
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.cls_score.weight' to the model due to incompatible shapes: (81, 1024) in the checkpoint but (2, 1024) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.cls_score.bias' to the model due to incompatible shapes: (81,) in the checkpoint but (2,) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.bbox_pred.weight' to the model due to incompatible shapes: (320, 1024) in the checkpoint but (4, 1024) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.bbox_pred.bias' to the model due to incompatible shapes: (320,) in the checkpoint but (4,) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.mask_head.predictor.weight' to the model due to incompatible shapes: (80, 256, 1, 1) in the checkpoint but (1, 256, 1, 1) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.mask_head.predictor.bias' to the model due to incompatible shapes: (80,) in the checkpoint but (1,) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Some model parameters or buffers are not found in the checkpoint:
roi_heads.box_predictor.bbox_pred.{bias, weight}
roi_heads.box_predictor.cls_score.{bias, weight}
roi_heads.mask_head.predictor.{bias, weight}
[07/12 14:30:50 d2.engine.train_loop]: Starting training from iteration 0
ERROR [07/12 14:30:51 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
  File "/content/detectron2/detectron2/engine/train_loop.py", line 155, in train
    self.run_step()
  File "/content/detectron2/detectron2/engine/defaults.py", line 494, in run_step
    self._trainer.run_step()
  File "/content/detectron2/detectron2/engine/train_loop.py", line 297, in run_step
    data = next(self._data_loader_iter)
  File "/content/detectron2/detectron2/data/common.py", line 291, in __iter__
    for d in self.dataset:
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 633, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1345, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1371, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.10/dist-packages/torch/_utils.py", line 644, in reraise
    raise exception
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/content/detectron2/detectron2/data/detection_utils.py", line 416, in annotations_to_instances
    masks = PolygonMasks(segms)
  File "/content/detectron2/detectron2/structures/masks.py", line 309, in __init__
    self.polygons: List[List[np.ndarray]] = [
  File "/content/detectron2/detectron2/structures/masks.py", line 310, in <listcomp>
    process_polygons(polygons_per_instance) for polygons_per_instance in polygons
  File "/content/detectron2/detectron2/structures/masks.py", line 298, in process_polygons
    raise ValueError(
ValueError: Cannot create polygons: Expect a list of polygons per instance. Got '<class 'numpy.ndarray'>' instead.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 32, in fetch
    data.append(next(self.dataset_iter))
  File "/content/detectron2/detectron2/data/common.py", line 258, in __iter__
    yield self.dataset[idx]
  File "/content/detectron2/detectron2/data/common.py", line 95, in __getitem__
    data = self._map_func(self._dataset[cur_idx])
  File "/content/detectron2/detectron2/utils/serialize.py", line 26, in __call__
    return self._obj(*args, **kwargs)
  File "/content/detectron2/detectron2/data/dataset_mapper.py", line 189, in __call__
    self._transform_annotations(dataset_dict, transforms, image_shape)
  File "/content/detectron2/detectron2/data/dataset_mapper.py", line 131, in _transform_annotations
    instances = utils.annotations_to_instances(
  File "/content/detectron2/detectron2/data/detection_utils.py", line 418, in annotations_to_instances
    raise ValueError(
ValueError: Failed to use mask_format=='polygon' from the given annotations!

[07/12 14:30:51 d2.engine.hooks]: Total training time: 0:00:01 (0:00:00 on hooks)
[07/12 14:30:51 d2.utils.events]:  iter: 0       lr: N/A  max_mem: 345M
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
[<ipython-input-10-a39d262f9c11>](https://localhost:8080/#) in <cell line: 27>()
     25 trainer = DefaultTrainer(cfg)
     26 trainer.resume_or_load(resume=False)
---> 27 trainer.train()

8 frames
[/usr/local/lib/python3.10/dist-packages/torch/_utils.py](https://localhost:8080/#) in reraise(self)
    642             # instantiate since we don't know how to
    643             raise RuntimeError(msg) from None
--> 644         raise exception
    645 
    646 

ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/content/detectron2/detectron2/data/detection_utils.py", line 416, in annotations_to_instances
    masks = PolygonMasks(segms)
  File "/content/detectron2/detectron2/structures/masks.py", line 309, in __init__
    self.polygons: List[List[np.ndarray]] = [
  File "/content/detectron2/detectron2/structures/masks.py", line 310, in <listcomp>
    process_polygons(polygons_per_instance) for polygons_per_instance in polygons
  File "/content/detectron2/detectron2/structures/masks.py", line 298, in process_polygons
    raise ValueError(
ValueError: Cannot create polygons: Expect a list of polygons per instance. Got '<class 'numpy.ndarray'>' instead.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 32, in fetch
    data.append(next(self.dataset_iter))
  File "/content/detectron2/detectron2/data/common.py", line 258, in __iter__
    yield self.dataset[idx]
  File "/content/detectron2/detectron2/data/common.py", line 95, in __getitem__
    data = self._map_func(self._dataset[cur_idx])
  File "/content/detectron2/detectron2/utils/serialize.py", line 26, in __call__
    return self._obj(*args, **kwargs)
  File "/content/detectron2/detectron2/data/dataset_mapper.py", line 189, in __call__
    self._transform_annotations(dataset_dict, transforms, image_shape)
  File "/content/detectron2/detectron2/data/dataset_mapper.py", line 131, in _transform_annotations
    instances = utils.annotations_to_instances(
  File "/content/detectron2/detectron2/data/detection_utils.py", line 418, in annotations_to_instances
    raise ValueError(
ValueError: Failed to use mask_format=='polygon' from the given annotations!
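The final message suggests the data loader was asked to build polygon masks from something that is not a list of polygons. In COCO, a polygon segmentation is a list of coordinate lists, while an RLE segmentation is a dict with "counts" and "size". A plausible cause (an assumption, not confirmed in this thread) is that the crowd annotations were exported as RLE. The sketch below counts the encodings in a loaded annotation file; `count_segmentation_types` is a hypothetical helper.

```python
from collections import Counter


def count_segmentation_types(coco: dict) -> Counter:
    """Count annotations by segmentation encoding: polygon, rle, or missing."""
    counts = Counter()
    for ann in coco.get("annotations", []):
        segm = ann.get("segmentation")
        if isinstance(segm, dict):
            # RLE: {"counts": ..., "size": [height, width]}
            counts["rle"] += 1
        elif isinstance(segm, list) and segm:
            # Polygon: [[x1, y1, x2, y2, ...], ...]
            counts["polygon"] += 1
        else:
            counts["missing"] += 1
    return counts
```

If RLE entries show up, setting `cfg.INPUT.MASK_FORMAT = "bitmask"` before constructing the trainer may be worth trying, since the default value is "polygon".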

Environment:

-------------------------------  -----------------------------------------------------------------
sys.platform                     linux
Python                           3.10.12 (main, Jun  7 2023, 12:45:35) [GCC 9.4.0]
numpy                            1.22.4
detectron2                       0.6 @/content/detectron2/detectron2
detectron2._C                    not built correctly: No module named 'detectron2._C'
Compiler ($CXX)                  c++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
CUDA compiler                    Build cuda_11.8.r11.8/compiler.31833905_0
DETECTRON2_ENV_MODULE            <not set>
PyTorch                          2.0.1+cu118 @/usr/local/lib/python3.10/dist-packages/torch
PyTorch debug build              False
torch._C._GLIBCXX_USE_CXX11_ABI  False
GPU available                    Yes
GPU 0                            Tesla T4 (arch=7.5)
Driver version                   525.85.12
CUDA_HOME                        /usr/local/cuda
Pillow                           8.4.0
torchvision                      0.15.2+cu118 @/usr/local/lib/python3.10/dist-packages/torchvision
torchvision arch flags           3.5, 5.0, 6.0, 7.0, 7.5, 8.0, 8.6
fvcore                           0.1.5.post20221221
iopath                           0.1.9
cv2                              4.7.0
-------------------------------  -----------------------------------------------------------------
PyTorch built with:
  - GCC 9.3
  - C++ Version: 201703
  - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.8
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  - CuDNN 8.7
  - Magma 2.6.1
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.8, CUDNN_VERSION=8.7.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.0.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

alecGraves commented 10 months ago

Similarly, I am unable to load CVAT-exported "COCO 1.0"-formatted datasets with detectron2's COCO data loader. There is no indication of why the annotations are unusable; the images are just dropped without explanation. The data looks fine when loading it manually using pycocotools.

[12/05 01:13:47 d2.data.datasets.coco]: Loaded 100 images in COCO format from dataset/annotations/instances_default.json
[12/05 01:13:47 d2.data.build]: Removed 100 images with no usable annotations. 0 images left.

Pragalbhv commented 9 months ago

I am facing the same issue; has there been a resolution yet?

annaszczuka commented 4 months ago

Same here

14790897 commented 2 months ago

same

faileon commented 1 month ago

I am having the same issue. Has anyone figured out what is wrong with the COCO annotations exported from CVAT?
