open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark
https://mmdetection.readthedocs.io
Apache License 2.0

Visualization of COCO panoptic segmentation with browse_dataset.py creates pixel off-set in overlay segmentation masks #11490

Open FranklynJey opened 8 months ago

FranklynJey commented 8 months ago

Describe the bug

When visualizing the COCO panoptic dataset with browse_dataset.py, the overlay results contain errors. More specifically: while the RGB image is displayed correctly, the segmentation masks have a horizontal pixel offset. In some images the offset is more severe than in others.

We also tested it with a custom dataset and had the impression that the error becomes more severe when high-resolution images are used. For reproducibility, we created the examples using plain COCO panoptic and unmodified OpenMMLab scripts.

Reproduction

Projection using browse_dataset.py

We ran browse_dataset.py with the coco_panoptic.py config file. Both scripts were unchanged except for the path to the dataset, and we set the --output-dir flag. With the default settings, both examples appear within the first 30 rendered results. Their image IDs are 293802 and 483108.

The command

python tools/analysis_tools/browse_dataset.py configs/_base_/datasets/coco_panoptic.py --output-dir /tmp/coco
gives the following results:
[Images: 000000293802 (image ID 293802), 000000483108 (image ID 483108)]

For both examples, the horizontal pixel offsets between different segmentation masks are visible as white regions, e.g. the white region between the legs of the skater. These errors are not part of the actual dataset. To show this, we use the visualization script from the COCO repo.

Reference projection using original COCO scripts

For the reference projection, we used the original COCO panoptic visualization script. See the panopticapi repo: visualization.py (https://github.com/cocodataset/panopticapi/blob/master/visualization.py). To access the two examples directly, you can add the following lines in place of the randomly picked example:

img_id1 = 293802
img_id2 = 483108
ann1 = None
ann2 = None
for curr_ann in coco_d['annotations']:
    if curr_ann['image_id'] == img_id1:
        ann1 = curr_ann
    if curr_ann['image_id'] == img_id2:
        ann2 = curr_ann
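For anyone who wants to try the lookup outside visualization.py, here is a minimal self-contained sketch of the same idea. It assumes `coco_d` is the loaded panoptic JSON with an `annotations` list whose entries carry an `image_id`. The helper name `find_annotations` and the tiny stand-in dictionary are hypothetical and only there to make the snippet runnable.

```python
def find_annotations(coco_d, image_ids):
    """Return {image_id: annotation} for the requested image IDs.

    Hypothetical helper; it mirrors the loop above for any number of IDs.
    """
    wanted = set(image_ids)
    found = {}
    for ann in coco_d['annotations']:
        if ann['image_id'] in wanted:
            found[ann['image_id']] = ann
    return found


# Tiny stand-in for the loaded panoptic JSON (illustrative values only).
coco_d = {'annotations': [
    {'image_id': 293802, 'file_name': '000000293802.png', 'segments_info': []},
    {'image_id': 483108, 'file_name': '000000483108.png', 'segments_info': []},
    {'image_id': 1, 'file_name': '000000000001.png', 'segments_info': []},
]}

anns = find_annotations(coco_d, [293802, 483108])
ann1, ann2 = anns[293802], anns[483108]
```

With the real panoptic_train2017.json, `coco_d` would instead come from `json.load`, as in the original script.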

The rendering from the original COCO visualizer shows that the dataset itself does not contain the segmentation artifacts: [Image: coco_panoptic_visu] (All samples are taken from and belong to the COCO dataset, https://cocodataset.org/#home)
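A check that does not depend on either visualizer is to decode the panoptic PNG into a segment ID map and compare its height and width with the RGB image. The sketch below assumes the COCO panoptic PNG encoding used by panopticapi (id = R + 256 * G + 256^2 * B). The function `check_alignment` is a hypothetical helper, and the array is synthetic. This is only a sanity check on the data, not a diagnosis of what browse_dataset.py does.

```python
import numpy as np


def rgb2id(color):
    """Decode a panoptic RGB array of shape (H, W, 3) into segment IDs (H, W)."""
    color = np.asarray(color, dtype=np.uint32)
    return color[..., 0] + 256 * color[..., 1] + 256 * 256 * color[..., 2]


def check_alignment(image_hw, pan_png_rgb):
    """Hypothetical helper: report whether the mask matches the image size."""
    seg_ids = rgb2id(pan_png_rgb)
    same_shape = seg_ids.shape == tuple(image_hw)
    return same_shape, np.unique(seg_ids)


# Synthetic 2x3 "panoptic PNG" with a single labelled pixel.
pan = np.zeros((2, 3, 3), dtype=np.uint8)
pan[0, 0] = (1, 2, 0)  # encodes segment id 1 + 256 * 2 = 513

same_shape, ids = check_alignment((2, 3), pan)
print(same_shape, ids.tolist())
```

On real data, `pan` would be the panoptic PNG loaded as an RGB array and `image_hw` the height and width of the corresponding JPEG.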

Further Questions

  1. Did you make any modifications on the code or config? Did you understand what you have modified?
    No changes

  2. What dataset did you use?
    COCO panoptic, specifically panoptic_train2017.json

Environment

Please run python mmdet/utils/collect_env.py to collect necessary environment information and paste it here.

sys.platform: linux
Python: 3.8.18 (default, Sep 11 2023, 13:40:15) [GCC 11.2.0]
CUDA available: False
numpy_random_seed: 2147483648
GCC: gcc (Ubuntu 10.5.0-1ubuntu1~20.04) 10.5.0
PyTorch: 2.1.2
PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201703
  - Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v3.1.1 (Git Hash 64f6bcbcbab628e96f33a62c3e975f8535a7bde4)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-invalid-partial-specialization -Wno-unused-private-field -Wno-aligned-allocation-unavailable -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.1.2, USE_CUDA=0, USE_CUDNN=OFF, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, 

TorchVision: 0.16.2
OpenCV: 4.9.0
MMEngine: 0.10.2
MMDetection: 3.3.0+22f90df

FranklynJey commented 7 months ago

@jbwang1997 do you have a guess as to what causes the projection error? It would also be interesting to know whether it is a general error that also appears during training or whether it appears only in browse_dataset.py. I might look into that in the next few days :)