thodan / bop_toolkit

A Python toolkit of the BOP benchmark for 6D object pose estimation.
http://bop.felk.cvut.cz
MIT License

MegaPose-GSO masks are flipped for an object on the left-top corner #103

Open sh8 opened 9 months ago

sh8 commented 9 months ago

I found that the RLE instance masks are flipped (inverted) for truncated objects in the top-left corner of an image (e.g., 000002_000012, 011864_000019). You can reproduce this easily by decoding the masks of these frames with the RLE decoding function in BlenderProc. A quick fix could be to detect such objects and prepend a 0 to the RLE counts to flip the mask back.
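To illustrate the suspected failure mode, here is a minimal, self-contained sketch. It is not the bop_toolkit or BlenderProc code, and `encode_rle` / `decode_rle` are hypothetical helper names. It assumes uncompressed COCO-style RLE, where the counts alternate between runs of background and foreground pixels in column-major order, starting with background. Under that assumption, dropping the leading 0 of a mask whose first pixel is foreground decodes to the inverted mask.

```python
import numpy as np


def encode_rle(mask):
    """Encode a 2D binary mask as uncompressed COCO-style RLE (column-major)."""
    flat = np.asarray(mask, dtype=bool).ravel(order="F")
    counts = []
    current = False  # runs alternate, starting with background
    run = 0
    for value in flat:
        if value == current:
            run += 1
        else:
            counts.append(run)  # appends 0 first if pixel (0, 0) is foreground
            current = not current
            run = 1
    counts.append(run)
    return {"counts": counts, "size": list(mask.shape)}


def decode_rle(rle):
    """Decode uncompressed COCO-style RLE into a 2D boolean mask."""
    counts = rle["counts"]
    # Even-indexed runs are background, odd-indexed runs are foreground.
    values = np.arange(len(counts)) % 2 == 1
    flat = np.repeat(values, counts)
    return flat.reshape(rle["size"], order="F")


# An object truncated by the top-left corner of a 4x5 image.
mask = np.zeros((4, 5), dtype=bool)
mask[:2, :2] = True

good = encode_rle(mask)
# Simulate the suspected defect: the leading 0 count is missing.
bad = {"counts": good["counts"][1:], "size": good["size"]}

assert np.array_equal(decode_rle(good), mask)
assert np.array_equal(decode_rle(bad), ~mask)  # decodes to the inverted mask
```

This also shows why prepending a 0 to the counts restores the intended mask in this situation.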

I see these flipped masks in roughly 8-10% of all frames. Could you confirm and fix this when you have time?

Thank you!

Flipped mask (011864_000019): [image: 011864_000019_flip]

Correct mask (011864_000019): [image: 011864_000019]

MartinSmeyer commented 9 months ago

Hey @sh8,

thanks a lot for raising this issue. I just did a quick test on the functions themselves, and they yield the expected result (the bop_toolkit and BlenderProc functions are equivalent):

Python 3.9.18 | packaged by conda-forge | (main, Aug 30 2023, 03:49:32) 
[GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from bop_toolkit_lib import pycoco_utils
>>> import numpy as np
>>> test = np.eye(10)
>>> test2 = test.copy()
>>> test2[0,0] = 0
>>> test_rle = pycoco_utils.binary_mask_to_rle(test)
{'counts': [0, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1], 'size': [10, 10]}
>>> test2_rle = pycoco_utils.binary_mask_to_rle(test2)
{'counts': [11, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1], 'size': [10, 10]}
>>> test_bin = pycoco_utils.rle_to_binary_mask(test_rle)
array([[ True, False, False, False, False, False, False, False, False,
        False],
       [False,  True, False, False, False, False, False, False, False,
        False],
       [False, False,  True, False, False, False, False, False, False,
        False],
       [False, False, False,  True, False, False, False, False, False,
        False],
       [False, False, False, False,  True, False, False, False, False,
        False],
       [False, False, False, False, False,  True, False, False, False,
        False],
       [False, False, False, False, False, False,  True, False, False,
        False],
       [False, False, False, False, False, False, False,  True, False,
        False],
       [False, False, False, False, False, False, False, False,  True,
        False],
       [False, False, False, False, False, False, False, False, False,
         True]])
>>> test2_bin = pycoco_utils.rle_to_binary_mask(test2_rle)
array([[False, False, False, False, False, False, False, False, False,
        False],
       [False,  True, False, False, False, False, False, False, False,
        False],
       [False, False,  True, False, False, False, False, False, False,
        False],
       [False, False, False,  True, False, False, False, False, False,
        False],
       [False, False, False, False,  True, False, False, False, False,
        False],
       [False, False, False, False, False,  True, False, False, False,
        False],
       [False, False, False, False, False, False,  True, False, False,
        False],
       [False, False, False, False, False, False, False,  True, False,
        False],
       [False, False, False, False, False, False, False, False,  True,
        False],
       [False, False, False, False, False, False, False, False, False,
         True]])

This is correct, so I am wondering if and where it went wrong for MegaPose-GSO (I haven't checked the data). The logic doesn't seem to have been changed recently.

@ylabbe Do you know which functions / versions you used to produce the RLE encoding in the MegaPose-GSO dataset? Could you check whether they are correct?

jcorsetti commented 5 months ago

Hi, I think I found the same problem on some instances of MegaPose-Shapenet. Here are the RGB images obtained by masking the instance 000118_000005 with the mask of the 9th object:

[image: standard]

Inverting the mask instead gives the expected result:

[image: inv]

Edit: I noticed that in this case the ground-truth object bounding box is still correct. To detect objects with a flipped mask, one could simply compute the bounding box from the mask: if it differs from the ground-truth one, then the mask is flipped or otherwise wrong.
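The bounding-box check described above could be sketched as follows. This is only an illustration under stated assumptions: `bbox_from_mask` and `mask_matches_bbox` are hypothetical names, the ground-truth box is assumed to be given as [x, y, width, height], and the small tolerance `tol` is an assumption meant to absorb off-by-one differences in how width and height are defined.

```python
import numpy as np


def bbox_from_mask(mask):
    """Return the tight [x, y, width, height] box of a binary mask, or None if empty."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    x0, y0 = int(xs.min()), int(ys.min())
    return [x0, y0, int(xs.max()) - x0 + 1, int(ys.max()) - y0 + 1]


def mask_matches_bbox(mask, bbox_gt, tol=2):
    """Check whether the mask's box agrees with the ground-truth box within tol pixels."""
    bbox = bbox_from_mask(mask)
    if bbox is None:
        return False
    return all(abs(a - b) <= tol for a, b in zip(bbox, bbox_gt))


# Toy example: an object truncated by the top-left corner of a 6x8 image.
mask = np.zeros((6, 8), dtype=bool)
mask[:3, :4] = True
bbox_gt = [0, 0, 4, 3]  # assumed ground-truth box for this toy object

assert mask_matches_bbox(mask, bbox_gt)       # consistent mask
assert not mask_matches_bbox(~mask, bbox_gt)  # inverted mask is flagged
```

An inverted mask typically covers most of the image, so its box tends to span nearly the full frame and the comparison fails. A flagged mask could then be inverted and checked again before it is accepted.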

sh8 commented 4 months ago

@jcorsetti Yes, I did the same thing to detect the failure cases. It's a bit ugly, but I'm leaving this patched function here for reference.


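import numpy as np


def rle_to_binary_mask(rle, bbox_visib=None):
    """Converts a COCO run-length encoding (RLE) to a binary mask.

    :param rle: Mask in RLE format
    :param bbox_visib: Optional visible bounding box whose first two entries are
        the top-left x and y; used to detect masks with a missing leading 0 count
    :return: a 2D binary numpy array where '1's represent the object
    """
    binary_array = np.zeros(np.prod(rle.get('size')), dtype=bool)
    # Copy the counts so the workaround below does not modify the caller's RLE.
    counts = list(rle.get('counts'))

    start = 0

    # Workaround: if the box starts at the top-left corner and the number of
    # counts is even, assume the leading 0 is missing and prepend it.
    if bbox_visib is not None and len(counts) % 2 == 0 and bbox_visib[0] == 0 and bbox_visib[1] == 0:
        counts.insert(0, 0)

    # Runs alternate between background and foreground, starting with background.
    for i in range(len(counts)-1):
        start += counts[i]
        end = start + counts[i+1]
        binary_array[start:end] = (i + 1) % 2

    # COCO RLE is stored in column-major (Fortran) order.
    binary_mask = binary_array.reshape(*rle.get('size'), order='F')

    return binary_mask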