Open sh8 opened 9 months ago
Hey @sh8,
thanks a lot for raising this issue. I just ran a quick test on the functions themselves, and they yield the expected result (the bop_toolkit and BlenderProc functions are equivalent):
Python 3.9.18 | packaged by conda-forge | (main, Aug 30 2023, 03:49:32)
[GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from bop_toolkit_lib import pycoco_utils
>>> import numpy as np
>>> test = np.eye(10)
>>> test2 = test.copy()
>>> test2[0,0] = 0
>>> test_rle = pycoco_utils.binary_mask_to_rle(test)
{'counts': [0, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1], 'size': [10, 10]}
>>> test2_rle = pycoco_utils.binary_mask_to_rle(test2)
{'counts': [11, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1], 'size': [10, 10]}
>>> test_bin = pycoco_utils.rle_to_binary_mask(test_rle)
array([[ True, False, False, False, False, False, False, False, False,
        False],
       [False,  True, False, False, False, False, False, False, False,
        False],
       [False, False,  True, False, False, False, False, False, False,
        False],
       [False, False, False,  True, False, False, False, False, False,
        False],
       [False, False, False, False,  True, False, False, False, False,
        False],
       [False, False, False, False, False,  True, False, False, False,
        False],
       [False, False, False, False, False, False,  True, False, False,
        False],
       [False, False, False, False, False, False, False,  True, False,
        False],
       [False, False, False, False, False, False, False, False,  True,
        False],
       [False, False, False, False, False, False, False, False, False,
         True]])
>>> test2_bin = pycoco_utils.rle_to_binary_mask(test2_rle)
array([[False, False, False, False, False, False, False, False, False,
        False],
       [False,  True, False, False, False, False, False, False, False,
        False],
       [False, False,  True, False, False, False, False, False, False,
        False],
       [False, False, False,  True, False, False, False, False, False,
        False],
       [False, False, False, False,  True, False, False, False, False,
        False],
       [False, False, False, False, False,  True, False, False, False,
        False],
       [False, False, False, False, False, False,  True, False, False,
        False],
       [False, False, False, False, False, False, False,  True, False,
        False],
       [False, False, False, False, False, False, False, False,  True,
        False],
       [False, False, False, False, False, False, False, False, False,
         True]])
This is correct, so I am wondering if/where it went wrong for MegaPose-GSO (I haven't checked the data). The logic also does not appear to have changed recently.
@ylabbe Do you know which functions / versions you used to produce the RLE encoding in the MegaPose-GSO dataset? Could you check whether they are correct?
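For readers following along, here is a minimal sketch of the convention the session above relies on: in uncompressed COCO-style RLE the pixels are read in column-major order and `counts` always starts with the number of background pixels, so a mask whose first pixel is foreground begins with a `0`. The helper `encode_rle_sketch` is a hypothetical name written for illustration; it is not the bop_toolkit or BlenderProc implementation.

```python
import numpy as np


def encode_rle_sketch(mask):
    """Hypothetical helper: uncompressed COCO-style RLE of a 2D binary mask."""
    arr = np.asarray(mask, dtype=bool)
    flat = arr.ravel(order="F")  # read pixels column by column
    # Positions where the pixel value differs from the previous pixel.
    change = np.flatnonzero(flat[1:] != flat[:-1]) + 1
    boundaries = np.concatenate(([0], change, [flat.size]))
    counts = np.diff(boundaries).tolist()
    # The first count must describe background pixels, so a mask that
    # starts with foreground needs an explicit zero-length background run.
    if flat.size and flat[0]:
        counts.insert(0, 0)
    return {"counts": counts, "size": list(arr.shape)}


test = np.eye(10)
test2 = test.copy()
test2[0, 0] = 0
print(encode_rle_sketch(test)["counts"][:4])   # starts with 0: first pixel is foreground
print(encode_rle_sketch(test2)["counts"][:4])  # starts with 11: first pixel is background
```

If an encoder omits that leading `0`, a decoder following this convention reads every run with the opposite value, which would produce exactly the kind of inverted mask discussed in this thread.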
Hi, I think I found the same problem on some instances of MegaPose-Shapenet. Here are the RGB images obtained by masking the instance 000118_000005 with the mask of the 9th object:
Inverting the mask instead gives the expected result:
Edit: I noticed that in this case the object bounding box is still correct. To detect objects with a flipped mask, one could simply compute the bounding box from the mask: if it differs from the ground-truth one, then the mask is flipped or wrong.
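The bounding-box check described above could be sketched roughly as follows. This assumes the ground-truth box is given as `(x, y, width, height)` in pixels; the function names and the `tol` parameter are hypothetical and only serve the illustration.

```python
import numpy as np


def bbox_from_mask(mask):
    """Return (x, y, w, h) of the True pixels, or None for an empty mask."""
    arr = np.asarray(mask, dtype=bool)
    rows = np.flatnonzero(arr.any(axis=1))  # rows containing foreground
    cols = np.flatnonzero(arr.any(axis=0))  # columns containing foreground
    if rows.size == 0:
        return None
    x, y = int(cols[0]), int(rows[0])
    return (x, y, int(cols[-1]) - x + 1, int(rows[-1]) - y + 1)


def mask_disagrees_with_bbox(mask, bbox_gt, tol=1):
    """Heuristic: True if the mask's box differs from bbox_gt by more than tol pixels."""
    bbox = bbox_from_mask(mask)
    if bbox is None:
        return True  # an empty mask cannot match a ground-truth box
    return any(abs(a - b) > tol for a, b in zip(bbox, bbox_gt))


mask = np.zeros((10, 10), dtype=bool)
mask[2:5, 3:7] = True
print(mask_disagrees_with_bbox(mask, (3, 2, 4, 3)))   # correct mask
print(mask_disagrees_with_bbox(~mask, (3, 2, 4, 3)))  # inverted mask
```

The check only flags a disagreement; it does not tell whether the mask is inverted or wrong in some other way, so a flagged frame still needs a second look.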
@jcorsetti Yes, I did the same thing to detect failure cases. This is a bit ugly, but I'm leaving this patched function here just for reference.
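import numpy as np

def rle_to_binary_mask(rle, bbox_visib=None):
    """Converts a COCO run-length encoding (RLE) to a binary mask.

    :param rle: Mask in RLE format
    :param bbox_visib: Optional visible bounding box; only its first two values (x, y) are used
    :return: a 2D binary numpy array where '1's represent the object
    """
    binary_array = np.zeros(np.prod(rle.get('size')), dtype=bool)
    counts = list(rle.get('counts'))  # copy so the caller's RLE is not modified
    start = 0
    # Workaround: if the visible box starts at the top-left corner and the number
    # of runs is even, assume the leading zero-length background run is missing.
    if bbox_visib is not None and len(counts) % 2 == 0 and bbox_visib[0] == 0 and bbox_visib[1] == 0:
        counts.insert(0, 0)
    for i in range(len(counts)-1):
        start += counts[i]
        end = start + counts[i+1]
        # Runs alternate between background (0) and object (1)
        binary_array[start:end] = (i + 1) % 2
    binary_mask = binary_array.reshape(*rle.get('size'), order='F')
    return binary_mask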
I found that RLE instance masks are flipped for truncated objects at the top-left corner of an image (e.g., 000002_000012, 011864_000019). You can reproduce this easily by using the RLE decoding function in BlenderProc on these frames. I think a quick fix could be to detect such objects and prepend a 0 to the RLE counts to flip the mask.
I see these flipped masks in roughly 8-10% of all frames. Could you confirm and fix it when you have time?
Thank you!
Flipped mask (011864_000019)![011864_000019_flip](https://github.com/thodan/bop_toolkit/assets/7430471/cda87d7d-3445-4ea8-be4b-a6ae79095a2c)
Correct mask (011864_000019)![011864_000019](https://github.com/thodan/bop_toolkit/assets/7430471/27e56474-40c0-4926-b90a-15f12b320e73)
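To illustrate why prepending a 0 to the counts flips the decoded mask, here is a small self-contained sketch. It assumes the uncompressed COCO-style layout used throughout this thread (`size` as `[height, width]`, column-major pixel order, runs alternating starting with background). `decode_rle_sketch` and `prepend_zero_run` are hypothetical helpers, not functions from bop_toolkit or BlenderProc, and the 4x4 mask is made-up toy data.

```python
import numpy as np


def decode_rle_sketch(rle):
    """Hypothetical decoder: runs alternate background, object, background, ..."""
    height, width = rle["size"]
    flat = np.zeros(height * width, dtype=bool)
    pos = 0
    for i, run in enumerate(rle["counts"]):
        if i % 2 == 1:  # odd-indexed runs are object pixels
            flat[pos:pos + run] = True
        pos += run
    return flat.reshape((height, width), order="F")


def prepend_zero_run(rle):
    """Return a copy of rle with a zero-length background run in front."""
    return {"counts": [0] + list(rle["counts"]), "size": list(rle["size"])}


# Toy 4x4 mask with a 2x2 object in the top-left corner.
expected = np.zeros((4, 4), dtype=bool)
expected[:2, :2] = True

# The same runs, but without the leading 0 that marks "no background first".
broken = {"counts": [2, 2, 2, 10], "size": [4, 4]}

print(np.array_equal(decode_rle_sketch(broken), ~expected))                   # decodes inverted
print(np.array_equal(decode_rle_sketch(prepend_zero_run(broken)), expected))  # decodes correctly
```

Because every run switches role once the extra 0 is in front, the fix inverts the whole mask, so it should only be applied to frames that have already been identified as flipped.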