Closed beyse closed 2 years ago
According to this, bpycv encodes the inst id as a float32 RGB value in [0., 1.] that Blender can read as a color when rendering the output. The code is here:
https://github.com/DIYer22/bpycv/blob/074f49b6494c9784a12067b3174e93f0f52dddc8/bpycv/utils.py#L13
We chose an encoding scheme that provides more than 1e6 inst_id values and can still distinguish instances in float32 RGB.
Hi,
thanks for the reply.
I have carefully read this but I could not find any evidence that
bpycv encodes the inst id as a float32 RGB value in [0., 1.] that Blender can read as a color when rendering the output.
Maybe I am missing something? Perhaps you can point me to the exact paragraph or sentence that you mean. In any case, it is not disputed that Blender uses floating point numbers to represent RGB values.
I have also read
https://github.com/DIYer22/bpycv/blob/074f49b6494c9784a12067b3174e93f0f52dddc8/bpycv/utils.py#L13
which clearly shows the inst_id is mapped to floating point RGB values and that is perfectly in line with what I wrote in my initial comment:
I could use
cv2.imwrite('/out/put/image.tiff', np.float32(result["inst"]))
to save a 32 bit floating point image, which obviously gives me much more possibilities when it comes to the number of categories and instances that can be annotated.
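As a rough check of how much headroom a float32 image gives, the sketch below rounds integers to single precision and back. It uses only the Python standard library, the helper name `to_float32` is made up for illustration, and it says nothing about bpycv's own RGB encoding; it only shows which integer ids a float32 pixel can hold exactly.

```python
import struct


def to_float32(x: float) -> float:
    """Round x to the nearest float32 value and return it as a Python float."""
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]


# float32 has a 24-bit significand, so every integer up to 2**24 is exact.
largest_exact = 2 ** 24  # 16_777_216

print(to_float32(largest_exact) == largest_exact)          # True: stored exactly
print(to_float32(largest_exact + 1) == largest_exact + 1)  # False: rounded

# With inst_id = categories_id * 1000 + index, this is the largest
# categories_id whose full block of 1000 indices still fits.
print(largest_exact // 1000 - 1)
```

So a float32 image can carry ids well beyond the uint16 range, but not arbitrarily large ones.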
When you write
We chose an encoding scheme that provides more than 1e6 inst_id values and can still distinguish instances in float32 RGB.
I think that perfectly fits my observation that the inst_id can be bigger than 1e6. However, I noticed the following assertion in the code:
This assertion checks whether the inst_id is smaller than or equal to 100e4 (which is 1e6) and raises an error at runtime if it is bigger than 1e6.
I was not able to find an explanation in your recent comment, so I would like to repeat my question:
Why must the inst_id be <= 100e4 here?
Thank you for the picture with the two green cubes, here is one with a blue sphere 😄
Why must the inst_id be <= 100e4 here ?
If inst_id is too big, bpycv's encode/decode scheme cannot accurately recover the inst id from the RGB value.
For example:
instid = 1100000
recover = encode_inst_id.rgb_to_id(encode_inst_id.id_to_rgb(instid))
# recover will be 274999
Alright, that is a reason.
I was interested in finding the first number which does not work and wanted to find out why, so I wrote this test code:
from loguru import logger  # logger.success and the log format below match loguru

# encode_inst_id is bpycv's encoder from the utils.py file linked above.
def convert_and_check(original_id):
    try:
        logger.info(f'input = {original_id}')
        rgb = encode_inst_id.id_to_rgb(original_id)
        logger.debug(f'rgb = {rgb}')
        converted_id = encode_inst_id.rgb_to_id(rgb)
        if original_id != converted_id:
            logger.error(f'Failure: {original_id} != {converted_id}')
        else:
            logger.success(f'{original_id} works fine.')
    except BaseException as ex:
        logger.error(f'Runtime Error when using {original_id}: {ex}')
and I ran it like this:
for i in range(1048574, 1048577):
    convert_and_check(i)
and I get:
2022-06-17 23:54:42.300 | DEBUG | __main__:convert_and_check:79 - rgb = [0. 0.99999905 0. ]
2022-06-17 23:54:42.301 | SUCCESS | __main__:convert_and_check:85 - 1048574 works fine.
2022-06-17 23:54:42.307 | DEBUG | __main__:convert_and_check:79 - rgb = [0.00000000e+00 4.76837158e-07 0.00000000e+00]
RuntimeWarning: divide by zero encountered in long_scalars
2022-06-17 23:54:42.315 | DEBUG | __main__:convert_and_check:79 - rgb = [0.00000000e+00 1.43051147e-06 0.00000000e+00]
2022-06-17 23:54:42.316 | ERROR | __main__:convert_and_check:83 - Failure: 1048576 != 262143
Summing it up in a more readable way:
Number | Outcome |
---|---|
1048574 | works fine ✅ |
1048575 | divide by zero error 💥 |
1048576 | can't recover ❌ |
So the first number which does not work just so happens to be 2^20 - 1. The divide-by-zero problem happens in this line and comes from the fact that max_denominator is set to 2^20.
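The green-channel values in the log are consistent with a numbering of dyadic fractions: 0.99999905 is (2^20 - 1)/2^20, 4.76837158e-07 is 1/2^21, and 1.43051147e-06 is 3/2^21, so id 1048575 would be the first one that needs a denominator of 2^21. The sketch below is a guess at that pattern reconstructed only from the logged values; it is not bpycv's actual code, and `id_to_fraction` and `fraction_to_id` are hypothetical names.

```python
from fractions import Fraction


def id_to_fraction(inst_id: int) -> Fraction:
    """Map a non-negative id to an odd-numerator fraction k / 2**depth in (0, 1)."""
    n = inst_id + 1
    depth = n.bit_length()          # depth 1 holds 1 id, depth 2 holds 2, depth 3 holds 4, ...
    j = n - (1 << (depth - 1))      # position of this id within its depth
    return Fraction(2 * j + 1, 1 << depth)


def fraction_to_id(value: Fraction) -> int:
    """Invert id_to_fraction for a fraction whose denominator is a power of two."""
    depth = value.denominator.bit_length() - 1
    j = (value.numerator - 1) // 2
    return (1 << (depth - 1)) + j - 1


for inst_id in (1048574, 1048575, 1048576):
    frac = id_to_fraction(inst_id)
    print(inst_id, frac, float(frac), fraction_to_id(frac))
```

Under this assumed scheme, depths 1 through 20 hold exactly 2^20 - 1 ids (0 to 1048574), which would explain why the boundary sits at 1048575 when the denominator is capped at 2^20.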
So I was wondering what happens if I set max_depth to 21 instead of 20. Sure enough, this time the numbers I test work fine:
2022-06-18 00:10:36.940 | SUCCESS | __main__:convert_and_check:86 - 1048573 works fine.
2022-06-18 00:10:36.946 | DEBUG | __main__:convert_and_check:80 - rgb = [0. 0.99999905 0. ]
2022-06-18 00:10:36.948 | SUCCESS | __main__:convert_and_check:86 - 1048574 works fine.
2022-06-18 00:10:36.954 | DEBUG | __main__:convert_and_check:80 - rgb = [0.00000000e+00 4.76837158e-07 0.00000000e+00]
2022-06-18 00:10:36.957 | SUCCESS | __main__:convert_and_check:86 - 1048575 works fine.
2022-06-18 00:10:36.962 | DEBUG | __main__:convert_and_check:80 - rgb = [0.00000000e+00 1.43051147e-06 0.00000000e+00]
2022-06-18 00:10:36.964 | SUCCESS | __main__:convert_and_check:86 - 1048576 works fine.
2022-06-18 00:10:36.970 | DEBUG | __main__:convert_and_check:80 - rgb = [0.00000000e+00 2.38418579e-06 0.00000000e+00]
2022-06-18 00:10:36.972 | SUCCESS | __main__:convert_and_check:86 - 1048577 works fine.
2022-06-18 00:10:36.978 | DEBUG | __main__:convert_and_check:80 - rgb = [0.00000000e+00 3.33786011e-06 0.00000000e+00]
2022-06-18 00:10:36.979 | SUCCESS | __main__:convert_and_check:86 - 1048578 works fine.
2022-06-18 00:10:36.985 | DEBUG | __main__:convert_and_check:80 - rgb = [0.00000000e+00 4.29153442e-06 0.00000000e+00]
2022-06-18 00:10:36.987 | SUCCESS | __main__:convert_and_check:86 - 1048579 works fine.
So now I am wondering:
What is the reason that max_depth is set to 20?
I get that mapping integers to floating point numbers in the range [0, 1] is not trivial. But given that there are 1,056,964,608 distinct normalized single-precision floating point numbers between 0 and 1, I do not see why it has to stop here at 2^20 (1,048,576) numbers. Is it a limitation of Blender?
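The figure of 1,056,964,608 can be reproduced by comparing float32 bit patterns, because positive floats sort in the same order as their bit patterns. This is a small standard-library sketch with an illustrative helper name (`float32_bits`); it counts the normalized values in [2^-126, 1), so subnormals, 0, and 1 itself are excluded.

```python
import struct


def float32_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


smallest_normal = float32_bits(2.0 ** -126)  # 0x00800000
one = float32_bits(1.0)                      # 0x3F800000

# Consecutive bit patterns are consecutive float32 values, so the
# difference is the number of normalized values in [2**-126, 1).
normal_count = one - smallest_normal
print(f"{normal_count:,}")  # 1,056,964,608
```

The same number falls out of 126 exponents times 2^23 significands.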
Let me know if I have missed something.
Is it a limitation of blender?
No, Blender is OK
I just chose one encoding scheme that:
1.2, 3.14
Those two cubes' inst_id values are 2 and 3 respectively, and they have different encoded RGB colors (3 float32 values).
I see, that seems to make a lot of sense. Thank you for the explanation.
Hello there 👋
As described here in #38, one should specify the instance id as follows:
object["inst_id"] = categories_id * 1000 + index
The following example demonstrates that uint16 should be used to store the instance segmentation information in an image:
https://github.com/DIYer22/bpycv/blob/074f49b6494c9784a12067b3174e93f0f52dddc8/example/demo.py#L44
I am assuming this is done to comply with the Cityscapes dataset, which is perfectly fine.
Since 2^16 = 65536 and the instance id is categories_id * 1000 + index, we cannot have more than 65 categories (classes) and cannot count more than roughly 500 instances per class with this approach, if my math is right.
But I would like to go way beyond this. I have noticed that 32-bit integers are used internally, and I could use
cv2.imwrite('/out/put/image.tiff', np.float32(result["inst"]))
to save a 32-bit floating point image, which obviously gives me much more possibilities when it comes to the number of categories and instances that can be annotated.
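The uint16 estimate can be checked with a few lines of arithmetic. This sketch assumes ids are built exactly as `categories_id * 1000 + index`; `make_inst_id` and `split_inst_id` are illustrative helper names, not bpycv functions.

```python
from typing import Tuple

UINT16_MAX = 2 ** 16 - 1  # 65535, the largest value a uint16 pixel can hold


def make_inst_id(categories_id: int, index: int) -> int:
    """Combine a category id and a per-category index into one instance id."""
    return categories_id * 1000 + index


def split_inst_id(inst_id: int) -> Tuple[int, int]:
    """Recover (categories_id, index) from an instance id."""
    return divmod(inst_id, 1000)


# The largest id that fits in uint16 belongs to category 65, index 535.
print(split_inst_id(UINT16_MAX))             # (65, 535)
print(make_inst_id(66, 0) > UINT16_MAX)      # True: category 66 no longer fits
```

Categories below 65 get the full 1000 indices each; only the last category is cut short at index 535.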
However, I noticed the following assertion in the code:
https://github.com/DIYer22/bpycv/blob/074f49b6494c9784a12067b3174e93f0f52dddc8/bpycv/pose_utils.py#L147
which again limits the instance id, but I could not find an explanation for what is finally my question:
Why must the inst_id be <= 100e4 here?
Many thanks in advance 😺