cocodataset / cocoapi

COCO API - Dataset @ http://cocodataset.org/

RLE encoding in python #492

Open SpamLee opened 3 years ago

SpamLee commented 3 years ago

Hey

I am struggling to wrap my head around the RLE encoding in the context of Python.

For example, below is an RLE object taken from the output of a machine learning model I am running.

rle = {'size': [1520, 2704], 'counts': b'ea_d21__11O0001O00000001O01O00a^40_aK000[k]30eTbL010O0000001O000000000010O00000001O01O00000001O01O0000010O00000001O000001O0001O0000010O0000000000010O0000000000010O0000000010O000000000010O00000001O01OO10001O000000P]jP1'}

I understand completely what counts is meant to represent: the lengths of alternating runs of 0's and 1's.
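To make the run-length idea concrete, here is a minimal sketch of expanding an already-decoded list of run lengths into a mask. It assumes the COCO convention that runs alternate 0, 1, 0, 1, ... starting with 0 and that pixels are laid out column by column (column-major). The function name `counts_to_mask` is hypothetical and not part of pycocotools.

```python
def counts_to_mask(counts, height, width):
    """Expand uncompressed run lengths into a height x width list of rows.

    Assumes runs alternate 0, 1, 0, 1, ... starting with 0, and that the
    flat pixel sequence is stored column by column (column-major).
    """
    flat = []
    value = 0
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value  # switch between background (0) and foreground (1)
    if len(flat) != height * width:
        raise ValueError("counts do not sum to height * width")
    # In column-major order, pixel (row, col) sits at index col * height + row.
    return [[flat[col * height + row] for col in range(width)]
            for row in range(height)]


# Toy 3x2 example: 2 zeros, 3 ones, 1 zero, read down the columns.
for row in counts_to_mask([2, 3, 1], height=3, width=2):
    print(row)
# [0, 1]
# [0, 1]
# [1, 0]
```

Note that this works on plain integer run lengths only; the `counts` byte string produced by pycocotools is a compressed form of those integers and has to be decoded first.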

My main issue is that when I read the byte string as a list of integers, the output does not make sense to me.

bytes(rle["counts"], "utf-8") => [101, 97, 95, 100, 50, 49, 95, 95, 49, 49, 79, etc...

I can assure you that there is no chance that the mask starts with only 101 background pixels, given that the image size is 1520*2704.

I think I fundamentally misunderstand how I am meant to read the output of the RLE encoding code provided in the pycocotools module.

Thank you very much for any insights you can provide

Kind Regards

caijinyue commented 3 years ago

I also have the same problem

jefequien commented 3 years ago

They use a variable-length encoding so that arbitrarily large numbers can be stored as printable characters. The comment in maskApi.c describes it as: /* Similar to LEB128 but using 6 bits/char and ascii chars 48-111. */

https://github.com/cocodataset/cocoapi/blob/master/common/maskApi.c#L205 https://en.wikipedia.org/wiki/LEB128
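For readers who prefer Python, the following is a rough sketch of the string decoder in the linked maskApi.c. It is an unofficial port written for illustration, so treat it as an assumption rather than as the library's API; the name `decode_counts` is hypothetical. Each character carries 5 data bits plus one continuation bit, the last character of a number is sign-extended when its bit 0x10 is set, and from the fourth number onward each value is stored as a difference from the value two positions earlier.

```python
def decode_counts(counts):
    """Decode a compressed COCO RLE `counts` value into a list of run lengths.

    Unofficial sketch modeled on the string decoder in maskApi.c.
    Accepts either bytes or an ASCII str.
    """
    if isinstance(counts, str):
        counts = counts.encode("ascii")
    runs = []
    pos = 0
    while pos < len(counts):
        value = 0
        shift = 0
        more = True
        while more:
            c = counts[pos] - 48              # characters start at ASCII 48
            value |= (c & 0x1F) << shift      # low 5 bits are data
            more = bool(c & 0x20)             # bit 0x20 means "more chars follow"
            pos += 1
            shift += 5
            if not more and (c & 0x10):
                value |= -1 << shift          # sign-extend a negative number
        if len(runs) > 2:
            value += runs[-2]                 # undo the delta against runs[i - 2]
        runs.append(value)
    return runs


# First eleven characters of the counts value from the original post.
print(decode_counts(b"ea_d21__11O"))
# [2768437, 1, 1519, 2, 1518]
```

In this output the first run is 2768437 background pixels rather than 101, and the next two runs sum to 1 + 1519 = 1520, which matches the image height in the original post.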

matthewchung74 commented 3 years ago

@jefequien thanks for posting that. My C is awful; do you know whether that code has been ported to Python anywhere?

caijinyue commented 3 years ago

rle = coco.maskUtils.encode(single_mask)

rle['counts'] = rle['counts'].decode('ascii')

I solved the problem this way. I also checked the segmentation annotations using the visualization API of detectron2, and they are correct.
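A likely reason for decoding `counts` to a string (this is an assumption; the comment above does not say so) is that Python's `json` module cannot serialize `bytes`. The sketch below uses a toy RLE dictionary rather than real model output to show that `.decode('ascii')` only changes the type from `bytes` to `str`. The characters, and therefore the compressed encoding, stay the same.

```python
import json

# Toy RLE dictionary; the values are illustrative, not real model output.
rle = {"size": [3, 2], "counts": b"231"}

try:
    json.dumps(rle)
except TypeError as exc:
    print("cannot serialize bytes:", exc)

# Convert counts to str so the dictionary can be written to a JSON file.
rle_for_json = dict(rle, counts=rle["counts"].decode("ascii"))
text = json.dumps(rle_for_json)
print(text)

# Reading it back and re-encoding restores the original bytes value.
restored = json.loads(text)
restored["counts"] = restored["counts"].encode("ascii")
assert restored == rle
```

Because the string is still in the compressed form, this step does not by itself turn `counts` into a list of run lengths.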

AnaRhisT94 commented 3 years ago

I have a 2D NumPy array of zeros and ones, which I assume is a binary mask. Can I visualize it with detectron2?

KJ-Waller commented 7 months ago

Sorry for digging up this old issue, but I'm having a similar problem. I also have a COCO-formatted RLE byte string:

{'size': [3200, 4800],
 'counts': b'Sl]_61mR3^1[O`0C:F;E<B?A`0@`0A?A<D<D;E<D;D;E<F:F<E9J6L4K4M4K4M3L4M2M3N2N2M2O2M3N1O2N2M3N2N2N2M3N2N3M3L4M2N3M2N2M3N3M3M3M3M4L4L3M3M2N3L2O2N2N1O1O1O2N2N3M4L4L4L3L4M4L3M3M2N4L4L3M4L4L3M2N3M2N2N2N3M2N3M3M4L4L5K5K5K3M3M3M3M4L4L5K5K4L4L3N3L3M3M4L3M3M4L5K4L4L4L3M4M1N3M2N2N1O1O2N100O2N2N1O2N2N2O0O2N2N2N1O2O1N2N2N2N2O1N2N2N2O0O2N2N101N2N101N1O101N1O2O1N2N2N2O1N2N2N101N2O1O001O10O010001O00000O10000O10000O010O10O10O002N1O2N1O2N1O2N1O1O1O2N1O1N2O2L4L4L4M3L4L4L3M4M2M3M3N3L4L4M3L4M3M2N2N2O1N3M2N2N3M2N2N2N3M2N2N2N2M2N3M3M2N3M2N2N3M2N3M3M2N3M3M3M3N000000001O0O10000000001O0000000O2N1O2N1O2N2N1O2N1O2N2N1O2N2N2O0O1000O001O1O100O1O001O1O1UAmgMh1TX2SNRhMl1nW2oMYhMn1gW2oM_hMn1bW2mMehMP2]W2hMkhMU2YW2SM_iMj2cV2gLmiMU3VV2fLoiMW3TV2eLQjMX3QV2cLUjMY3nU2bLWjM\\3kU2`LZjM\\3iU2_L]jM^3fU2\\L`jMb3aU2YLejMd3]U2]I^gMjJo3f;fT2XIhmMe6ZR2WImmMd6UR2XIRnMc6QR2XIWnMc6iQ2\\I^nM\\6eQ2cIbnMU6_Q2jIhnMo5YQ2PJmnMj5UQ2UJonMf5RQ2YJRoMb5PQ2^JToM]5nP2aJVoM[5kP2dJZoMV5hP2jJ[oMR5fP2mJ^oMo4dP2PK`oMj4bP2UKnoM[4SP2eKaPNe3bo1YLbPNb3`o1^LdPN]3]o1bLiPNW3Xo1iLoPNo2Ro1RMUQNe2ln1[M\\QN\\2fn1cMeQNR2[n1nMoQNg1Rn1ZNXRNZ1jm1eN\\RNT1em1lN`RNn0am1SObRNh0_m1XOeRNc0\\m1]OhRN=[m1CgRN9Zm1GjRN4Wm1LlRNOVm11nRNJTm16nRNEUm19nRNDUm18PSNEQm18TSNDol18XSNBjl1:^SNAcl1<cSN_O`l1<hSN^OZl1>nSN]OTl1=UTN]Omk1a0YTNYOjk1e0[TNVOfk1i0_TNQOck1o0bTNjNak1T1dTNgN]k1X1hTNbN[k1]1iTN]NXk1d1PUNPNTk1P2^c01N10001N101O0O2O001N3N1O2M3OL3L5M3N1O2N2N1O2N2N2N1O2N2N2N1O3M4K9G7I6J6J5K5K4L4^Ob0^O]gYl6'}

Running rle['counts'].decode('ascii') just gets me the same sequence but as a string:

'Sl]_61mR3^1[O`0C:F;E<B?A`0@`0A?A<D<D;E<D;D;E<F:F<E9J6L4K4M4K4M3L4M2M3N2N2M2O2M3N1O2N2M3N2N2N2M3N2N3M3L4M2N3M2N2M3N3M3M3M3M4L4L3M3M2N3L2O2N2N1O1O1O2N2N3M4L4L4L3L4M4L3M3M2N4L4L3M4L4L3M2N3M2N2N2N3M2N3M3M4L4L5K5K5K3M3M3M3M4L4L5K5K4L4L3N3L3M3M4L3M3M4L5K4L4L4L3M4M1N3M2N2N1O1O2N100O2N2N1O2N2N2O0O2N2N2N1O2O1N2N2N2N2O1N2N2N2O0O2N2N101N2N101N1O101N1O2O1N2N2N2O1N2N2N101N2O1O001O10O010001O00000O10000O10000O010O10O10O002N1O2N1O2N1O2N1O1O1O2N1O1N2O2L4L4L4M3L4L4L3M4M2M3M3N3L4L4M3L4M3M2N2N2O1N3M2N2N3M2N2N2N3M2N2N2N2M2N3M3M2N3M2N2N3M2N3M3M2N3M3M3M3N000000001O0O10000000001O0000000O2N1O2N1O2N2N1O2N1O2N2N1O2N2N2O0O1000O001O1O100O1O001O1O1UAmgMh1TX2SNRhMl1nW2oMYhMn1gW2oM_hMn1bW2mMehMP2]W2hMkhMU2YW2SM_iMj2cV2gLmiMU3VV2fLoiMW3TV2eLQjMX3QV2cLUjMY3nU2bLWjM\\3kU2`LZjM\\3iU2_L]jM^3fU2\\L`jMb3aU2YLejMd3]U2]I^gMjJo3f;fT2XIhmMe6ZR2WImmMd6UR2XIRnMc6QR2XIWnMc6iQ2\\I^nM\\6eQ2cIbnMU6_Q2jIhnMo5YQ2PJmnMj5UQ2UJonMf5RQ2YJRoMb5PQ2^JToM]5nP2aJVoM[5kP2dJZoMV5hP2jJ[oMR5fP2mJ^oMo4dP2PK`oMj4bP2UKnoM[4SP2eKaPNe3bo1YLbPNb3`o1^LdPN]3]o1bLiPNW3Xo1iLoPNo2Ro1RMUQNe2ln1[M\\QN\\2fn1cMeQNR2[n1nMoQNg1Rn1ZNXRNZ1jm1eN\\RNT1em1lN`RNn0am1SObRNh0_m1XOeRNc0\\m1]OhRN=[m1CgRN9Zm1GjRN4Wm1LlRNOVm11nRNJTm16nRNEUm19nRNDUm18PSNEQm18TSNDol18XSNBjl1:^SNAcl1<cSN_O`l1<hSN^OZl1>nSN]OTl1=UTN]Omk1a0YTNYOjk1e0[TNVOfk1i0_TNQOck1o0bTNjNak1T1dTNgN]k1X1hTNbN[k1]1iTN]NXk1d1PUNPNTk1P2^c01N10001N101O0O2O001N3N1O2M3OL3L5M3N1O2N2N1O2N2N2N1O2N2N2N1O3M4K9G7I6J6J5K5K4L4^Ob0^O]gYl6'

The only way I can convert it to a list of integers is by doing list(rle['counts']), but like @SpamLee, I get output that doesn't make sense:

[83, 108, 93, 95, 54, ... ]

I also know for a fact that the first mask doesn't start at 83. What am I doing wrong?

Also, when I run bytes(rle["counts"], "utf-8") I do not get a list of integers, but an error:

----> 1 bytes(rle["counts"], "utf-8")

TypeError: encoding without a string argument
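The behaviour described above can be reproduced with a few lines of plain Python. The sketch below uses only the first five bytes of the counts value. The last part assumes the 5-data-bits-per-character scheme noted in maskApi.c and takes a shortcut that is valid here only because the final character has no sign bit set; it is not a general decoder.

```python
counts = b"Sl]_6"  # first five bytes of the counts value above

# Iterating over bytes yields the ASCII code of each character,
# not the run lengths.
print(list(counts))  # [83, 108, 93, 95, 54]

# bytes() accepts an encoding only when its first argument is a str,
# so passing a bytes object together with "utf-8" raises TypeError.
try:
    bytes(counts, "utf-8")
except TypeError as exc:
    print(exc)  # encoding without a string argument

# With a str argument the same call works.
assert bytes(counts.decode("ascii"), "utf-8") == counts

# Assumption: each character holds 5 data bits, lowest bits first, so
# these five characters together form ONE number.
first_run = sum(((b - 48) & 0x1F) << (5 * k) for k, b in enumerate(counts))
print(first_run)  # 6797187

# In column-major order with height 3200, that many background pixels
# end at this (column, row) position.
print(divmod(first_run, 3200))  # (2124, 387)
```

Under that assumption the mask starts with 6797187 background pixels rather than 83, which is at least plausible for a 3200 x 4800 image.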